Participant
November 7, 2024
Question

Export Chinese traditional and simplified documents from InDesign

  • November 7, 2024
  • 2 replies
  • 1631 views

I am primarily a developer with limited knowledge of InDesign but I have run across a problem that I have attempted to solve in multiple ways but have yet to figure out.  We have documents in four languages.  The English and Spanish documents accurately export to HTML and PDF.  But when it comes to Chinese text, I have some issues:

 

1) If I export a document in Simplified Chinese to PDF, the text appears correctly until I copy and paste it; it then reverts to Traditional Chinese when pasted into a new document.

2) If I export the document to HTML, it only displays as Traditional Chinese.

 

This seems like a font issue.  The Traditional Chinese documents use a font called MHeiHK and the Simplified Chinese documents use a font called MHeiHKS.  Those fonts are referenced in the InDesign document and appear in the CSS of the respective documents: MHeiHK in Traditional Chinese and MHeiHKS in Simplified Chinese.  I have used both the legacy HTML format and the newer HTML5 format.  The same issue appears in both export types.  In fact, when I open the exported document in an editor like Visual Studio Code, all the characters appear to be in Traditional Chinese.  Changing the CSS doesn't change the font (I wouldn't expect it to...)

So my question is:  Is there a way to make sure the Simplified Chinese fonts used in InDesign export properly to both PDF and HTML?  Our translation team mentioned something about the "base" font being Traditional Chinese, with the simplified fonts somehow "overlaying" or "modifying" the TC base font to display as simplified.  I don't know enough about fonts to know exactly what they meant by that.

We have hundreds of documents with this issue and we'd like to move more of them to HTML format but this has been a show-stopper for a few years.

Thanks in advance for any help or direction you can give me.

This topic has been closed for replies.

2 replies

Willi Adelberger
Community Expert
November 7, 2024

Do you have the InDesign version for Asian languages installed?

Participant
November 8, 2024

No, I don't.  Is that a requirement to be able to export the content correctly?  The TC export works fine.  It's just a mystery as to why the Simplified Chinese export does not.

Participant
November 11, 2024

Well, if you're using the HTML5 export, I promise you it's not working fine. I'd never used HTML5 export before this morning, but when I went and looked at the output, the language declaration in the header was identical to the language setting in InDesign. 

That text is showing up as "Arabic" because that's my default language in new documents. Because it's set to Arabic, if I go and look at the header in the exported HTML, it's declared as "ar-sa":

 

To get the language ID correct, you have I think only two options:

1) Use a version of InDesign that has your intended language in the dropdown on the Character panel

2) Edit the HTML post-export to have the correct lang declaration
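
Option 2 can be scripted if there are hundreds of files. Here is a minimal Python sketch, assuming the exported file has a single `<html>` opening tag; the function name `set_html_lang` is my own and is not anything InDesign provides. It drops any existing `lang` or `xml:lang` attribute on that tag and writes the one you want.

```python
import re

# Matches a lang="..." or xml:lang="..." attribute inside a tag.
_LANG_ATTR = re.compile(
    r"""\s+(?:xml:)?lang\s*=\s*(?:"[^"]*"|'[^']*')""", re.IGNORECASE
)


def set_html_lang(html_text: str, lang: str) -> str:
    """Return html_text with the <html> tag's lang attribute set to lang."""
    match = re.search(r"<html\b[^>]*>", html_text, flags=re.IGNORECASE)
    if match is None:
        return html_text  # no <html> tag found; leave the input untouched

    # Remove any existing language attributes, then append the new one.
    cleaned = _LANG_ATTR.sub("", match.group(0))
    new_tag = cleaned[:-1].rstrip() + f' lang="{lang}">'
    return html_text[: match.start()] + new_tag + html_text[match.end():]


if __name__ == "__main__":
    sample = '<html xmlns="http://www.w3.org/1999/xhtml" lang="ar-SA"><body/></html>'
    print(set_html_lang(sample, "zh-CN"))
```

Note that this only changes the language declaration. It does not touch the characters in the body, so it will not convert text from one script to another.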

 

 


Thanks Joel,

 

Still sorting through this.  Here is a snippet from the InDesign document which displays the simplified chinese correctly:


And, here is a snippet from the HTML exported:

Nothing modified. Straight export.  You can see that the exported document is in the traditional character set, rather than simplified.  So, following your lead, I went to the HTML source.  I see this:

And, yes, the lang attribute is set to zh-TW rather than zh-CN.  And, based on the characters I see in the source, I cannot just change the lang attribute and get simplified.
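
The reason the lang attribute alone can't fix this is that the simplified and traditional forms of many characters are separate Unicode code points, so whatever code points the export wrote are what any editor or browser will show. A small Python sketch for checking what is actually stored; the sample word (Hanyu, "Chinese language") is my own illustration and is not taken from the exported file, and `describe_code_points` is a hypothetical helper name.

```python
def describe_code_points(text: str) -> list[str]:
    """Return the Unicode code point of each character, e.g. 'U+6C49'."""
    return [f"U+{ord(ch):04X}" for ch in text]


# The same word written with simplified and with traditional characters.
simplified = "\u6c49\u8bed"    # simplified forms
traditional = "\u6f22\u8a9e"   # traditional forms

if __name__ == "__main__":
    print("simplified: ", describe_code_points(simplified))
    print("traditional:", describe_code_points(traditional))
    # Different code points, so no font or lang setting makes them equal.
    print("same text?  ", simplified == traditional)
```

Pasting a few characters from the exported HTML into `describe_code_points` and comparing them with the characters the translators supplied would show whether the file really holds traditional code points.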

It's hard to see in the first snippet, but the font is set to MHeiHKS, which is the Simplified Chinese font. The CSS, however, only carries references to the traditional font (MHeiHK), so I guess I am surprised that WYSIWYG doesn't follow through the export.

I will try changing the actual language, if I can, and give the export another try.  Thanks for the pointers.







Joel Cherney
Community Expert
November 7, 2024

Well, when you export PDFs, you should be embedding the fonts you use into the PDF at export. So Acrobat (or a third-party PDF viewer) will display that text using the font that was used in InDesign and embedded at export. If you're copying text from the PDF and pasting it somewhere else, you're most likely getting raw Unicode text on your clipboard, and the place where you are pasting it is where your default Trad Chinese settings are going to be found. 

 

Now, I don't export HTML from InDesign, ever, but I'd expect that whatever you export will essentially be text with tags, right? When you open that HTML you've exported from InDesign, it's simply displaying in whatever default HTML viewer you have installed on the machine in question. If you render well-formatted HTML in a browser, it should render correctly... assuming you've declared the language in the header, right? Once again, I don't have any idea what InDesign's current HTML export capabilities might be; I don't know if you just need to tweak the export settings somehow, or if you have to go in post-export and fix the language declarations. The only thing I ever do that might be vaguely relevant is exporting XML for ingestion into a client's document management system, and in some of those cases I do apply XSL post-export to ensure good language tagging.

 

I wouldn't trust font name declarations in CSS to correctly identify the language code. You'd want to specify something like 

lang="zh-CN"

to force Simplified Chinese.

 

This post here barely scrapes the surface, but I feel confident that you can figure out how to handle your multilingual document management if you ask the right questions, or if we poke you into asking the right questions.