
PDF export to Word messing up Lao and Khmer text

New Here, Aug 26, 2025

When exporting Lao and Khmer PDFs to Word, the ligatures appear to break and the text becomes unreadable. Here's an example:

 

PDF:

JMHCA_0-1756254704346.png

 

Exported Word doc:

JMHCA_1-1756255065697.png

 

This happens in all fonts for these languages. I've been able to find virtually nothing about the cause of this online, except perhaps that the ToUnicode map (whatever that is) isn't being embedded in the PDF when it's exported from InDesign. For reasons I won't get into here, I absolutely have to have these documents in Word, as well as being fully accessible PDF forms with matching layouts. I'm grateful to hear from anyone who's experienced anything like this.

TOPICS
Edit and convert PDFs , PDF , PDF forms
1 ACCEPTED SOLUTION
Community Expert, Aug 27, 2025

@JMHCA the gibberish is a direct result of the ToUnicode map being missing or incomplete. Think of a PDF as a book of pictures, not a book of words. When you save a document as a PDF, the program stores the shape (glyph) of each letter from the font and gives each shape a code, like "picture-1," "picture-2," and so on. The page content then just lists those codes. The ToUnicode map is the key that translates the codes back into real Unicode characters. For a simple script like English, this is easy: "picture-1" is "A," "picture-2" is "B," and so on.

But for complex scripts like Lao and Khmer, where letters combine, stack, and reorder, a single picture can stand for several characters, and the key is often left out or written incompletely. When you try to convert the PDF to a Word document, the converter sees the codes but doesn't have the key to translate them. It tries to guess, but because it doesn't know what "picture-107" actually is, it outputs the wrong characters. That's why your text looks like a mess: the converter is flying blind.
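To make the "secret code" idea concrete, here is a minimal Python sketch. The CMap fragment and the glyph codes are made up for illustration and are not taken from the poster's file; only the `beginbfchar`/`endbfchar` layout follows the way ToUnicode CMaps are written in PDFs. It shows one code that maps to two Unicode characters, as a combined glyph would, and what is left when an entry is missing.

```python
import re

# Hypothetical ToUnicode fragment: <glyph code> <text as UTF-16BE hex>.
TOUNICODE_CMAP = """
2 beginbfchar
<0001> <0E81>
<0002> <0E810EC8>
endbfchar
"""

def parse_bfchar(cmap_text):
    """Return {glyph_code: unicode_string} from the bfchar sections of a CMap."""
    mapping = {}
    for section in re.findall(r"beginbfchar(.*?)endbfchar", cmap_text, re.S):
        pairs = re.findall(r"<([0-9A-Fa-f]+)>\s*<([0-9A-Fa-f]+)>", section)
        for code_hex, text_hex in pairs:
            mapping[int(code_hex, 16)] = bytes.fromhex(text_hex).decode("utf-16-be")
    return mapping

def decode(glyph_codes, mapping):
    """Translate glyph codes to text; U+FFFD marks codes with no mapping."""
    return "".join(mapping.get(code, "\ufffd") for code in glyph_codes)

mapping = parse_bfchar(TOUNICODE_CMAP)
page_codes = [1, 2, 3]  # code 3 deliberately has no entry in the map

print(decode(page_codes, mapping))  # mapped codes become Lao text
print(decode(page_codes, {}))       # no map at all: nothing can be recovered
```

A real converter usually guesses instead of emitting U+FFFD, which is why the exported Word file shows wrong characters rather than obvious placeholders.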

In your case, the PDFs were most likely exported from InDesign without a complete ToUnicode map. This is a common problem, and it effectively makes the text "un-copyable" and "un-convertible." If you still have the InDesign files, re-exporting is the easiest fix. Go to File > Adobe PDF Presets and choose a preset like "High Quality Print" or "Press Quality"; these will almost always embed the necessary font information, including the ToUnicode map.

Also, to make sure the document is accessible and searchable, produce a PDF/A file. Go to File > Export and, in the dialog box, select the PDF/A standard you want to use (such as PDF/A-1a) if your export dialog lists one. PDF/A requires that all fonts are embedded, and the "a" conformance level also requires that characters map to Unicode, which should prevent the text scrambling issue.
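After re-exporting, you can check whether the fonts actually carry a ToUnicode entry. The sketch below assumes the third-party pypdf package is installed, and `exported.pdf` is a placeholder file name. It only inspects the fonts listed in each page's resources, so fonts used inside form XObjects are not covered. A ToUnicode entry that is present can still be incomplete.

```python
def fonts_without_tounicode(fonts):
    """Return the names of font dictionaries that lack a /ToUnicode entry."""
    return sorted(name for name, font in fonts.items() if "/ToUnicode" not in font)

def report(pdf_path):
    """Print, per page, the fonts that have no ToUnicode map (requires pypdf)."""
    from pypdf import PdfReader  # third-party: pip install pypdf

    reader = PdfReader(pdf_path)
    for number, page in enumerate(reader.pages, start=1):
        if "/Resources" not in page or "/Font" not in page["/Resources"]:
            continue
        # Resolve indirect references so each value is a font dictionary.
        fonts = {
            name: ref.get_object()
            for name, ref in page["/Resources"]["/Font"].items()
        }
        missing = fonts_without_tounicode(fonts)
        if missing:
            print(f"page {number}: no ToUnicode for {', '.join(missing)}")

# Example call (placeholder path):
# report("exported.pdf")
```

Fonts that use a standard encoding can extract correctly without a ToUnicode map, so treat a hit as a lead to investigate for the Lao and Khmer fonts, not as proof of a problem.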

New Here, Aug 27, 2025

Thank you so much! If this works, you'll be a lifesaver. One thing: when I go to export, the PDF/A standard is unavailable—I'm only seeing the PDF/X option. Do I need to adjust another setting to get the PDF/A standard?

JMHCA_0-1756317143041.png

 

New Here, Sep 04, 2025

Learned from Adobe Support that the only way to generate a PDF/A from InDesign is with the Adobe PDF printer. Hopefully that helps anyone else who has this issue!

Adobe Employee, Sep 07, 2025

Hello @JMHCA

 

Thank you for sharing the steps. For future reference, these Adobe articles cover the process:

PDF/X-, PDF/A-, and PDF/E-compliant files (Acrobat Pro).

How to convert a PDF to a PDF/A.

How to export InDesign Book (indb) to PDF/A.

 

Thanks,

Anand Sri | Acrobat Community Team
