May 30, 2017
Question

OCR not embedding fonts


The task: scan a document, bring it into Acrobat, run OCR, save the result as a PDF, then open and edit the text in Illustrator.

No matter what I try, all I get is a bitmapped image. When I inspect the file in Acrobat, the fonts used are listed, but none are embedded, and I cannot find a way to force Acrobat to embed them.


1 reply

Dov Isaacs
Legend
May 30, 2017

The text from OCR is “hidden” and as such there is no good reason to embed the font. (The results of the OCR action are for search purposes, etc.) What is displayed is the scanned image itself, so that you see a faithful view of your original document.
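
To make “hidden” concrete: in PDF terms, an OCR text layer is typically drawn with text rendering mode 3 (the `3 Tr` operator), which paints neither fill nor stroke, on top of the scanned image. The Python sketch below is only an illustration of that idea, not how Acrobat works internally. It assumes you already have a page's content stream decompressed and decoded to a string; the function names `strip_strings` and `count_text_by_visibility` are hypothetical, and only mode 3 is counted as invisible.

```python
# Text-showing operators in the PDF content stream language.
SHOW_OPS = {"Tj", "TJ", "'", '"'}


def strip_strings(stream: str) -> str:
    """Replace literal (...) and hex <...> strings with the placeholder STR,
    so operator names that occur inside string data are not miscounted."""
    out = []
    i, n = 0, len(stream)
    while i < n:
        if stream.startswith("<<", i):      # dictionary start, not a hex string
            out.append("<<")
            i += 2
        elif stream[i] == "(":              # literal string, may nest and escape
            depth = 1
            i += 1
            while i < n and depth > 0:
                if stream[i] == "\\":       # skip the escaped character
                    i += 2
                    continue
                if stream[i] == "(":
                    depth += 1
                elif stream[i] == ")":
                    depth -= 1
                i += 1
            out.append(" STR ")
        elif stream[i] == "<":              # hex string
            end = stream.find(">", i)
            i = n if end == -1 else end + 1
            out.append(" STR ")
        else:
            out.append(stream[i])
            i += 1
    return "".join(out)


def count_text_by_visibility(stream: str) -> dict:
    """Count text-showing operations drawn visibly vs. in rendering mode 3."""
    tokens = strip_strings(stream).replace("[", " ").replace("]", " ").split()
    mode = 0      # default text rendering mode: fill (visible)
    saved = []    # q / Q save and restore the graphics state, including Tr
    counts = {"visible": 0, "invisible": 0}
    for idx, tok in enumerate(tokens):
        if tok == "Tr" and idx > 0 and tokens[idx - 1].isdigit():
            mode = int(tokens[idx - 1])
        elif tok == "q":
            saved.append(mode)
        elif tok == "Q" and saved:
            mode = saved.pop()
        elif tok in SHOW_OPS:
            counts["invisible" if mode == 3 else "visible"] += 1
    return counts


# A made-up page: a scanned image, then OCR text drawn in mode 3.
sample = "q 612 0 0 792 0 0 cm /Im0 Do Q BT 3 Tr /F1 12 Tf 72 700 Td (Hello Tj world) Tj ET"
print(count_text_by_visibility(sample))  # {'visible': 0, 'invisible': 1}
```

A page whose text is all “invisible” and which also paints a full-page image is the usual signature of a scanned document with an OCR layer: the image is what you see, and the text exists only for searching and copying.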

Conceivably, you could use Acrobat DC Preflight to force the fonts to be embedded, but such embedding of fonts for hidden text isn't going to buy you anything.

With regard to Illustrator, remember that Adobe Illustrator is not, repeat not, repeat yet again not a general-purpose PDF file editor. Illustrator supports only a subset of the PDF imaging model and, alas, that hidden OCR text is not supported for conversion purposes.

If you are simply trying to extract the OCR'ed text, try exporting the PDF to Microsoft Word.

          - Dov

- Dov Isaacs, former Adobe Principal Scientist (April 30, 1990 - May 30, 2021)