Acrobat Pro OCR ends up with different fonts to system fonts.

Question

I have scanned a document into a PDF file. I then use the Acrobat Pro Edit PDF tool to recognise the text in the document.

It finds all the text in the document, but the fonts it uses/embeds have a number at the end of the font name.

So for example, text that is Times Roman font, will be Times Roman-13009 or Times Roman 13009 as the font name in Edit PDF font section.

What then happens is that when I go to edit the text, such as deleting or changing a word, the new text doesn't match and looks different.

What is causing this to happen? Why does the Recognise Text function give the font a name with a number after it?

Under Document Properties/Font, it says the font is embedded, which I am assuming is because my system does have the standard Times Roman font installed. But having the number after it in the font name makes it look like it's a different font.

Does anyone know about this issue and how to fix it or prevent it from happening?

Thank you.

Brad @ Roaring Mouse · Accepted Answer

This is under "How things work". When Acrobat does OCR on a scanned document, unless you set it to use a system font (see below), it will create a fake font outline that looks as close to the scanned version as possible. You can see this if you zoom in on the letters. It then subsets that new fake font and gives it a name, and there may be several. The name may not even be the font in the scan. It is merely there so that the text is accessible: if you export the text to Word, say, it has a real font to connect to, and it tends to be something simple like Times, Arial or Helvetica.

Like any embedded subset font, the subset contains only the characters that were USED. As soon as you type any character not already used, Acrobat will need to use a system font for those new characters, which is why the new text looks different. Like any PDF, you really should not be doing any extensive editing in it anyway. This is not the right place.

Now, you can CHANGE the font by selecting all the text in a paragraph and changing it to REAL Times (or any font you have that's close). Then you can edit more successfully, and the resulting file will embed and subset that font instead.

OR, you can tell Acrobat to use a system font when it recognizes text. This will use a common font (like Times), but it likely won't match the scan.
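
To make the subset point concrete, here is a minimal Python sketch of the idea. It is only an illustration, not how Acrobat works internally: the function name `missing_from_subset` and the sample strings are made up for this example, and it assumes the subset holds one glyph per non-space character that appeared in the recognized text.

```python
def missing_from_subset(recognized_text, edited_text):
    """Return the characters in edited_text that a subset font
    built from recognized_text would have no glyph for."""
    # Assumption: the subset holds a glyph for each distinct
    # non-whitespace character that appeared in the OCR'd text.
    subset = {ch for ch in recognized_text if not ch.isspace()}
    needed = {ch for ch in edited_text if not ch.isspace()}
    # Anything needed but not in the subset has to come from another font.
    return needed - subset


recognized = "The quick brown fox"
edited = "The quick brown dog"
print(sorted(missing_from_subset(recognized, edited)))  # ['d', 'g']
```

In this example the edit introduces "d" and "g", neither of which appeared in the recognized text, so those two letters would have to come from a different font. That is the mismatch you see when you change a word.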
