
Acrobat Pro OCR ends up with different fonts to system fonts.

New Here,
Sep 29, 2024

I have scanned a document into a PDF file. I then use the Acrobat Pro Edit PDF tool to recognise the text in the document.

It finds all the text in the document, but the fonts it uses/embeds have a number at the end of the font name.

So, for example, text that is in the Times Roman font will have Times Roman-13009 or Times Roman 13009 as the font name in the Edit PDF font section.

What then happens is that when I go to edit the text, such as deleting or changing a word, the new text doesn't match and looks different.

What is causing this to happen? Why does the Recognise Text function add a number after the font name?

Under Document Properties/Font, it says the font is embedded, which I assume is because my system does have the standard Times Roman font installed. But with the number after it in the font name, it's like it's a different font.

 

Does anyone know about this issue and how to fix it or prevent it from happening?

 

Thank you.

TOPICS
Edit and convert PDFs , PDF , Scan documents and OCR
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
1 ACCEPTED SOLUTION
Community Expert,
Sep 29, 2024

This falls under "how things work."

When Acrobat runs OCR on a scanned document, unless you set it to use a system font (see below), it creates a fake font outline that looks as close to the scanned version as possible. You can see this if you zoom in on the letters. It then subsets that new fake font and gives it a name, and there may be several such fonts. The named font may not even be the font in the scan. The name is merely there so that the text is accessible: if you export the text to Word, say, it has a real font to connect to, and it tends to be something simple like Times, Arial or Helvetica.

As with any embedded subset font, the subset contains only the characters actually USED. As soon as you type any character not already used, Acrobat needs to use a system font for those new characters. As with any PDF, you really should not be doing any extensive editing in it anyway. This is not the right place for that.
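
To make the subset idea concrete, here is a small conceptual sketch in Python. It is not how Acrobat works internally; it only models a subset font as "the set of characters that appeared in the recognized text" and reports which characters of an edit would fall outside that set. The function name `chars_outside_subset` and the sample strings are made up for illustration, and whitespace is ignored as a simplifying assumption.

```python
def chars_outside_subset(recognized_text: str, edited_text: str) -> list[str]:
    """Return the characters in edited_text that a subset built from
    recognized_text would not contain (hypothetical model, whitespace ignored)."""
    # The "subset" here is just the distinct characters OCR actually found.
    subset = {ch for ch in recognized_text if not ch.isspace()}
    # Characters the edited text needs to display.
    needed = {ch for ch in edited_text if not ch.isspace()}
    # Anything needed but not in the subset would have to come from another font.
    return sorted(needed - subset)


recognized = "The quick brown fox"
edited = "The quick brown fox jumps"
print(chars_outside_subset(recognized, edited))  # ['j', 'm', 'p', 's']
```

In this toy example the letter "u" is already in the subset, so only "j", "m", "p" and "s" would need a different font, which is why an edited word can end up as a mix of matching and non-matching letters.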

Now, you can CHANGE the font by selecting all the text in a paragraph and changing it to REAL Times (or any font you have that's close). You can then edit more successfully, and the resulting file will embed and subset that font instead.

 

OR, you can tell Acrobat to use a system font when it recognizes text, like so:

 

Screen Shot 2024-09-30 at 12.09.28 AM.png

This will use a common font (like Times), but it likely won't match the scan.

New Here,
Sep 30, 2024

Thank you for the detailed explanation; it has helped me solve my issue. I used the "Available System Font" option in Recognize Text, and the fonts now use just the font name, with no numbers, and edited text matches.

Community Expert,
Sep 30, 2024

Good to hear!

New Here,
Oct 01, 2024

I'm wondering if you might know how to do this. You mentioned above selecting the text in a paragraph and then changing the font. What if you want to change the font for multiple blocks of text on the page, but each paragraph is in a different text box?

 

Is there a way to select more than one text box so as to change font for multiple boxes?

Also, is there a quick way to select multiple text boxes at once, instead of one by one?

Thanks.

 

Community Expert,
Oct 02, 2024

"Is there a way to select more than one text box so as to change font for multiple boxes?"

I don't think so.

I think you might need to rethink your approach. Rather than just trying to make a scanned PDF usable, I would just use it to export the recognized text (say as a Word file) and use that to rebuild a proper file, be it in InDesign or otherwise.

New Here,
Dec 06, 2024

Hello! Thank you for this comment. I am having a similar issue to the original poster, but am not able to find the "Use available system font" option. Where is this option found? Has it been removed from Acrobat? 

Community Expert,
Dec 06, 2024

Depending on which view of Acrobat you have ("new" or "old"), it's in one of these two spots:

Screen Shot 2024-12-06 at 4.57.51 PM.png

 

Screen Shot 2024-12-06 at 5.01.13 PM.png

New Here,
Mar 02, 2025

Thank you so much, Brad, for explaining where I can switch off the replacement with system fonts in the “new” layout. I was almost at my wits' end because the typeface was completely distorted every time I edited it.
