
OCR Software for renderable text

Explorer, Nov 21, 2022

Hello,

I know that Acrobat can't OCR pages that already contain renderable text. I also know the workarounds, such as printing to PDF as an image first and then running OCR again. However, I work with very large PDFs (thousands of pages), and with several of them. Printing each one to PDF and then running OCR would take a very long time, so I'm hoping to skip the print-to-PDF-as-image step.

I tried sanitizing the documents, but even after sanitization Acrobat still can't OCR some of these pages. I think some fonts are missing (these are not our documents): although Acrobat displays the text correctly, it is read as gibberish, which I confirmed by copying and pasting it.
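
To make the "gibberish" check less manual than copy and paste, here is a rough sketch of how the affected pages could be flagged automatically. It assumes the text of each page has already been extracted into a list of strings by whatever tool is at hand; the function names and the 0.2 threshold are hypothetical choices for illustration, not anything Acrobat or PDF Pro provides. It only catches text that comes out as private-use, unassigned, control, or replacement characters, so a page whose broken font mapping yields ordinary but wrong letters would slip through.

```python
import unicodedata

# Unicode categories that usually signal a broken font-to-text mapping:
# Co = private use, Cn = unassigned, Cc = control, Cs = surrogate.
SUSPECT_CATEGORIES = {"Co", "Cn", "Cc", "Cs"}


def gibberish_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters that look unreadable."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    suspect = sum(
        1
        for c in chars
        # U+FFFD is the replacement character emitted for undecodable glyphs.
        if unicodedata.category(c) in SUSPECT_CATEGORIES or c == "\ufffd"
    )
    return suspect / len(chars)


def pages_needing_ocr(page_texts, threshold=0.2):
    """Return 1-based page numbers whose extracted text exceeds the threshold.

    The threshold is an assumed starting point and would need tuning
    against real documents.
    """
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if gibberish_ratio(text) > threshold
    ]
```

With a list like this, only the flagged pages would need the print-to-image workaround instead of the whole document.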

I found a program called PDF Pro at pdfpro.com that seems to OCR the pages with this gibberish renderable text and correct it to what it should be. That avoids the print-to-PDF step.

This made me wonder whether other applications can OCR pages with renderable text. If I have to spend money, I might as well get one that comes highly recommended, but I'm not versed in this field. Can anyone suggest other software I can demo that may be faster, better, or have more options?

Could someone also explain why Acrobat doesn't replace renderable text during OCR while other apps can? I'm not sure how the other apps accomplish this, and I'm curious about the tech behind it. Is there a reason Acrobat doesn't want to implement such a feature?

Thanks.

Topics: Edit and convert PDFs, General troubleshooting, How to