I have a large PDF of an old book that I'm trying to convert to text in order, ultimately, to create an ebook. The print-ready PDF was supplied by the printer, but we don't have access to the original InDesign (or whatever software was used) layout files from some 20-odd years ago. The PDF is essentially just page images, so the text needs to be freshly OCRed, and I'm trialling the latest Adobe Acrobat DC for this purpose.
OCR seems to work quite well on the text that Acrobat recognises, but it is passing over large slabs of text. The image below shows what I mean.
Is there a way I can force Acrobat to OCR regions not automatically identified as text?
Please try a different OCR option. Go to Tools > Enhance Scans > Recognize Text > In This File > Recognize Text.
It should recognize the text properly, but it won't allow you to edit it. You can still correct any text it recognized incorrectly using the "Correct Recognized Text" option in the drop-down.
Thanks.
I have a question. Can I automate OCR so that I can search within folders of scanned documents and images, without converting the images or scanned documents to editable text?
Not exactly, but you can do it using the Advanced Search functionality.
Instead of selecting "Editable Text & Images", select "Searchable Image (Exact)". It will not make the PDFs editable, but it will add a text layer on top of the images or scanned documents. You can also save these documents as copies instead of changing the originals.
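If you end up scripting the folder part yourself, here is a rough Python sketch of the bookkeeping only: it walks a folder of scans, pairs each file with a PDF path in a separate output folder so the originals are never touched, and hands each pair to an OCR step. The names `plan_copies`, `run_batch`, `SCAN_SUFFIXES` and the `ocr_step` callable are all hypothetical; `ocr_step` is a placeholder for whatever tool actually does the recognition and is not an Acrobat API.

```python
from pathlib import Path
from typing import Callable, List, Tuple

# Assumption: the file types treated as scanned input. Adjust as needed.
SCAN_SUFFIXES = {".pdf", ".tif", ".tiff", ".png", ".jpg", ".jpeg"}


def plan_copies(source_root: Path, output_root: Path) -> List[Tuple[Path, Path]]:
    """Pair each scanned file under source_root with a PDF path under output_root."""
    pairs = []
    for src in sorted(source_root.rglob("*")):
        if not src.is_file() or src.suffix.lower() not in SCAN_SUFFIXES:
            continue
        if output_root in src.parents:
            continue  # do not re-process earlier output nested inside the source tree
        relative = src.relative_to(source_root)
        if src.suffix.lower() != ".pdf":
            # Keep the image name and append .pdf so a.png and a.pdf cannot collide.
            relative = relative.with_name(relative.name + ".pdf")
        pairs.append((src, output_root / relative))
    return pairs


def run_batch(pairs: List[Tuple[Path, Path]],
              ocr_step: Callable[[Path, Path], None]) -> int:
    """Call ocr_step(src, dst) for each pair whose copy does not exist yet."""
    processed = 0
    for src, dst in pairs:
        if dst.exists():
            continue  # already handled in an earlier run
        dst.parent.mkdir(parents=True, exist_ok=True)
        ocr_step(src, dst)  # hypothetical: writes a searchable copy to dst
        processed += 1
    return processed
```

You would call `run_batch(plan_copies(Path("scans"), Path("ocr_copies")), my_ocr_step)`, where `my_ocr_step` is your own function that writes a searchable copy of the first path to the second. Because the copies live in their own folder, a second run skips anything already processed and the source scans stay as they were.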
I hope it will resolve your issue.
Thanks.