Participant
July 15, 2016
Question

How can I convert JPG images of a periodical to PDFs that are searchable?

  • July 15, 2016
  • 2 replies
  • 482 views

I have tried using Enhance Scan and then Recognize Text, but the result is not searchable. When I copy and paste text from the resulting file, I get gibberish, so I assume the file does not contain recognizable text. The answers to this question will determine whether I purchase this product after the 30-day trial. I have about 10,000 JPG files to convert and make searchable and usable.

This topic has been closed for replies.

2 replies

tracy8872
Participant
August 5, 2016

I use some third-party converters for it...

Legend
August 5, 2016

Then you need to examine how to make your third-party app work for you.

1. JPEG is not suitable. Use TIFF. Or better still, scan straight to PDF; any decent volume scanning software should be able to do this.

2. Look at the resolution setting in your third-party app. Many people just scan with the default settings, but these are rarely right or useful.

3. Acrobat is a desktop product for very low volume automation. Expecting it to handle 10,000 pages is very optimistic and may cause much pain, no matter how good Acrobat's OCR is.

4. OCR tools typically include an option to verify and correct all of the recognized text. This is a very time-consuming process for 10,000 pages - maybe a month's non-stop work if you can proofread and correct a page in a minute - but it is the only path to perfection. Such a large job is frequently outsourced to a country where labour is cheaper.
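
To put a rough number on point 4, here is a minimal sketch of the arithmetic. The one-minute-per-page rate comes from the estimate above; the 8-hour working day is my own assumption, and the function name is purely illustrative.

```python
def proofreading_estimate(pages, minutes_per_page=1.0, hours_per_day=8.0):
    """Return (total hours, working days) needed to proofread `pages` pages.

    hours_per_day is an assumed working-day length, not a figure from the thread.
    """
    total_hours = pages * minutes_per_page / 60
    working_days = total_hours / hours_per_day
    return total_hours, working_days


hours, days = proofreading_estimate(10_000)
print(f"{hours:.0f} hours, about {days:.0f} working days")
# prints "167 hours, about 21 working days"
```

At 8 hours a day that is roughly a month of working days, and any page that takes longer than a minute pushes the total up proportionally.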

Bernd Alheit
Community Expert
July 16, 2016

Maybe the resolution of the scans is too low.

When you scan as an image, use TIFF format at a resolution of at least 300 dpi.
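
One way to check existing JPGs against the 300 dpi guideline is to work out the effective resolution from the pixel dimensions and the physical page size, since the dpi value stored in an image file may be missing or wrong. A minimal sketch; the function names and the sample page size (8.5 x 11 inches) are assumptions for illustration, so substitute the real dimensions of the periodical.

```python
def effective_dpi(pixel_width, pixel_height, page_width_in, page_height_in):
    """Effective scan resolution, taking the lower of the two axes."""
    return min(pixel_width / page_width_in, pixel_height / page_height_in)


def meets_ocr_minimum(pixel_width, pixel_height, page_width_in, page_height_in,
                      minimum_dpi=300):
    """True if the image reaches the suggested minimum resolution for OCR."""
    dpi = effective_dpi(pixel_width, pixel_height, page_width_in, page_height_in)
    return dpi >= minimum_dpi


# A 1700 x 2200 pixel image of an 8.5 x 11 inch page is only 200 dpi.
print(effective_dpi(1700, 2200, 8.5, 11))      # 200.0
print(meets_ocr_minimum(1700, 2200, 8.5, 11))  # False
print(meets_ocr_minimum(2550, 3300, 8.5, 11))  # True
```

If the files come out well below 300 dpi, rescanning is likely to help more than any change to the OCR settings.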