Known Participant
August 15, 2023
Answered

How to OCR a combined file of scanned text without reducing size of text?


I scan a bunch of text as large images (each image has ~1500 words).
I combine the images in Acrobat. They are very high resolution and easy to read.

I run Scan & OCR, then Recognize Text.
When I am done, it has reduced the size of the images by 1/4, making them low resolution and difficult to read as images in the PDF.

(Note: the OCR is fine; it is the actual image of the text that is the problem.)

How do I keep the original size/resolution when I run OCR (Recognize Text)?

2 replies

Bernd Alheit
Community Expert
Correct answer
August 16, 2023

As output try "Searchable Image (Exact)"

Known Participant
August 17, 2023

That did it! Thanks. 
I saw that option but didn't consider that it would solve the problem. I thought 600dpi was enough! 🙂

 

Bernd Alheit
Community Expert
August 15, 2023

What OCR options do you use?

Known Participant
August 15, 2023

Downsample to 600 dpi.

The original files are
11 inches by 7 inches at 300 dpi,
or 44 inches by 28 inches at 72 dpi.
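
As a quick check on the two descriptions above, physical size multiplied by resolution gives the pixel dimensions. The sketch below is only that arithmetic; the helper name `pixel_size` is made up for illustration and is not part of Acrobat.

```python
def pixel_size(width_in, height_in, dpi):
    """Return (width_px, height_px) for a given physical size and resolution."""
    return round(width_in * dpi), round(height_in * dpi)


# The two descriptions of the original files from the post
print(pixel_size(11, 7, 300))   # (3300, 2100)
print(pixel_size(44, 28, 72))   # (3168, 2016)
```

The two results are close but not identical, which suggests the inch figures in the post are rounded. Either way, the originals are roughly 3300 x 2100 pixels.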

I don't know what they are after conversion, but this is what it looks like before and after at 100%.
The PDF says the file is 11 x 7.

before 
https://www.dropbox.com/scl/fi/oqf6166d7gz4cww8nw6e8/original-file-size.png?rlkey=e4pncwy4qfix6acow3aw70yx1&dl=0

after
https://www.dropbox.com/scl/fi/tftbdppbnladq2kwdrddc/after-ocr-file-size.png?rlkey=ebra9lb3zs9iyc3fi6gkre9g6&dl=0
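
One way to put a number on a before/after difference like this: if the page size stays at 11 x 7 inches, the effective resolution is the image's pixel width divided by the page width in inches. The sketch below assumes you can read the pixel width of the embedded image from some tool. The second pixel width is a hypothetical value chosen only to show the calculation, not a measurement from the files in this thread.

```python
def effective_dpi(pixels, inches):
    """Effective resolution of an image placed on a page: pixels per inch."""
    return pixels / inches


page_width_in = 11  # page width reported by the PDF

# 3300 px is the original width implied by 11 inches at 300 dpi
print(effective_dpi(3300, page_width_in))  # 300.0

# Hypothetical: an image downsampled to half the pixel width
print(effective_dpi(1650, page_width_in))  # 150.0
```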