Participant
September 3, 2021
Question

OCR working only on some scanned images


All of a sudden I'm having issues with OCR in Acrobat Pro DC. I have a PDF containing scans of 10 DVD slip covers, and I'm trying to turn the back descriptions into selectable text so I can copy and paste them into a database app. When I tell Acrobat to recognize text, only 2 of the scanned images turn into selectable text (and one of those is only a few words). This seems like it should be pretty straightforward, so what's going on? Here's a video demonstrating how OCR works perfectly on the second page but not at all on the first. The only thing I can think of that may be an issue is the darker background on the first page, but the text is so clear that I can't imagine Acrobat isn't advanced enough to distinguish it. It's not like it's dark gray text on a black background.

 

Video Link:

https://youtu.be/s2ZPMsduOKQ


1 reply

Amal.
Legend
September 3, 2021

Hi Usagora

 

Hope you are doing well, and sorry for the trouble. As described, Scan & OCR is working on some pages but not on others.

 

Does this happen with one particular PDF file or with all PDFs? Please try a different PDF file and check. Also, please share the sample PDF file so that we can test it on our end.

 

Would you mind sharing the version of Adobe Acrobat DC you are using? To check the version, go to Help > About Acrobat. To make sure you have the latest version installed, go to Help > Check for updates, then reboot the computer once.

 

Also try to reset the Acrobat preferences as described in the help page https://community.adobe.com/t5/acrobat-discussions/how-to-reset-acrobat-preference-settings-to-default/td-p/4792082 and see if that works.

 

You may also try creating a new test user profile with full admin rights, or enabling the root account on Mac, and check whether the application behaves the same way there.

 

Regards

Amal

usagora (Author)
Participant
September 3, 2021

I have the latest version: 

Adobe Acrobat Pro DC
Continuous Release | Version 2021.005.20058

 

I've tried resetting preferences and restarting the computer, but the issue persists. I found a random image online of the back of a DVD and imported it into Acrobat to see if OCR would work on it. I've attached that image file. See if you can reproduce this:

 

1. open the image in Acrobat

2. click on Tools > Scan & OCR

3. click on Recognize Text > In This File

4. verify settings are:

  • Document Language = English (US)
  • Output = Searchable Image
  • Downsample To = 600 dpi

5. click on Recognize Text (blue button)

 

What happens for me is that all the text in the upper half of the image (above the yellow-outlined "bonus features" box) is NOT selectable, whereas almost all the text below that box IS selectable.

 

So it appears to be a crapshoot as to what you actually get after running OCR.  Sometimes all, sometimes nothing, and sometimes partial.
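
One way to put a number on the darker-background guess from the original question is to sample a text color and a background color from a region where OCR failed and from one where it succeeded, then compare their contrast. The sketch below uses the WCAG relative-luminance formula as a stand-in measure. The function names and the sample colors are hypothetical, and nothing here reflects how Acrobat's OCR engine actually decides what counts as text.

```python
# Minimal sketch: WCAG-style contrast ratio between two sRGB colors.
# Assumption: colors are (R, G, B) tuples with 8-bit channels, sampled by hand
# from the scan (for example with an eyedropper tool in an image editor).

def _linearize(channel_8bit):
    """Convert one 8-bit sRGB channel to linear light (WCAG 2.0 constants)."""
    c = channel_8bit / 255.0
    if c <= 0.03928:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb):
    """Relative luminance in [0, 1] of an (R, G, B) color."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """Contrast ratio between two colors; order of arguments does not matter."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


if __name__ == "__main__":
    # Hypothetical samples, NOT measured from the attached image.
    regions = {
        "upper half (OCR failed)": ((230, 230, 230), (40, 30, 60)),
        "lower half (OCR worked)": ((20, 20, 20), (245, 240, 200)),
    }
    for name, (text_color, background_color) in regions.items():
        ratio = contrast_ratio(text_color, background_color)
        print(f"{name}: contrast ratio {ratio:.2f}")
```

Pure white on pure black gives the maximum ratio of 21, and two identical colors give 1. If the failing and working regions show similar ratios, contrast alone probably does not explain the difference, and other factors (resolution, font, or background texture) would be worth checking next.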