Participant
August 24, 2018
Question

OCR scanning with Adobe Acrobat Pro DC.


I have been using Acrobat for about 15 years. Since I moved to Acrobat DC Pro, I have had problems with OCR. Material that I would normally expect to be OCRed is not. I work with a lot of old newspaper files that are available as PDFs, or that I download as JPGs and turn into PDFs. I am used to using Acrobat to make the text in these files available to copy into Word or Notepad, with the usual minor OCR corrections one expects with Acrobat.

However, this does NOT seem to work with Acrobat Pro. Right now I am working with a newspaper page, and only a small portion of it is recognized as text when I click Edit Document or go to the Edit Document tool. Acrobat treats the rest of the text as an image.

If I am going to continue my work as I have done it for years, then I either need to find out how to do OCR on text PDFs with Acrobat DC Pro as I did for years with earlier versions of Acrobat, or I need to find another program to do OCR on text PDFs.


Help

Tony T

West Palm Beach, Florida

This topic has been closed for replies.

2 replies

Lovekesh Garg
Adobe Employee
August 27, 2018

This may be because the PDF you are using has already been partially OCRed.

Please try Tools > Enhance Scans > Recognize Text > In This File > Settings.

Select the "Searchable Image Exact" option and click the Recognize Text button.

If the issue is still not resolved, please share the following information:

- The Acrobat version you are using

- OS details

- Sample files that show the issue (How to share a file using Adobe Document Cloud)

Thanks.

Participant
August 27, 2018

First of all, I am working with Windows 10 and Acrobat DC Pro 2018.011.

Yes, a portion of it works well, but the rest of it was treated only as an image file.

I OCRed the article I wanted by uploading the PDF to my Google Drive and opening it with Google Docs. The article was OCRed as well or as badly as I expect with a scanned newspaper from the 1940s.

I will try the steps you suggest should I run into this problem again, since I do a lot of research in newspaper databases.

Thank you

Tony


gary_sc
Community Expert
August 25, 2018

Hi Tony,

I'm not all that sure what's taking place, because there "should" not be any problems with Acrobat DC Pro doing OCR on images (JPEG or TIFF). As such, there is either a problem with one of your settings or your procedure, or a conflict with the way you have your system set up.

But obviously you're having a problem, so let's see if we can figure out what the problem is.

First off, where are you getting these images? Are you making them yourself? Do you download them? Where are they from?

Also, you talk about a "newspaper page." Does that mean you're working with an entire page from a newspaper? That means there are many different stories, ads, images, etc. Are these the same kind of documents you were working with before with previous versions of Acrobat? Can you share one of these pages?

A lot of folks think that when they take an image and turn it into a PDF, their job is done, not realizing that an OCRed image is a much more robust document than an image-only PDF. So does it make a difference whether you download an image PDF or scan the newspaper yourself into a PDF?

Lastly, just to make sure I understand all that you have, are you on a PC or a Mac, what is the OS, and what release (version) number of Acrobat are you using?

I hope I can help you.