Acrobat doesn't OCR text - leaves it as images
May I have some help please?
I use OCR so that I can highlight and mark up the text of scanned or prepared PDFs. Mostly these are unencrypted academic documents. I find that Acrobat Pro DC often does not OCR the text when I click Edit. Instead it leaves the scanned blocks as images, marked in the upper left-hand corner by the landscape icon and pop-up text that says "This is an image". Is DC capable of OCRing these images? If so, how do I force the application to OCR?
Thanks very much
So I've been experimenting. Acrobat DC OCR just doesn't seem to work on documents I download from academic sources. And there is no visual feedback from Acrobat to indicate a) that it might be progressing with an OCR, or b) that there is a problem.
I downloaded Abbyy FineReader and it has no problems with exactly the same documents.
I'd really appreciate some ideas here.
Thanks
Nudge
I have the same issue, but with Adobe Acrobat XI. It is very annoying when some pages only have 50% of their text converted into a searchable form.
As a lead test engineer, I have to scan in any manually executed tests. I don't care so much about the handwritten portion, but it is much easier to search the text of every document in a folder (1000s of pages) than to open each one and guess where the thing I am looking for might be. Eventually I will try out some competitors and make the switch if I find something better.
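For anyone with the same folder-search need, here is a minimal sketch of how it could be scripted outside Acrobat. It assumes Poppler's `pdftotext` command-line tool is installed (that is my assumption, not something used in this thread), and the function names `pdftotext_extract` and `search_pdfs` are made up for illustration. Besides the matches, it also lists files whose text layer comes back empty, which are likely the ones where OCR silently did nothing.

```python
import subprocess
from pathlib import Path
from typing import Callable, Iterable, List, Tuple


def pdftotext_extract(pdf: Path) -> str:
    """Return the text layer of a PDF.

    Assumption: Poppler's `pdftotext` is on PATH; "-" sends the text to stdout.
    """
    result = subprocess.run(
        ["pdftotext", str(pdf), "-"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def search_pdfs(
    pdfs: Iterable[Path],
    needle: str,
    extract: Callable[[Path], str] = pdftotext_extract,
) -> Tuple[List[str], List[str]]:
    """Return (files containing needle, files with no extractable text)."""
    hits: List[str] = []
    no_text: List[str] = []
    for pdf in pdfs:
        text = extract(pdf)
        if not text.strip():
            # Nothing to search: probably still image-only.
            no_text.append(pdf.name)
        elif needle.lower() in text.lower():
            hits.append(pdf.name)
    return hits, no_text
```

A call such as `search_pdfs(sorted(Path("scans").glob("*.pdf")), "voltage")` would then cover a whole folder; the folder name and search term are placeholders. The extractor is passed in as a parameter so a different text-extraction tool can be swapped in.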
I downloaded Acrobat Pro DC as a trial just to see if it would fix this exact issue. Either I get random pages that won't capture text, or Acrobat recognizes 1 character of text somewhere on the page and so WILL NOT capture the text on the rest of the page. Worse yet, I just got a set of PDFs that look fine and that I thought were text captured, but when I tried to search the text nothing came up. I copied the text from the PDF; see below for a comparison of what was on the screen and the text that was captured. Since Acrobat recognized the gibberish as text, I can't find a way to replace or recapture the text without reprinting. Anyone have a suggestion? I'm going to look into Abbyy FineReader in the meantime...
Screen view: WHEREAS, after review of the Company's operations . . .
Text from pdf for this same phrase: , (’*’% +$-1<0990>30? 71<2 0 &75 8-6AB; 7809-<376;
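A text layer like this can be flagged automatically before you rely on search. The following is only a rough heuristic sketch, not anything Acrobat does: it counts how many whitespace-separated tokens look like ordinary words and calls the text garbled when too few do. The function name `looks_garbled` and the 0.5 threshold are assumptions chosen for illustration, and the word pattern only covers unaccented Latin letters.

```python
import re

# Letters, optionally joined by an apostrophe or hyphen, with optional
# trailing punctuation. Anything else does not count as word-like.
WORD_RE = re.compile(r"^[A-Za-z]+(?:['’-][A-Za-z]+)*[.,;:!?]*$")


def looks_garbled(text: str, min_word_ratio: float = 0.5) -> bool:
    """Return True when fewer than min_word_ratio of the tokens are word-like."""
    tokens = text.split()
    if not tokens:
        return True  # an empty text layer is not searchable either
    wordlike = sum(1 for token in tokens if WORD_RE.match(token))
    return wordlike / len(tokens) < min_word_ratio


screen = "WHEREAS, after review of the Company's operations"
copied = ", (’*’% +$-1<0990>30? 71<2 0 &75 8-6AB; 7809-<376;"

print(looks_garbled(screen))  # False
print(looks_garbled(copied))  # True
```

Pages flagged this way are the ones worth re-OCRing by one of the workarounds in this thread.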
So it's not just me. Awesome. Maybe someone will fix it. Competitor, here I come.
Hi Paul,
Sorry for the issue you are facing. Can you please verify one thing:
- Open the PDF and click on the Edit tool.
- In the right-hand pane (RHP), under the 'Scanned Document' option, make sure 'Revert to Image' is available.
- If a 'Convert to Text' option is available instead, please click it. It will run OCR to recognize the text.
You can also try Enhance Scan tool > Recognize Text > In this File > Settings (select the Editable Text & Image option to run OCR) > Recognize Text.
If your document has some non-editable content, it will give an error message.
Please share a sample file and the error message (if any) using https://cloud.acrobat.com/send to help us identify and resolve the issue ASAP.
Thanks.
Here's a file that I can't get Acrobat to OCR at all:
https://files.acrobat.com/a/preview/c4ce00da-02db-44c2-b3ca-794be61da8db
R.
There is an issue running OCR on these files. I have logged a bug for you. The team will prioritize it and get back to you once it is fixed.
Thanks.
Hi Robert,
We have fixed, to some extent, the issue where Acrobat leaves the page as it is without running OCR. We confirmed this with the file you shared. Please install the latest update of Acrobat DC and let us know whether it is fixed at your end.
Thanks.
I have had this issue when a scanned document is embedded in a vector page: the scanned document would not OCR. This issue appears to be fixed. Lovekesh, can you send me the fix list for this issue?
https://files.acrobat.com/a/preview/c3d2e801-a9f2-49fc-84de-03d81e202cd0
Hi,
When I export this PDF to Word, the OCR tool works for the first 50 pages but leaves the rest as non-editable images.
I've tried the "Convert the PDF to TIFF and back, and then rerun OCR" method, but it doesn't work either.
Can you help me with that?
Thanks
Nicolas
Are these locked documents? If so, you cannot OCR them.
Hi Nicolas,
Sorry for the issue you are facing. We are looking into this issue. We will update you once we have identified a fix or workaround for the problem.
Meanwhile, can you please confirm a couple of things:
- Are you facing this issue only with this file, or are there multiple such files?
- If there are multiple files, is there any common pattern, such as content type, source of the files, or number of pages?
Thanks.

I came across this post while trying to fix a similar problem with OCRing legal discovery. The only solution I've found so far is to export the PDF as a TIFF, then import the TIFF into Acrobat and run OCR. Hope this helps someone!
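The same rasterize-then-OCR idea can be scripted when there are many files. This sketch does not use Acrobat at all: it swaps in two open-source command-line tools, Poppler's `pdftoppm` to render pages to TIFF and Tesseract to produce a searchable PDF per page. Both tools being installed is an assumption, and the helper names `rasterize_cmd`, `ocr_cmd` and `reocr` are made up for illustration.

```python
import subprocess
from pathlib import Path
from typing import List


def rasterize_cmd(pdf: Path, prefix: Path, dpi: int = 300) -> List[str]:
    """Command that renders every page of pdf to <prefix>-<page>.tif."""
    return ["pdftoppm", "-r", str(dpi), "-tiff", str(pdf), str(prefix)]


def ocr_cmd(tiff: Path) -> List[str]:
    """Command that OCRs one TIFF into a searchable PDF next to it.

    Tesseract appends ".pdf" to the output base name itself.
    """
    return ["tesseract", str(tiff), str(tiff.with_suffix("")), "pdf"]


def reocr(pdf: Path, workdir: Path) -> List[Path]:
    """Flatten pdf to images, OCR each page, return the per-page PDFs."""
    workdir.mkdir(parents=True, exist_ok=True)
    subprocess.run(rasterize_cmd(pdf, workdir / "page"), check=True)
    outputs: List[Path] = []
    # Page numbers may be zero-padded, so glob instead of guessing names.
    for tiff in sorted(workdir.glob("page-*.tif*")):
        subprocess.run(ocr_cmd(tiff), check=True)
        outputs.append(tiff.with_suffix(".pdf"))
    return outputs
```

Rasterizing throws away any existing (possibly broken) text layer, which is why a fresh OCR pass can then see the whole page. The per-page PDFs still need to be merged back into one file afterwards.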
I want to attach a one-page document that contains a graphic. Acrobat said it cannot process the page because of a graphic element.
You can share the document using Adobe Send:
- Launch your application
- Switch to the Tool Center view and open Send & Track
- Click on “Select Files to Send”
- A dialog will open from which you can choose the file(s) you want to share
- The workflow page will appear with the file(s) to be shared prepopulated
- Click on Create Link
- Your local files will be uploaded to the Document Cloud and a public link will be generated
- Share the link with us
Where is the answer to this? Many of us have the same problem.
There could be multiple reasons for this, such as a document that was already improperly OCRed, or partial live content on the page.
Can you please try the workaround in "Fix the OCR error Could Not Perform Recognition in Acrobat"?
You can also try Enhance Scan > Recognize Text > In this File > Settings > "Searchable Image (Exact)" as the OCR output style > Recognize Text.
Please share the document if the issue is still not resolved.
Thanks.
Did this ever get fixed? The fix offered here is not realistic since it is time consuming and with many documents I'll be spending all my time trying to fix them. Besides, I'm paying a monthly subscription, it should work. I sent the same doc to a friend with Foxit and it worked beautifully the first time. I really don't want to pay for a new system. Thanks.
Same here. I have 10 pages of clean documents with a gray background, and only 1 text out of those 10 pages got scanned. This is horrible.
I've found a workaround that seems to be working (although it is a bit time consuming on documents with multiple pages). I am also experiencing this problem, with a PDF that I created myself from scanned documents: the first 30% of the pages OCR'd OK, then the rest remained as images.
The workaround is to extract the affected pages from the PDF, open them in Photoshop, flatten the image and then save the file as a Photoshop PDF. Open the file in Acrobat and it recognises the text.
This worked for me today. Thanks.
I don't have Photoshop, so this isn't an option. I also can't share my document because it's internal business information. The only area that is not being read is an image of a table. The table data is all read except for the cells that had a background color change (highlighted background in Excel). I'm trying to extract that data from a PDF picture, and it's not reading the text on the colored background.
Hello - I am having an issue now with the latest update to Acrobat 2019. I used to be able to run text recognition on encrypted/secured files, be it a PDF or a scanned JPG. With the software update I can no longer do that. Unfortunately, there's no option to recognize text, only "edit file", which is clearly not possible for secured files. This is very annoying as I need to do this a lot in my job. I would be grateful if you could please advise.

