Participant

Question

Renderable text

September 18, 2012

I am running Adobe Reader X 10.1.4 and Adobe X Pro 10.1.4, but every time I scan in a document and then try to recognize the text, I get this error: "Acrobat could not perform recognition (OCR) on this page because: This page contains renderable text." I do not always get this problem. How can I fix it?

cthombor

Participating Frequently

Building from previous postings in this thread (thanks to everyone who contributed constructively over the years to workarounds for this longstanding defect), I have devised a new recipe. It worked for me just now in Acro Pro DC 2021.007.20091, on a 192-page 20 MB document that has dozens of bookmarks. Yay! (Standard disclaimer: YMMV. If the recipe below fails for you, please try to find a mod that works for you and post it here. If my recipe works for you, please upvote this posting so that others may find it.)

Open PDF
Print
Set printer driver to "Adobe PDF"
On print dialog window, click "Advanced" and set to "Print as Image". I use 150 dpi, but you may want a different quality setting. (If "Print as Image" is greyed-out, try unticking "Print in grayscale (black and white)" on the main "Print..." menu.)
On print dialog window, the original poster of this recipe had ticked the "Auto-Rotate and Center" box, with the following note: "Chances are you will have to review the document to set the correct paper orientation, because tables that have text vertical and horizontal will confuse the orientation detection."
On print dialog window, the original poster of this recipe had ticked the "Choose Paper Source by PDF page size" box, with the note: "This will allow different sized pages to be generated. Without it checked, the pages will be cropped."
Run OCR on the resulting saved document, and save.
Open original document. In older versions of Acrobat: select Document > Replace Pages, select the OCRed copy, then Replace all pages. In Acrobat DC: choose Tools > Organize Pages, select all thumbnails, click Replace from the Organize Pages toolbar, and choose the file with the pages you want to include. (It's a good idea to confirm that the documents have the same number of pages, otherwise all bets are off regarding the bookmarks in the new document.)
Save the results as a new file
Confirm (with some random testing) that bookmarks are accurate and that all pages are fully OCRd
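
For anyone weighing the dpi choice in the "Print as Image" step, here is a rough sketch of how the setting scales the rasterized page. The US Letter page size (8.5 x 11 in) and 24-bit color are my assumptions, not part of the recipe, and the function name is made up for illustration. Acrobat compresses the images, so real files will be much smaller than these raw figures; the point is only that doubling the dpi quadruples the pixel count.

```python
def raster_size(width_in, height_in, dpi, bytes_per_pixel=3):
    """Return (width_px, height_px, raw_bytes) for one page rasterized at `dpi`.

    bytes_per_pixel=3 assumes uncompressed 24-bit RGB (an assumption).
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px, height_px, width_px * height_px * bytes_per_pixel


# Compare a few common settings for an assumed US Letter page.
for dpi in (150, 300, 600):
    w, h, raw = raster_size(8.5, 11, dpi)
    print(f"{dpi} dpi: {w} x {h} px, about {raw / 1e6:.1f} MB uncompressed")
```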

BTW https://helpx.adobe.com/acrobat/kb/error-could-perform-recognition-acrobat.html is now thoroughly out of date.

coolfizzin

Participant

Thanks for the summary. Here's another tip.

I kept getting the "renderable text" error within PDFs that had been freshly minted from mere TIFF images, so that they really had no text whatsoever. I realized that this error really has nothing to do with text. When the OCR engine fails, it often gives this error, even if renderable text is not the real cause.

In my case, I determined that the real cause was the print size. The images did not have their print size or DPI set correctly, so that Acrobat thought they were 40" high. This apparently confused the OCR engine, which must rely on print size in some regard. All I had to do was use XnViewMP's "Batch Convert" feature to change the DPI of all of the TIFF files to the correct 600dpi before combining them into a PDF using Acrobat. After that, OCR worked correctly.
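
To make the print-size arithmetic concrete, here is a small sketch. The pixel dimensions and the wrong DPI value are hypothetical numbers picked to reproduce a 40" page; they are not taken from my actual files, and the function name is just for illustration.

```python
def print_size_inches(width_px, height_px, dpi):
    """Physical size implied by pixel dimensions and the stored DPI value."""
    return width_px / dpi, height_px / dpi


# Hypothetical scan: 5100 x 6600 pixels, i.e. US Letter scanned at 600 dpi.
width_px, height_px = 5100, 6600

# With a wrong DPI value in the metadata (165 is an assumed example),
# the same pixels are interpreted as a 40-inch-high page.
print(print_size_inches(width_px, height_px, 165))

# With the correct 600 dpi, the page comes out at 8.5 x 11 inches.
print(print_size_inches(width_px, height_px, 600))
```

The pixels are identical in both cases; only the stored DPI changes, which is why a batch metadata fix was enough and no rescanning was needed.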

Anonymous

Just use the following link. It is easy and super fast. It converts the renderable text into a Word document. However, you will still need to spend some 2-3 minutes on formatting the new Word doc.

Free Online OCR - convert scanned PDF and images to Word, JPEG to Word

Happy New Year

Sameer Pimpalkhute

dellacorwin

Participating Frequently

Some of these suggestions should work, but....

(I have the error message "Acrobat cannot OCR this page bc it contains renderable text" when trying to OCR a 350 page .pdf with Acrobat X Pro on a MacBook Pro)...when I send it to print, I do not have the choice of a print driver named "Adobe PDF," nor can I enter it.

I can do everything else under advanced, but the solution does not work, I guess because I am unable to select a print driver named "Adobe PDF."

Is there anyone who can enlighten me?

Test Screen Name

Legend

That driver (and that solution) no longer exist in Mac OS. You should only OCR scanned documents -- and only need to OCR scanned documents. Where is this document from/how is it made, and why do you want to OCR it?

dellacorwin

Participating Frequently

Thanks for your feedback.

PROJECT: 15 illustrated telecom training manuals, ranging from 99 to 350 pages, each with a hyperlinked within-the-document Table of Contents, require frequent updating, but the original Word doc files were missing, presumed deleted at an unknown point in time. I was asked to convert the 15 .pdfs back to Word .docx for time-sensitive editing and republication.

PROBLEM: The messed-up formatting in every single line of text when the .pdfs were saved to Word .docx was unacceptable. I was asked to rekey every manual and thought to myself, "no way."

SOLUTION: After several failures with OCR scanning b/c every other page, it seemed, had renderable text, I successfully converted one 236-page manual to individual .tif files and recombined them into one .pdf binder. The OCR scan finally worked perfectly. I saved the double converted file as a word docx for editing.

Waiting to hear from client if the new .docx file is free of substantive formatting errors, overt and hidden. He’d be happy with no formatting b/c it's easy to reformat the manuals as a whole, the style is set. The .ai illustrations will be removed and replaced with updated illustrations in the new .docx.

What would you do to get the 15 manuals back into Word .docx files, either free of formatting errors or unformatted, for frequent updating?

Linda

zork

Participating Frequently

This seems to be a recurring problem. The best summary of solutions that I have found is at http://nlsblog.org/2014/03/10/adobe-acrobat-renderable-text/. In particular, try Solution 4: "Sanitize". It seems to keep the best resolution and does not blow up the file size.

jdalian

Participant

I had PDF files that were experiencing the same issue, so I went to online2pdf.com, reconverted them to PDF (from PDF to PDF), tried OCR on the reconverted PDFs, and it worked for me.

bbrochstein

Participant

This worked for me! I tried a bunch of the other stuff here, too. My document was 350 pages long.

apangasa11851843

Adobe Employee

OCR first checks whether there is any renderable text present in the document. If it finds any text content already present, it quits with the message you mentioned. So the document on which you encounter the error probably already has some text content in it. This can be either a header/footer lying beyond the appropriate margins, or ClearScan (OCR) having been run while creating the document.

Would it be possible for you to share a sample document which has this issue?

PDF_Writer1

Participant

I found this and it worked for me:

How to fix "Text recognition" (OCR) for "Find" [Ctrl+F] in Adobe Acrobat without converting to TIFF or XPS printer.

Use "Nuance PDF Create Assistant" or maybe similar software.

Click "Add" then "Open file..."

Select file - double click or "Open"

Select "Create a PDF for each input document"

Select (below) "Searchable PDF"

Click the gear button "Start PDF creation (Alt + G)"

In the "Save As" box, choose a "File name:" and click "Save".

In the "Print Info" box, under "Job Queue", wait until you see the file in the "PDF Creation Result" box.

DONE! Click "Close"

You can now do word searches where this was not possible.

The PDF file will retain original clarity and bookmarks.

Kevin_N_Weinhold

Participating Frequently

Forget about TIFF/JPG or printing to XPS. Just print the PDF to the Acrobat print driver with settings (advanced) "as image". Be sure that print settings will use the existing page size or else larger pages will be cropped. I set the dpi to 300. After printing, the document will be ready to be OCRed by Acrobat. This solution makes smaller images (but, if you use OCR "Searchable Image (exact)" it will retain existing image size). It also "fixes" all sorts of issues I've encountered when I used to dump the PDF to JPG and convert back to PDF. I'm using Acrobat 8.3.1 and have had no problems with newer PDF formats using this method.