Scan - OCR/Export - Word

New Here, Aug 26, 2016

Hello,

I'm having a problem converting a scanned document from PDF to Word. I'm using Adobe Acrobat Pro 9 on Mac.

I'm scanning sections of a book into PDF, then running them through OCR by exporting them to Word to get editable text. I have two problems:

(1) The scanned document is in landscape, but I need the final editable text in portrait, formatted like a normal Word document: not in the layout in which it was scanned, but as if I had opened Word and typed out sections of the book, mirroring how they are presented there. I cannot simply change the orientation from landscape to portrait because it's a scanned document; everything becomes scrambled and the text doesn't fit on the page.

(2) Some of the text is not showing up: there are glitches or missing portions of text, and some areas have strange symbols instead of text.

My biggest problem at this point is that I cannot seem to convert the scanned document to a Word-formatted document in portrait. I really need the scanned document to follow the structure of a Word document as editable text.

Thanks,

Joe

TOPICS
Acrobat SDK and JavaScript, Edit and convert PDFs

Views

576

Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more

1 Correct Answer

Adobe Community Professional, Aug 26, 2016

When you run OCR, make sure to click the "Edit..." button and set the output style to "ClearScan". Otherwise you end up with an image and hidden text. With ClearScan, the image is removed from your document completely, so when you export to Word, only the recognized text is used.

To fix the misrecognized characters, use the "Find OCR Suspects" function and fix them in your PDF document before you export, or fix the problems in Word after you export. OCR is not perfect, and the results depend on the quality of your scan. Sometimes the OCR algorithm just cannot figure out what a character is.
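
For anyone who would rather check the exported text outside Acrobat, here is a minimal Python sketch of the same idea: scan the text and list characters that look out of place in ordinary prose. This is not how "Find OCR Suspects" works internally; it is only a rough heuristic. It assumes the exported text has been saved as plain text, and the names `EXPECTED_PUNCTUATION`, `is_suspect`, and `find_suspects` are made up for illustration.

```python
import unicodedata

# Assumption: punctuation that is normal in English prose, including curly
# quotes and dashes that Word commonly inserts. Extend this for your book.
EXPECTED_PUNCTUATION = set(".,;:!?'\"()[]-/&%\u2018\u2019\u201c\u201d\u2013\u2014")


def is_suspect(ch: str) -> bool:
    """Return True if ch looks unusual for plain prose (heuristic only)."""
    if ch.isspace() or ch in EXPECTED_PUNCTUATION:
        return False
    if ch.isascii() and ch.isalnum():
        return False
    # Accented letters such as e-acute are legitimate in many books.
    if unicodedata.category(ch).startswith("L"):
        return False
    return True


def find_suspects(text: str):
    """Yield (line_number, column, character) for each suspect character."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if is_suspect(ch):
                yield line_no, col, ch


# Sample text with two typical OCR glitches: U+FFFD and a stray currency sign.
sample = "The quick brown fox\njumps ov\ufffdr the l\u00a4zy dog."
for line_no, col, ch in find_suspects(sample):
    print(f"line {line_no}, column {col}: U+{ord(ch):04X}")
```

The line and column numbers only tell you where to look. Deciding what the character should have been still means comparing against the scan.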

New Here, Aug 30, 2016

Hello Karl,

Thank you so much for your prompt reply. Sorry I haven't gotten back to you sooner. Changing the settings to ClearScan has greatly improved the text and removed the scrambled characters. But I am still having a problem getting the text itself into portrait with Word formatting. If I simply change the orientation, the page becomes scrambled: there are missing characters, the format is so messed up that almost nothing fits, and so on. So the format is still in landscape, like the scanned document. If I copy and paste sections of the landscape document to make a portrait-style document, it ruins the format: nothing is aligned, since it retains the landscape format but on a single page.

If you could help with this aspect of the problem, that would be greatly appreciated.

Thanks,

Joe

New Here, Sep 02, 2016

Hello Karl,

This turned out to be a problem with Word and not Acrobat Pro, and I figured it out. Thanks again for your help!

Best,

Joe

New Here, Feb 06, 2020

Where do I find "Find OCR Suspects" in Acrobat Pro DC (macOS)?
