Marcel31329687rzel
New Participant
July 27, 2023
Answered

How to maintain OCR when converting from PDF to Word


Hello everyone,

 

I have PDFs that contain OCR information from the scanning process. When I open the PDF files in Adobe Acrobat 2017, the text is selectable and I can copy and paste everything. However, when I convert these files to Word using Adobe Acrobat 2017, the resulting Word files often contain images instead of editable text. I assume this is due to grey or colored backgrounds in some cases and to the poor print quality of the scanned documents in others.

Does Adobe Acrobat 2017 run a completely new OCR pass when converting to Word instead of using the existing OCR data in the PDF file? If so, is there a way to make it use the existing OCR data?

Also, oddly, repeating the conversion gives different results. Sometimes the result is mostly or partially editable, and sometimes it's mostly images or just one page-sized image.

Some of my PDFs convert perfectly, and so far they differ from the problematic ones in two ways:

1. The file details show "PDF-Version: 1.7, Adobe Extension Level 5 (Acrobat 9.x)", whereas the files that don't convert properly show "PDF-Version: 1.4 (Acrobat 5.x)".

2. The files have a mostly white background and slightly better image quality.
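
To compare a larger batch of files on the first point, the version in each PDF header could be read with a short script. This is only a minimal sketch, assuming Python is available; the function names `pdf_header_version` and `pdf_file_version` are made up for illustration. It reads the `%PDF-x.y` header only. The extension level (for example "Adobe Extension Level 5") is stored elsewhere in the document catalog and is not read here, and a version entry in the catalog can override the header, so the result may differ from what Acrobat shows.

```python
import re
from typing import Optional


def pdf_header_version(data: bytes) -> Optional[str]:
    """Return the version from a %PDF-x.y header, or None if none is found."""
    # Look only at the first 1024 bytes, since some files carry a few
    # junk bytes before the header.
    match = re.search(rb"%PDF-(\d\.\d)", data[:1024])
    return match.group(1).decode("ascii") if match else None


def pdf_file_version(path: str) -> Optional[str]:
    """Read the start of the file at `path` and return its header version."""
    with open(path, "rb") as fh:
        return pdf_header_version(fh.read(1024))
```

Calling `pdf_file_version` on each file and grouping the results would show whether the 1.4 versus 1.7 split really lines up with the files that fail to convert.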

 

Thanks a lot in advance for any help!

 

Best regards,

Marcel

This topic has been closed for replies.

1 reply

Bernd Alheit
Correct answer
Community Expert
July 27, 2023

Try the forum for Adobe Acrobat.

Marcel31329687rzel
New Participant
July 27, 2023

Thank you, Bernd. I have posted it there now and will close this thread.