Participant
November 1, 2019
Answered

Import accessible text into scanned PDF

I have a number of scanned text documents that need to remain in their original state as far as appearance, but should be accessible/searchable. Some of these scans are poor quality and the OCR process does not always recognize text accurately for paragraphs or even full pages at a time. The process of reviewing and correcting recognized text is tedious, the editing window is tiny, and it would honestly be easier for me to simply type the text in another editor and import it in. I can see how to export accessible text, but there doesn't seem to be an option to import it, and I was wondering if anyone had any work-arounds or tips? Anything to make this process less painful would be appreciated!

Correct answer Bevi Chagnon - PubCom.com

Do not know of any way to import large amounts of text into Acrobat. That's not what the program was designed to do.

All OCR utilities will give mixed results on poor originals; this is the "garbage in, garbage out" theory in practice.

Try to improve the OCR outcome:

  1. Adjust Acrobat's OCR/Recognize text settings.
  2. Locate better originals.
  3. Try another brand of OCR software. We have had good results with ABBYY FineReader.
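
If you can get the recognized text out one page at a time, a rough script can help you decide which pages are worth retyping and which only need light correction. The sketch below is only a heuristic, not anything Acrobat provides: the function names, the 0.8 threshold, and the assumption that the text is mostly English words are all mine. It counts the share of tokens on each page that look like ordinary words or numbers and flags the pages that fall below the threshold.

```python
import re
import string

# Assumption: a "word-like" token is letters (with optional inner apostrophes
# or hyphens) or a plain number. Anything else is treated as OCR noise.
WORD = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*|\d+(?:[.,]\d+)*")


def wordlike_share(page_text: str) -> float:
    """Return the fraction of tokens on a page that look like real words."""
    tokens = page_text.split()
    if not tokens:
        return 0.0  # an empty page has no usable text at all
    good = 0
    for token in tokens:
        core = token.strip(string.punctuation)  # ignore surrounding quotes, commas, etc.
        if WORD.fullmatch(core):
            good += 1
    return good / len(tokens)


def pages_to_retype(pages: list[str], threshold: float = 0.8) -> list[int]:
    """Return 1-based page numbers whose word-like share is below the threshold."""
    return [
        number
        for number, text in enumerate(pages, start=1)
        if wordlike_share(text) < threshold
    ]
```

Pass it a list with one string of recognized text per page. Treat the result as a starting point for manual review, since a page of real words can still be the wrong words.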

2 replies

a_C_student16379412
Inspiring
November 5, 2019

If I understand correctly what you are trying to do, here's a suggestion:

Type the text into Word or another editor and save it as a PDF. In Acrobat Pro, make any needed corrections to ensure the text is tagged properly and the PDF is otherwise accessible.

In the scanned PDF, mark all content as background artifacts. In the Page Thumbnails pane, insert the file containing the accessible text (right-click the first page > Insert Pages > From File). In the Content pane, ensure the accessible text is positioned before the scanned content in the content order.

The idea is to hide the accessible text behind the scanned image. The page will still look like the original scan, but assistive technology (AT) will read the tagged text.
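
Once the pieces are combined, one way to check the result is to export the accessible text from the finished PDF and compare it with what you typed. Here is a minimal sketch of that comparison using Python's standard `difflib` module. The function names are hypothetical, and it assumes you have both texts available as plain strings. It ignores case, punctuation, and line breaks, so only missing, extra, or changed words are reported.

```python
import difflib
import re


def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into words, dropping layout and punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def compare_texts(typed: str, exported: str) -> tuple[float, list[str]]:
    """Return a similarity ratio (0.0 to 1.0) and a list of word-level differences."""
    a = normalize(typed)
    b = normalize(exported)
    # autojunk=False keeps difflib from discarding frequent words in long texts
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    differences = []
    for op, a1, a2, b1, b2 in matcher.get_opcodes():
        if op != "equal":
            differences.append(
                f"{op}: typed={' '.join(a[a1:a2])!r} exported={' '.join(b[b1:b2])!r}"
            )
    return matcher.ratio(), differences
```

A ratio of 1.0 with an empty difference list means the exported text matches your typed text word for word. Anything else points to words that were dropped, added, or changed along the way.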

Bevi Chagnon - PubCom.com
Legend
November 1, 2019

Bevi Chagnon | Designer, Trainer, & Technologist for Accessible Documents | PubCom | Classes & Books for Accessible InDesign, PDFs & MS Office