How to import existing OCR text (XML or TXT) into a PDF?

Question

I suggested this on UserVoice: https://acrobat.uservoice.com/forums/590923-acrobat-for-windows-and-mac/suggestions/38916985-import-alto-xml-files-with-ready-ocr-to-pdf

The scenario: I have an image-only PDF file (a scanned book) with 500 pages, and 500 ALTO XML files containing the OCR text, one for each page of that PDF. Those OCR XML files were exported from the original searchable OCR PDF. I don't have that source OCR PDF; it comes from a German library (StaBi Berlin). Unfortunately, they don't offer the OCR PDF for download directly. You can only download an image-only PDF of a book and, separately, the corresponding OCR XML files, or all the OCR text in one TXT file. (If you don't believe me, see for yourself: See here. You can change the language to English in the bottom right corner.)

So now I am looking for a way to import those 500 XML files back onto the corresponding pages of the image-only PDF, so that I end up with a searchable OCR PDF. Is there a way to do this with Acrobat DC (or, if not, maybe with helper tools)?

Best regards,
Minsutoreru

jane-e · Answer

Hi Minsutoreru,

While Adobe Acrobat does not support this, you can put in a feature request here: https://acrobat.uservoice.com

Post your link back in this thread so others might see it and vote on it.

~ Jane
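For anyone trying this outside Acrobat, the first step for any external tool would be reading the word text and positions out of each ALTO file. The following is only a minimal sketch of that step, using Python's standard library. It assumes the files follow the usual ALTO layout, where each word is a `String` element with `CONTENT`, `HPOS`, `VPOS`, `WIDTH` and `HEIGHT` attributes. The function name `extract_alto_words` and the sample XML are made up for illustration, and the sketch does not write anything into the PDF.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop an XML namespace prefix such as '{http://...}String'."""
    return tag.rsplit("}", 1)[-1]


def extract_alto_words(xml_text):
    """Return (text, hpos, vpos, width, height) for every ALTO String element.

    Hypothetical helper: the namespace is ignored, so the ALTO schema
    version should not matter. Coordinates are returned in whatever unit
    the file declares; they are not converted.
    """
    root = ET.fromstring(xml_text)
    words = []
    for elem in root.iter():
        if local_name(elem.tag) != "String":
            continue
        words.append((
            elem.get("CONTENT", ""),
            float(elem.get("HPOS", 0)),
            float(elem.get("VPOS", 0)),
            float(elem.get("WIDTH", 0)),
            float(elem.get("HEIGHT", 0)),
        ))
    return words


# Made-up sample page, not taken from the library's files.
SAMPLE = """<alto xmlns="http://www.loc.gov/standards/alto/ns-v2#">
  <Layout><Page><PrintSpace><TextBlock><TextLine>
    <String CONTENT="Hallo" HPOS="100" VPOS="200" WIDTH="50" HEIGHT="12"/>
    <SP/>
    <String CONTENT="Welt" HPOS="160" VPOS="200" WIDTH="40" HEIGHT="12"/>
  </TextLine></TextBlock></PrintSpace></Page></Layout>
</alto>"""

if __name__ == "__main__":
    for word in extract_alto_words(SAMPLE):
        print(word)
```

For real files, `ET.parse(path).getroot()` can replace `ET.fromstring`. The positions would still need to be scaled from the unit the ALTO file declares to the page size of the PDF before a PDF library could place them as an invisible text layer.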
