Participant
October 16, 2024
Question

Redo OCR and force OCR for pages that contain both images and text


I used this sample code to test a PDF that contains both text and images.

https://github.com/adobe/pdfservices-python-sdk-samples/blob/main/src/ocrpdf/ocr_pdf.py

The images contain text, of course, but that text is not recognized by the OCR operation.

I also use the ocrmypdf package: https://ocrmypdf.readthedocs.io/en/latest/advanced.html

It has two relevant options, force_ocr and redo_ocr.
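For reference, here is a minimal sketch of how these modes can be selected through the ocrmypdf command line (`--force-ocr`, `--redo-ocr`, `--skip-text`). The helper names `build_ocrmypdf_command` and `run_ocr`, the mode labels, and the file names are hypothetical and chosen only for illustration; it assumes ocrmypdf is installed and on the PATH.

```python
import shlex
import subprocess

# CLI flags as documented by OCRmyPDF. The mode names on the left are
# hypothetical labels chosen for this sketch.
OCR_MODE_FLAGS = {
    "force": "--force-ocr",  # rasterize every page and OCR it, replacing existing text
    "redo": "--redo-ocr",    # keep visible text, OCR the text found in raster images
    "skip": "--skip-text",   # leave pages that already contain text untouched
}


def build_ocrmypdf_command(input_pdf, output_pdf, mode="redo"):
    """Return the argument list for an ocrmypdf run in the given mode."""
    if mode not in OCR_MODE_FLAGS:
        raise ValueError(
            f"unknown mode {mode!r}; expected one of {sorted(OCR_MODE_FLAGS)}"
        )
    return ["ocrmypdf", OCR_MODE_FLAGS[mode], input_pdf, output_pdf]


def run_ocr(input_pdf, output_pdf, mode="redo"):
    """Run ocrmypdf as a subprocess (requires ocrmypdf to be installed)."""
    command = build_ocrmypdf_command(input_pdf, output_pdf, mode)
    print("Running:", shlex.join(command))
    subprocess.run(command, check=True)
```

Calling `run_ocr("input.pdf", "output.pdf", mode="redo")` would be the closest match to my case (native text plus an image containing text), since that mode is meant to leave the visible text alone and OCR only the image content.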


Are there similar options in the Adobe API?

I know that it is possible to redo the OCR with Adobe Pro: I have a Pro license and succeeded in running the OCR with Adobe Pro Reader.

Thank you


1 reply

Joel Geraci
Community Expert
October 16, 2024

Can you share one of the input files you're using?

qsf_7077 (Author)
Participant
October 16, 2024

Here is an example that I just created.

The table is an image that I added to a normal PDF containing text.

Thank you