
Power Automate PDF with Embedded Raster Image of Text is not converting using OCR

New Here, Mar 11, 2021

I'm using the new PDF to Excel feature, and my PDF files contain a mix of raster-based images and text.

The raster-based image is a table with visible borders and numbers in the table cells.

It comes through in the Excel file as just an image; no OCR is being performed on it.

I don't see any settings that control how the PDF file is converted.

Is there a way to force OCR on all raster-based images within the PDF?

 

Thanks!

 

TOPICS
Power Automate
Community Expert, Mar 12, 2021

You would use the OCR service first and then export to Excel.
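To make the ordering concrete, here is a minimal sketch of "OCR first, then export" as two chained steps. It does not call the Adobe connector or any real API: `ocr_then_export`, `fake_ocr`, and `fake_export` are hypothetical names, and the two stand-in steps only tag the bytes so the order is visible.

```python
from typing import Callable

# Hypothetical step type: takes file bytes and returns file bytes.
Step = Callable[[bytes], bytes]


def ocr_then_export(pdf: bytes, ocr_step: Step, export_step: Step) -> bytes:
    """Run the OCR step first, then pass its output to the Excel export step."""
    searchable_pdf = ocr_step(pdf)
    return export_step(searchable_pdf)


# Stand-in steps (assumptions, not real services) so the sketch runs offline.
def fake_ocr(pdf: bytes) -> bytes:
    return pdf + b"+ocr"


def fake_export(pdf: bytes) -> bytes:
    return b"xlsx:" + pdf


result = ocr_then_export(b"scan", fake_ocr, fake_export)
```

In a real flow, each stand-in would be replaced by the corresponding action, with the output file of the OCR action wired into the input of the export action.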

New Here, Mar 15, 2021

Hello @Joel Geraci ,

I am using the new Power Automate connector (https://helpx.adobe.com/document-cloud/help/pdf-connector-for-microsoft-power-automate.html) to perform the OCR and convert to Excel, but I don't see many options, such as one to force OCR on all images.

Is there another way to automate this with Power Automate?

Community Expert, Mar 15, 2021

OCR is a separate service, but if your PDF has a mixture of text and images, I don't think it's going to work.

New Here, Mar 16, 2021

Thanks for the suggestion to use OCR first.

In my testing, PDF to Excel produces the same result as PDF to OCR to Excel.

Both processes perform OCR, and both have trouble running OCR on all images within the PDF.

Community Expert, Mar 17, 2021

I'm fairly certain that the OCR service only works on image-only PDFs. I don't think we have a solution to convert a mixture of text and images to just text.

New Here, Mar 17, 2021

Thanks Joel,

I am running mixed PDFs and some work, but it's hit and miss.

I have a meeting with an Adobe Developer Relations Specialist on Friday.

I'll let you know if anything changes.
