
Unable to extract text from drawing PDF with PDF Extract API

New Here, Oct 12, 2021

Hello,

[Image: default9ngizh1vietx_0-1634033728066.png]

The highlighted portions in the above section of the PDF are vectors (selectable text), but I am unable to extract any text data from this PDF.

 

Attached: the drawing PDF and the JSON result.

 

Thanks,

Adam.

TOPICS
PDF Extract API
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Community Expert, Oct 12, 2021

The entire page is being seen as a graphic, so no text is being read. Do I have your permission to send this to our Engineering team as a sample file to train the AI?
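
For anyone who wants to check this diagnosis against their own JSON result, here is a minimal sketch in Python. It assumes the result follows the Extract API's structured JSON layout: a top-level `elements` list whose entries carry a `Path` and, for text, a `Text` field. The helper names and the inline sample are hypothetical and are not taken from the attached file.

```python
import json
from collections import Counter


def load_result(path: str) -> dict:
    """Load an Extract-style JSON result from disk (the path is up to the caller)."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def summarize_elements(structured_data: dict) -> dict:
    """Count elements with text and group all elements by their kind.

    Assumption: each element has a "Path" such as "//Document/Figure[2]"
    and, when it holds text, a non-empty "Text" string.
    """
    elements = structured_data.get("elements", [])
    with_text = sum(1 for e in elements if (e.get("Text") or "").strip())
    # Take the last path segment and drop any "[n]" index: "Figure[2]" -> "Figure".
    kinds = Counter(
        (e.get("Path") or "").rsplit("/", 1)[-1].split("[", 1)[0] or "unknown"
        for e in elements
    )
    return {"total": len(elements), "with_text": with_text, "kinds": dict(kinds)}


# Hypothetical sample that mimics a page read entirely as graphics.
sample = {
    "elements": [
        {"Path": "//Document/Figure", "Page": 0},
        {"Path": "//Document/Figure[2]", "Page": 0},
    ]
}

summary = summarize_elements(sample)
print(summary)
```

If `with_text` is 0 and every element kind is a figure, the page was treated as a graphic, which would match what is described above.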

New Here, Oct 12, 2021

Sure, feel free to use this file.

 

Also, I have the same problem with any raster PDF file (scanned PDFs). I first ran the file through the OCR API service and then through the Extract API service, but still no text is being read.

 

Is there any workaround to optimize or convert raster PDF files (searchable) so that the Extract API service can recognize the text in the lower layer?

 

Thanks,

Adam.
