
Adobe PDF Extract API Question on Extraction Method

New Here,
Aug 23, 2023

How does the Adobe PDF Extract API extract text from PDF (to go from PDF to CSV)?

Does it extract the native text from the PDF and convert it to CSV, as if we were doing it ourselves using Acrobat on the desktop? Or does it always use OCR and Sensei AI to extract and structure the text?

Basically, I am trying to understand how much the API relies on AI versus Adobe's native ability to convert a PDF into CSV based on the actual text/characters.

TOPICS
PDF Extract API , PDF Services API
Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Community Expert,
Aug 23, 2023

We use both AI and algorithms, but we only run OCR when we get an image-only PDF. Most of the time we operate on the native PDF.
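To make the distinction concrete, here is a minimal sketch of the "OCR only for image-only PDFs" rule described above. It is an illustration only, not Adobe's implementation: the `Page` record, its fields, and the `choose_path` function are hypothetical names, and the per-page counts are assumed to come from whatever PDF parser you already use.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Page:
    # Hypothetical summary of one page, filled in by your own PDF parser.
    text_chars: int  # characters found in the page's native text layer
    images: int      # raster images placed on the page


def needs_ocr(pages: List[Page]) -> bool:
    """True only for an image-only document: images present, no native text anywhere."""
    has_text = any(p.text_chars > 0 for p in pages)
    has_images = any(p.images > 0 for p in pages)
    return has_images and not has_text


def choose_path(pages: List[Page]) -> str:
    """Pick the processing path: 'ocr' for image-only input, otherwise 'native'."""
    return "ocr" if needs_ocr(pages) else "native"
```

Under these assumptions, a scanned document with no text layer is routed to OCR, while a document with any native text stays on the native path even if it also contains images.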

New Here,
Aug 23, 2023

So Export / Convert PDF converts from PDF to XLSX using the native PDF, as if I were doing it in Acrobat Desktop, with no AI or OCR.

And Extract PDF uses AI / algorithms to extract text, images (OCR), and tables.

Is this the correct way to understand it?

Community Expert,
Aug 23, 2023

Correct. The AI in Extract does a much better job of "understanding" complex tables. For example, tables with merged cells and rows with vertically and horizontally centered cells.
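As a rough illustration of why merged cells make table-to-CSV conversion harder, the sketch below flattens a table whose cells carry row and column spans into a rectangular grid. It is a toy model, not how Extract works: the `(text, rowspan, colspan)` cell format and the `expand_merged` and `to_csv` helpers are assumptions made up for this example.

```python
import csv
import io


def expand_merged(rows):
    """Flatten rows of (text, rowspan, colspan) cells into a rectangular grid.

    A merged cell's text goes in its top-left position; the other
    positions it covers are filled with empty strings.
    """
    grid = {}
    for r, row in enumerate(rows):
        c = 0
        for text, rowspan, colspan in row:
            # Skip positions already covered by a cell merged down from above.
            while (r, c) in grid:
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text if (dr == 0 and dc == 0) else ""
            c += colspan
    if not grid:
        return []
    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    return [[grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


def to_csv(grid):
    """Serialize the grid as CSV text with '\\n' line endings."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(grid)
    return buf.getvalue()


# "Region" spans two rows; "Sales" spans two columns.
table = [
    [("Region", 2, 1), ("Sales", 1, 2)],
    [("Q1", 1, 1), ("Q2", 1, 1)],
    [("EMEA", 1, 1), ("10", 1, 1), ("12", 1, 1)],
]
csv_text = to_csv(expand_merged(table))
```

The hard part in a real PDF is that the spans are not given: they have to be inferred from text positions and ruling lines, which is the kind of "understanding" the reply above attributes to the AI in Extract.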
