Participating Frequently
January 23, 2024
Question

PDF Extract API (table)

I am using the PDF Extract API (pdfservices-node-sdk 3.2.0) to export the tables in a PDF to Excel files. The same PDF file contains 4 identical tables, but only 2 of them are extracted on export. What factors could be affecting the table export? What can I adjust to get the best results?

Only the table highlighted in red is successfully exported to Excel.

 

    4 replies

    Participant
    February 18, 2025

    Can you please share the code for extracting the tables and saving them to an Excel file?

    Adobe Employee
    February 18, 2025

    It's in the samples.

    Participating Frequently
    January 24, 2024

    Sorry for the confusion. To be more precise, figures 2 and 4 produce Excel table results, while figures 1 and 3 do not generate an Excel file.

    Joel Geraci
    Community Expert
    January 23, 2024

    The extraction is done by an AI. The code that does the page segmentation can be off when deciding whether something is a figure, a table, or text. Honestly, given the proximity to the drawings above the tables, I'd have thought that the bottom two tables would get read and not the top two.

    Unfortunately, there are no "knobs" to turn to get better results, but with your permission, I can send your file to engineering to train for this sort of thing.
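
One way to see how the segmentation classified each region is to look through the structuredData.json file in the extraction result. The snippet below is only a rough sketch: it assumes each entry in `elements` carries a `Path` (such as `//Document/Table[2]`), a `Page`, and, for tables and figures, a `filePaths` list, so check those field names against your own output. The function name `listTablesAndFigures` and the sample data are hypothetical and only there for illustration.

```javascript
// Sketch: list the regions that were classified as tables or figures.
// Assumes `structuredData` is the parsed structuredData.json from the result.
function listTablesAndFigures(structuredData) {
  const regions = [];
  for (const element of structuredData.elements || []) {
    const path = element.Path || '';
    // Take the last path segment and drop any index suffix,
    // e.g. "Table[2]" becomes "Table".
    const lastSegment = path.split('/').pop().replace(/\[\d+\]$/, '');
    if (lastSegment === 'Table' || lastSegment === 'Figure') {
      regions.push({
        kind: lastSegment,
        page: element.Page,
        path: path,
        files: element.filePaths || [],
      });
    }
  }
  return regions;
}

// Hypothetical sample shaped like the assumed output format, not real data.
const sample = {
  elements: [
    { Path: '//Document/H1', Page: 0, Text: 'Drawing sheet' },
    { Path: '//Document/Figure', Page: 0, filePaths: ['figures/fileoutpart0.png'] },
    { Path: '//Document/Table', Page: 0, filePaths: ['tables/fileoutpart1.xlsx'] },
    { Path: '//Document/Table/TR/TD/P', Page: 0, Text: 'cell' },
    { Path: '//Document/Table[2]', Page: 0, filePaths: ['tables/fileoutpart2.xlsx'] },
  ],
};

const regions = listTablesAndFigures(sample);
for (const region of regions) {
  console.log(region.kind, 'page', region.page, region.path, region.files.join(', '));
}
```

If a table that is missing from the Excel output shows up in this list as a Figure, or does not show up at all, that would be consistent with the segmentation explanation above.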

    Participating Frequently
    January 24, 2024

    Thank you so much.

    Participating Frequently
    January 23, 2024

    This is my file.