Nested Table Data Extraction

Report · Oct 27, 2024

Hello,

I am using the PDF Extract API to extract data from tables.

It works well with standard tables, but with complex tables, specifically nested tables, some of the data is lost.

Has anyone experienced similar issues or could provide suggestions on how to improve extraction accuracy for nested tables?

Thank you!

Report · Oct 28, 2024

Can you share an example? If the PDF is something you aren't comfortable sharing publicly, please email me at jedimaster@adobe.com.

Report · Oct 28, 2024

Sure, here are some files I was testing with.

Report · Oct 29, 2024

Hmm. So, when I extract this, I get one table. The XLSX file shows the main table, with the sub tables in it. To me... this is right. It is one table that is complex, but the data is there. I can also see all the table cells in the JSON. What data, specifically, is being lost? (Using samplePDF.pdf)
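One way to check whether anything is actually missing is to walk the JSON and group the text by table cell. The snippet below is only a rough sketch: it assumes the extracted JSON has a top-level `elements` list whose entries carry a `Path` string (for example `//Document/Table/TR/TD[2]/Table/TR/TD/P`) and a `Text` value. The helper names (`table_depth`, `cell_key`, `collect_cells`) and the inline sample data are made up for illustration, so adjust them to whatever your own output looks like.

```python
from collections import defaultdict


def table_depth(path):
    """Count how many Table segments appear in an element path."""
    segments = path.strip("/").split("/")
    return sum(1 for seg in segments if seg.split("[")[0] == "Table")


def cell_key(path):
    """Return the path up to the innermost TD/TH segment, or None."""
    segments = path.strip("/").split("/")
    for i in range(len(segments) - 1, -1, -1):
        if segments[i].split("[")[0] in ("TD", "TH"):
            return "//" + "/".join(segments[: i + 1])
    return None


def collect_cells(elements):
    """Map each cell path to the joined text of the elements inside it."""
    cells = defaultdict(list)
    for el in elements:
        text = el.get("Text")
        key = cell_key(el.get("Path", ""))
        if text and key:
            cells[key].append(text.strip())
    return {key: " ".join(parts) for key, parts in cells.items()}


# Hypothetical sample shaped like the assumed output; with a real file you
# would load it with json.load() instead.
sample = {
    "elements": [
        {"Path": "//Document/Table/TR/TD/P", "Text": "Outer A "},
        {"Path": "//Document/Table/TR/TD[2]/Table/TR/TD/P", "Text": "Inner 1 "},
        {"Path": "//Document/Table/TR/TD[2]/Table/TR/TD[2]/P", "Text": "Inner 2 "},
        {"Path": "//Document/P", "Text": "Not in a table "},
    ]
}

cells = collect_cells(sample["elements"])
# Cells whose path contains more than one Table segment sit in a nested table.
nested = {key: text for key, text in cells.items() if table_depth(key) > 1}

for key, text in nested.items():
    print(key, "->", text)
```

If a value you expect from the PDF does not show up in `cells` at all, that would point to real data loss. If it shows up under a deeper path, the data is there and only the nesting needs to be handled on your side.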

Report · Mar 22, 2025

Hi Lakshana, I tried this as well but did not find any solution for extracting data from nested tables. Let me know if you found a solution.