I am trying to decide whether the Adobe Extract API is up to the task I need. I am a little concerned, as I don't see many support resources. I used the example at https://developer.adobe.com/document-services/docs/overview/pdf-extract-api/quickstarts/python/ to try to extract tables from PDF documents, and so far the results are disheartening. There is no way to ask a question on that page.
An area in my target PDF that looks like a table to the eye, with a "Date of Birth" column header and a date below it, is rendered in the CSV file as a single cell: Date of Birth 01/01/1955.
I am fine with having to tweak things, but I don't see how this product is usable if there is no framework for determining why the code renders some "tables" correctly and others not. If it simply either works or doesn't, I don't see how I can use it.
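As a stopgap for the merged-cell symptom described above, a post-processing pass over the CSV output could split a trailing date from its label. This is only a minimal sketch and not part of the Extract API: the helper name `split_label_and_date` is hypothetical, and the MM/DD/YYYY pattern is an assumption based on the single example given.

```python
import re

# Assumption: a merged cell ends with a date in MM/DD/YYYY form,
# as in the "Date of Birth 01/01/1955" example from the post.
_TRAILING_DATE = re.compile(r"^(?P<label>.*\S)\s+(?P<date>\d{2}/\d{2}/\d{4})$")


def split_label_and_date(cell):
    """Return (label, date) if the cell ends with a date, else (cell, None)."""
    match = _TRAILING_DATE.match(cell.strip())
    if match is None:
        # No trailing date found: leave the cell untouched.
        return cell, None
    return match.group("label"), match.group("date")


print(split_label_and_date("Date of Birth 01/01/1955"))
# ('Date of Birth', '01/01/1955')
```

This only repairs cells that match the assumed pattern; it does not explain why the extraction merged them in the first place.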
Can you share the PDF in question?
Yes but I will have to redact the personal info in it.
The AI is definitely confused by the table layout: the cells are vertically center-aligned, which makes it challenging to determine what constitutes a row. Can I give this file to our team to help train the AI?
Sure!