
Seeking Solutions: Preserving Table Structure in JSON Output with Adobe PDF Extract API for RAG App

Community Beginner, Jun 10, 2024

Hello all! I used the Adobe PDF Extract API to parse a PDF. The PDF and the output JSON are attached to this message. I believe the JSON output doesn't preserve the table structure. If I pass this data to an LLM, it cannot answer relevant questions about the data because the table structure is lost. How should I go about this? I want to use the Adobe API to build a RAG application. Is there a way to preserve the table structure within the JSON file? For example, I need output like this:

{
  "Input (DC)": {
    "MVPS 4000-S2": null,
    "MVPS 4200-S2": null
  },
  "Available inverters": {
    "MVPS 4000-S2": "1 x SCS 3450 UP or 1 x SCS 3450 UP-XT",
    "MVPS 4200-S2": "1 x SCS 3600 UP or 1 x SCS 3600 UP-XT"
  },
  "Max. input voltage": {
    "MVPS 4000-S2": "1500 V",
    "MVPS 4200-S2": "1500 V"
  },
  "Number of DC inputs": {
    "MVPS 4000-S2": "dependent on the selected inverters",
    "MVPS 4200-S2": null
  },
  "Integrated zone monitoring": {
    "MVPS 4000-S2": "○",
    "MVPS 4200-S2": null
  },
  "Available DC fuse sizes (per input)": {
    "MVPS 4000-S2": "200 A, 250 A, 315 A, 350 A, 400 A, 450 A, 500 A",
    "MVPS 4200-S2": null
  }
}

I know it can also generate CSV, but the CSV doesn't include any of the other information that might be present in the PDF.
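For illustration, the nested shape requested above could be built from the rows of a table's CSV output once they have been parsed into arrays. This is only a sketch under assumptions: the function name `tableToNestedJson` is hypothetical, the first row is assumed to hold the column headers, the first column is assumed to hold the row labels, and empty cells are mapped to `null`. The sample rows reuse values from the example above.

```javascript
// Hypothetical helper: turns parsed table rows into
// { rowLabel: { columnHeader: value-or-null } }.
// Assumes rows[0] is the header row and column 0 holds the row labels.
function tableToNestedJson(rows) {
  const [header, ...body] = rows;
  const columns = header.slice(1).map((c) => c.trim());
  const result = {};
  for (const row of body) {
    const label = String(row[0] || "").trim();
    if (!label) continue; // skip rows that have no label
    const entry = {};
    columns.forEach((col, i) => {
      const value = String(row[i + 1] || "").trim();
      entry[col] = value === "" ? null : value; // empty cell becomes null
    });
    result[label] = entry;
  }
  return result;
}

// Sample rows, already parsed from CSV (values taken from the example above).
const rows = [
  ["", "MVPS 4000-S2", "MVPS 4200-S2"],
  ["Input (DC)", "", ""],
  ["Max. input voltage", "1500 V", "1500 V"],
  ["Number of DC inputs", "dependent on the selected inverters", ""],
];

const nested = tableToNestedJson(rows);
console.log(JSON.stringify(nested, null, 2));
```

Real tables with merged cells or multi-row headers would need extra handling that this sketch does not attempt.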

 

TOPICS
PDF Extract API

1 Correct answer

Community Expert, Jun 10, 2024

I generally post-process the JSON from Extract to create a Markdown file. When I hit a table, I read past its elements, read in the corresponding .csv file as a Markdown table, then continue with the JSON. It works great. I have some Node.js code I can share if you like.
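The approach described above might look roughly like the following Node.js sketch. It is not the answerer's code, which was not posted in this thread. It assumes the Extract JSON exposes an `elements` array whose items carry `Path`, `Text`, and, for tables, a `filePaths` list that includes a `.csv` file; check these field names against your own output. The function names, the `readCsv` callback, and the sample data are hypothetical.

```javascript
// Minimal CSV parser: handles quoted fields, escaped quotes (""), and CRLF.
function parseCsv(text) {
  const rows = [];
  let row = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inQuotes) {
      if (ch === '"' && text[i + 1] === '"') { field += '"'; i++; }
      else if (ch === '"') inQuotes = false;
      else field += ch;
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field); field = "";
    } else if (ch === "\n" || ch === "\r") {
      if (ch === "\r" && text[i + 1] === "\n") i++;
      row.push(field); field = "";
      rows.push(row); row = [];
    } else {
      field += ch;
    }
  }
  if (field !== "" || row.length > 0) { row.push(field); rows.push(row); }
  return rows;
}

// Renders parsed rows as a Markdown table; the first row becomes the header.
function toMarkdownTable(rows) {
  if (rows.length === 0) return "";
  const width = Math.max(...rows.map((r) => r.length));
  const cell = (s) =>
    String(s || "").trim().replace(/\|/g, "\\|").replace(/\s*\n\s*/g, " ");
  const line = (r) =>
    "| " + Array.from({ length: width }, (_, i) => cell(r[i])).join(" | ") + " |";
  const [header, ...body] = rows;
  const separator = "| " + Array(width).fill("---").join(" | ") + " |";
  return [line(header), separator, ...body.map(line)].join("\n");
}

// Walks the elements in order. A table element is replaced by its CSV rendered
// as Markdown, and the elements nested under that table's Path are skipped.
function elementsToMarkdown(elements, readCsv) {
  const out = [];
  let tablePath = null; // Path of the table whose children are being skipped
  for (const el of elements) {
    const path = el.Path || "";
    if (tablePath && path.startsWith(tablePath + "/")) continue;
    tablePath = null;
    const csv = (el.filePaths || []).find((p) => p.toLowerCase().endsWith(".csv"));
    if (csv) {
      out.push(toMarkdownTable(parseCsv(readCsv(csv))));
      tablePath = path;
    } else if (el.Text) {
      out.push(el.Text.trim());
    }
  }
  return out.join("\n\n");
}

// Made-up sample input; real code would load the JSON and CSV files from disk.
const elements = [
  { Path: "//Document/H1", Text: "Technical data" },
  { Path: "//Document/Table", filePaths: ["tables/fileoutpart0.csv"] },
  { Path: "//Document/Table/TR/TH/P", Text: "Input (DC)" },
  { Path: "//Document/P", Text: "Text after the table." },
];
const files = {
  "tables/fileoutpart0.csv":
    "Input (DC),MVPS 4000-S2,MVPS 4200-S2\nMax. input voltage,1500 V,1500 V\n",
};
const markdown = elementsToMarkdown(elements, (p) => files[p]);
console.log(markdown);
```

The resulting Markdown keeps the surrounding text and the table in reading order, which is the property the CSV output alone lacks.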

Community Beginner, Jun 10, 2024

Hi Joel,

Thanks a lot for the reply! Yes, it would really help if you could share your Node.js code.

Community Expert, Jun 10, 2024

It's in a private Git repo. If you are comfortable doing so, send me a private message with your GitHub ID and I'll add you as a collaborator. I eventually plan on making it open source once it's past the work-in-progress stage.

Community Beginner, Jun 10, 2024

Thanks, Joel! I just sent you a private message with my GitHub ID.

Community Beginner, Feb 09, 2025

Were you able to get the table structure into the JSON? If you have the code, please let me know. Thanks!

Community Beginner, Feb 10, 2025

Hi Joel,

Were you able to make it open source? If so, could I also get access to the code?

Thanks
