Adobe PDF Extract API: text elements missing from JSON result
Sorry to repost here, I can't delete the other post in the general discussion forum.
I am using the Adobe PDF Extract API through the Node.js SDK, loaded with require('@adobe/pdfservices-node-sdk'). It converts a PDF into a structured JSON file.
I have a simple PDF file with basic words in the corners: one in each corner, 4 per page, 2 pages. The total should be 8 elements, but I am getting only 5 when examining the output JSON file.
How is this API missing such a simple test case?
If it can't extract information accurately from a basic example, how much confidence can I have for much larger, more complex PDFs?
Should have: top left, top right, bottom left, bottom right, top left 2, top right 2, bottom left 2, bottom right 2.
in the future, to find the best place to post your message, use the list here, https://community.adobe.com/
p.s. i don't think the adobe website, and forums in particular, are easy to navigate, so don't spend a lot of time searching that forum list. do your best and we'll move the post if it helps you get responses.
<moved from using the community>
I have a similar problem; one very important field in the document header is not exported in the JSON output AT ALL. It's the only missing element on the entire document.
Does anybody know what factors cause this API to recognize fields vs. not recognize them?
Hope you are doing well. Sorry for the trouble, and the delayed response.
If you are still looking for a solution, here are a few things I would check:
- Check text positioning to ensure it's within the printable area of the page.
- Verify text is machine-readable (not part of an image or embedded in a graphic).
- Inspect PDF layers and content structure to ensure all text elements are properly placed.
- If the PDF comes from a scan, consider improving its quality (a higher scan DPI, or running OCR so the text becomes machine-readable).
- Review the raw API output to check if the missing elements are in a different part of the result.
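For the last point, a small script can make the check less manual. The following is only a sketch, not an official Adobe utility. It assumes the parsed Extract output has a top-level `elements` array whose entries carry `Text`, `Page` (zero-based) and `Path`, and that some entries may nest children under `Kids`. The helper names `collectText` and `findMissing` and the sample data are made up for illustration. Labels are matched as substrings per page, so a label that was merged into a neighboring element's text still counts as found.

```javascript
// Hypothetical helper: walk the parsed Extract output and collect every text run.
// Assumes each element may have `Text`, `Page`, `Path`, and nested `Kids`.
function collectText(elements, found = []) {
  for (const el of elements || []) {
    if (typeof el.Text === 'string') {
      found.push({ text: el.Text.trim(), page: el.Page, path: el.Path });
    }
    if (Array.isArray(el.Kids)) {
      collectText(el.Kids, found); // descend into nested elements
    }
  }
  return found;
}

// Hypothetical helper: return the expected labels that do not appear
// (case-insensitively, as a substring) in any text run on the same page.
function findMissing(expected, found) {
  return expected.filter(({ page, label }) =>
    !found.some(
      (f) => f.page === page && f.text.toLowerCase().includes(label.toLowerCase())
    )
  );
}

// Invented sample mimicking the assumed output shape. In real use you would
// parse the JSON file produced by the API instead of using this object.
const sample = {
  elements: [
    { Path: '//Document/P', Text: 'top left top right ', Page: 0 },
    { Path: '//Document/P[2]', Text: 'bottom left ', Page: 0 },
    { Path: '//Document/P[3]', Text: 'top left 2 ', Page: 1 },
    {
      Path: '//Document/Sect',
      Page: 1,
      Kids: [{ Path: '//Document/Sect/P', Text: 'bottom right 2 ', Page: 1 }],
    },
  ],
};

const expected = [
  { page: 0, label: 'top left' },
  { page: 0, label: 'top right' },
  { page: 0, label: 'bottom left' },
  { page: 0, label: 'bottom right' },
  { page: 1, label: 'top left 2' },
  { page: 1, label: 'top right 2' },
  { page: 1, label: 'bottom left 2' },
  { page: 1, label: 'bottom right 2' },
];

const found = collectText(sample.elements);
const missing = findMissing(expected, found);

console.log('text runs found:', found.length);
console.log('missing labels:', missing);
```

If a label shows up inside a longer text run, the element count will be lower than the number of labels even though nothing was actually dropped. Labels that are reported as missing here are the ones worth checking against the other points in this list.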
Hope this gives you better clarity on what to look for.
-Souvik

