Adobe PDF Extract API: text JSON output misses results on a simple PDF

January 5, 2023
1 reply
696 views

I have a simple two-page PDF with a short text label in each corner, four per page. That should give 8 results in total, but I only get 5 elements when I examine the output JSON file.

How is this API missing such a simple test case?

If it can't extract information accurately from a basic example, how much confidence can I have for much larger, more complex PDFs?

The output should contain: top left, top right, bottom left, bottom right, top left 2, top right 2, bottom left 2, bottom right 2.

bounds_test.pdf
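For anyone who wants to reproduce the count, here is a minimal Python sketch of the check described above. It assumes the Extract output is a JSON object with a top-level `elements` array whose text-bearing entries carry a `Text` field; verify that against your own output file. The helper names and the `structuredData.json` file name are illustrative assumptions, not part of any Adobe SDK.

```python
import json

# The eight labels the test PDF is supposed to contain (from the post above).
EXPECTED = [
    "top left", "top right", "bottom left", "bottom right",
    "top left 2", "top right 2", "bottom left 2", "bottom right 2",
]


def extracted_texts(data):
    """Return the stripped text of every element that has a string "Text" field.

    Assumption: the output JSON has a top-level "elements" list.
    """
    return [
        el["Text"].strip()
        for el in data.get("elements", [])
        if isinstance(el.get("Text"), str)
    ]


def missing_strings(data, expected=EXPECTED):
    """Return the expected labels that do not appear as an exact element text."""
    found = set(extracted_texts(data))
    return [label for label in expected if label not in found]


def load_and_check(path):
    """Load an output JSON file (hypothetical path) and report missing labels."""
    with open(path, encoding="utf-8") as fh:
        return missing_strings(json.load(fh))
```

Calling `load_and_check("structuredData.json")` would list whichever labels are absent. The comparison is exact on purpose, so "top left" is not satisfied by "top left 2"; if the extractor merges several labels into one element, those labels will also be reported as missing.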

This topic is closed to replies.

Joel Geraci

Community Expert

Actually, it'd probably do a better job on a more complex PDF. It's been trained to look at the layout and categorize page elements. This simple layout is confusing to it. I think the text at the bottom is being recognized as a footer so it's being ignored by the AI. That said, I've alerted our team and sent them a link to this thread.

J

jason27819072sr0h (Author)

Participant

Thanks, but I'm not impressed, especially since I need an Adobe API token, which is not entirely free.

For those who need an alternative, try the PDF2JSON API, which is free for unlimited use and passes the basic test case I provided.
