Using the Extract API, not getting some text that is in the margin of the page.

July 23, 2021
3 replies
1517 views

Just for reference, at the bottom of page 31 there is some text, "Reference ID: 3610837", that is not in the JSON output from the Extract API. I have attached the original PDF as well as the JSON output of your tool.

structuredData.zip

Harvoni.pdf

This topic has been closed for replies.

Reuven27686304lwio

Participant

Same issue here. A lot of the good stuff is in the headers and footers. We're in 2023 now; any idea when extracting them will be possible?

Joel Geraci

Community Expert

I think Extract API is interpreting that area as a footer and ignoring it. Unfortunately, there is no setting to force it to not do that.
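
One way to confirm that the text is really absent from the output, rather than just tagged differently, is to scan the extracted JSON for it. This is a minimal sketch, assuming the unzipped output contains a `structuredData.json` file with a top-level `elements` list whose entries carry `Text`, `Path`, and `Page` keys. The `find_text` and `load_structured` helpers and the sample data are made up for illustration.

```python
import json


def load_structured(path):
    """Load an Extract API JSON output file from disk."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def find_text(structured, needle):
    """Return (page, path, text) for every element whose Text contains needle."""
    hits = []
    for element in structured.get("elements", []):
        # Some elements (figures, for example) may have no "Text" key at all.
        text = element.get("Text", "")
        if needle in text:
            hits.append((element.get("Page"), element.get("Path"), text))
    return hits


# Made-up stand-in for structuredData.json, not real extraction output.
# The page index is assumed to be zero-based.
sample = {
    "elements": [
        {"Path": "//Document/H1", "Text": "Example heading", "Page": 0},
        {"Path": "//Document/P", "Text": "Example body paragraph.", "Page": 30},
        {"Path": "//Document/Figure", "Page": 30},
    ]
}

print(find_text(sample, "Reference ID: 3610837"))  # prints []
print(find_text(sample, "body"))
```

Against the real file, an empty list from `find_text(load_structured("structuredData.json"), "Reference ID: 3610837")` would suggest the footer text was dropped rather than filed under an unexpected path.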

erich64645996 (Author)

Participant

It's not a showstopper for us, but if that is confirmed as the reason the footer text is not being extracted, is there a chance that a future sprint or improvement cycle could add a config option to broaden the text search to the entire page?

Joel Geraci

Community Expert

Great minds... I've already submitted that as a feature request. It'll be important for documents that have been Bates-numbered too.

erich64645996 (Author)

Participant

I attached a Greenshot image capture of the text I'm referring to.
