When using the PDF Services API, the Adobe service fails to recognize multiple pages of the input PDF, even though all pages in the document are identically formatted.

April 8, 2026

I used Claude.AI to troubleshoot; here is the feedback:

Adobe Extract PDF API is silently dropping pages from this 94-page PDF. The text itself is fine: pdftotext extracts every contract perfectly, and there is no difference in text format or layout between the missed and found contracts. So the common thread isn't the content, it's which pages Adobe Extract chose to skip. Adobe just doesn't return text elements for certain pages, so the workflow never sees those Contract IDs.
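For anyone who wants to reproduce the pdftotext side of this comparison, here is a minimal sketch. It assumes Poppler's `pdftotext` is on the PATH and relies on it separating pages with a form feed character. The function names and the regex are my own illustration; the pattern simply matches the `Contract ID: NNNNNNN` text described above.

```python
import re
import subprocess

# Assumption: IDs appear as "Contract ID: " followed by digits.
CONTRACT_ID = re.compile(r"Contract ID:\s*(\d+)")

def split_pages(text):
    """Split pdftotext output into pages (pdftotext ends each page with \\f)."""
    pages = text.split("\f")
    if pages and pages[-1].strip() == "":
        pages.pop()  # drop the empty chunk after the final form feed
    return pages

def pdftotext_pages(pdf_path):
    """Run pdftotext on a file and return one string per page."""
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],  # "-" sends text to stdout
        check=True, capture_output=True, text=True,
    )
    return split_pages(result.stdout)

def contract_ids_by_page(pages):
    """Map 1-based page number to the Contract IDs found on that page."""
    return {n: CONTRACT_ID.findall(p) for n, p in enumerate(pages, start=1)}
```

Calling `contract_ids_by_page(pdftotext_pages("input.pdf"))` (the file name is a placeholder) gives a per-page baseline to hold against whatever Adobe Extract returns.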

The input PDF has 94 pages with 52 unique contracts.
All 20 missing contracts have Contract ID: NNNNNNN in identical format to the 32 that succeeded, so there's no text/regex issue.
Every page has a footer: Affidavit: Page X of Y (that's the numbering you mentioned).
18 of the 20 missing contracts are single-page (Page 1 of 1).
Pages 1-6 are ALL missing: the first 6 contracts were entirely skipped.
The remaining missing contracts are scattered (pages 21, 29, 48-55, 74-75, 77, 84, 94).
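The list of skipped pages can also be derived directly from the Extract output instead of by hand. The sketch below is a rough illustration, assuming the result is the usual `structuredData.json` with a top-level `elements` array whose entries carry a zero-based `Page` number and an optional `Text` string. Those field names are my understanding of the Extract JSON and should be checked against the actual output; the function names are hypothetical.

```python
def pages_with_text(structured_data):
    """Return the set of 1-based pages that have at least one text element."""
    found = set()
    for element in structured_data.get("elements", []):
        text = element.get("Text", "")
        if "Page" in element and isinstance(text, str) and text.strip():
            found.add(element["Page"] + 1)  # assumption: "Page" is 0-based
    return found

def missing_pages(structured_data, total_pages):
    """List 1-based pages for which no text element was returned."""
    found = pages_with_text(structured_data)
    return [p for p in range(1, total_pages + 1) if p not in found]

def as_ranges(pages):
    """Collapse a sorted page list into a compact string of ranges."""
    ranges = []
    for p in pages:
        if ranges and p == ranges[-1][1] + 1:
            ranges[-1][1] = p  # extend the current run
        else:
            ranges.append([p, p])  # start a new run
    return ", ".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)
```

After loading the JSON with `json.load`, `as_ranges(missing_pages(data, 94))` should print the skipped pages in the same form as the list above, which makes it easy to confirm whether the same pages drop out on every run.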
