OCR sometimes turns entire scanned PDF page into a single image — no text extracted during OCR + Ext

Question

Hi everyone,

I’m using Adobe PDF Services API with NestJS to process scanned PDFs. My goal is to run OCR on the scanned file, then extract structured text and layout information for rendering in a Flutter frontend.

However, I’ve noticed that sometimes the entire page is treated as one large image, and no text is extracted, even though the PDF clearly contains readable text after OCR. This causes layout issues in my Flutter app, where the image layer overlaps or replaces text.

Implementation details

const pollingURL = await pdfServices.submit({ job });
const pdfServicesResponse = await pdfServices.getJobResult({
  pollingURL,
  resultType: OCRResult
});

Extract step:

const params = new ExtractPDFParams({
  elementsToExtract: [ExtractElementType.TEXT, ExtractElementType.TABLES],
  addCharInfo: true,
  getStylingInfo: true,
  elementsToExtractRenditions: [
    ExtractRenditionsElementType.FIGURES,
    ExtractRenditionsElementType.TABLES,
  ],
});

! The issue

For some scanned PDFs, Adobe returns text and layout perfectly. But for others, after OCR, the ExtractPDFOperation only returns an image rendition of the page, with only part of the text elements or none at all. It looks like the OCR recognized the content, but the extract phase still treats the entire page as an image.

❓ My questions

Is there a way to ensure OCR always embeds or exposes recognized text, instead of producing a full-page image?

Can we configure OCR or ExtractPDF to force text-layer extraction even for low-quality scans?

How can I detect programmatically (from the JSON output) when a page has only image content and no text layer?

Are there known best practices for chaining OCR → ExtractPDF to ensure consistent text extraction results?

💻 Tech context

Backend: NestJS
Adobe PDF Services SDK: Node.js
Frontend: Flutter (renders text + layout from extracted JSON)
File type: scanned invoices and forms (mixed quality)

Any guidance, configuration examples, or recommended OCR parameters would be super helpful 🙏

Thanks in advance!
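
On the detection question, here is a minimal sketch of one possible check over the parsed Extract JSON. It assumes, without confirmation from this thread, that the output has a top-level `elements` array whose entries carry a `Path` string, a zero-based `Page` index, and a `Text` field on text elements. The helper name `summarisePages` and the `pageCount` argument are hypothetical, and the "figures but no text" rule is only a heuristic, not an official Adobe signal.

```typescript
// Assumed shape of the Extract structured JSON (field names are assumptions).
interface ExtractElement {
  Path: string;   // e.g. "//Document/P" or "//Document/Figure"
  Page?: number;  // zero-based page index (assumed)
  Text?: string;  // present on text elements (assumed)
}

interface StructuredData {
  elements: ExtractElement[];
}

interface PageSummary {
  page: number;
  textElements: number;
  figureElements: number;
  imageOnly: boolean;
}

// Hypothetical helper: count text and figure elements per page and flag
// pages that contain at least one figure but no non-empty text element.
function summarisePages(data: StructuredData, pageCount: number): PageSummary[] {
  const summaries: PageSummary[] = [];
  for (let page = 0; page < pageCount; page++) {
    summaries.push({ page, textElements: 0, figureElements: 0, imageOnly: false });
  }

  for (const el of data.elements) {
    const summary = el.Page === undefined ? undefined : summaries[el.Page];
    if (!summary) continue; // skip elements without a usable page index

    if (el.Text !== undefined && el.Text.trim().length > 0) {
      summary.textElements += 1;
    } else if (el.Path.includes("/Figure")) {
      summary.figureElements += 1;
    }
  }

  for (const s of summaries) {
    s.imageOnly = s.textElements === 0 && s.figureElements > 0;
  }
  return summaries;
}

// Illustrative data only, not real API output.
const sample: StructuredData = {
  elements: [
    { Path: "//Document/H1", Page: 0, Text: "Invoice 001" },
    { Path: "//Document/Figure", Page: 1 },
  ],
};

const flaggedPages = summarisePages(sample, 2)
  .filter((p) => p.imageOnly)
  .map((p) => p.page);
console.log("Pages with figures but no text:", flaggedPages);
```

With the made-up sample above, only page index 1 is flagged. The backend could use such a flag to retry OCR with different settings or to tell the Flutter client to render the page image without a text overlay; whether either helps depends on the scans.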
