Entire documents are nothing but images

Question

Hi,

I am trying to extract the content of the document attached to this question. When I run it through the Adobe APIs, I just get a collection of images, one for each page.

I think this has to do with how the document is parsed (probably as an SVG), and I don't know how to solve it!

Can somebody help me?

Thanks,

Giovanni

Hassnain_Abbas5027 · Answer

Hi Giovanni,

It sounds like the document is an image-based PDF rather than a text-based one. This often happens when the original was scanned, or was created in a way that embeds the text as part of the page images, so the extraction has no text layer to return.

You will probably need OCR (Optical Character Recognition) to extract the text properly. Adobe APIs have OCR capabilities, or you can try alternative tools specialized in document processing.

If you're handling document-related tasks in a business setting, you might find useful resources at wagner-inkassoservice.de (https://www.wagner-inkassoservice.de/).

Hope this helps!
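Before reaching for OCR, it can help to confirm that the file really has no text layer. The snippet below is only a rough sketch using the Python standard library, not anything from the Adobe SDK: the function name `looks_image_only` is made up for illustration, and the check (image XObjects present, no font resources anywhere) is a heuristic that can be fooled by unusual or encrypted PDFs.

```python
import re
import zlib


def looks_image_only(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the PDF declares images but no fonts at all."""
    chunks = [pdf_bytes]
    # Newer PDFs often pack object dictionaries into Flate-compressed
    # streams, so try to inflate every stream and scan those bytes too.
    for match in re.finditer(rb"stream\r?\n(.*?)endstream", pdf_bytes, re.DOTALL):
        try:
            chunks.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            pass  # not Flate data (e.g. a JPEG image stream); skip it
    data = b"\n".join(chunks)
    # Text drawn as real text needs a font resource; scanned pages have none.
    has_font = re.search(rb"/Font\b", data) is not None
    has_image = re.search(rb"/Subtype\s*/Image\b", data) is not None
    return has_image and not has_font


if __name__ == "__main__":
    import sys

    # Usage: python check_pdf.py your_document.pdf
    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as handle:
            print("image-only:", looks_image_only(handle.read()))
```

If this reports `image-only: True`, the pages are most likely scans and text extraction alone will keep returning one image per page, so an OCR pass is the next step. If it reports `False`, the file does declare fonts and the problem probably lies elsewhere.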
