Watermark detection in PDF Extract API

Question

I'm having issues where watermarking interferes with extracting tables from some documents. I made an example document in which the Extract API fails to detect the existence of a table (see attached). I have other documents I'm trying to extract data from where the watermark also interferes with the Extract API. Unfortunately, I can't share those documents. Hopefully the example document is illustrative enough.

The logging I get is as follows:

INFO:adobe.pdfservices.operation.pdfops.extract_pdf_operation:All validations successfully done. Beginning ExtractPDF operation execution
INFO:adobe.pdfservices.operation.pdfops.extract_pdf_operation:Extract Operation Successful - Transaction ID: 3I6Y4FDws6sqf1xhkgvtFStaOQeVolt6
INFO:adobe.pdfservices.operation.internal.io.file_ref_impl:Moving file at /tmp/extractSdkResult/42afb4a4b44911ec952900155d058d1f.zip to target /home/lei/output/example.zip

I attached the pdf and the service output.
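
For anyone who wants to check the service output for tables, here is a minimal sketch. It assumes the result zip contains a structuredData.json file with a top-level "elements" list in which each element has a "Path" string, and that a table root has a path ending in "/Table" or "/Table[n]". The function name find_table_paths is just an illustrative helper, not part of the SDK.

```python
import json
import re
import zipfile

# Assumption: table roots have a Path ending in "/Table" or "/Table[n]".
# Cells and rows (".../Table/TR/TD/P") are deliberately not matched.
TABLE_PATH = re.compile(r"/Table(\[\d+\])?$")


def find_table_paths(result_zip):
    """Return the Path of every element that looks like a table root.

    result_zip may be a filesystem path or a binary file-like object.
    Assumes the archive contains "structuredData.json" (see note above).
    """
    with zipfile.ZipFile(result_zip) as archive:
        with archive.open("structuredData.json") as handle:
            data = json.load(handle)
    return [
        element["Path"]
        for element in data.get("elements", [])
        if TABLE_PATH.search(element.get("Path", ""))
    ]
```

Calling find_table_paths("/home/lei/output/example.zip") on the output above and getting an empty list back would be consistent with no table having been detected.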

ld72808862 · Answer

Is there any way to remove the watermark or get the Extract API to ignore it?
