Watermark detection in PDF Extract API

New Here, Apr 04, 2022


I'm having issues where watermarking interferes with extracting tables from some documents. I made an example document in which the Extract API fails to detect the existence of a table (see attached). I have other documents that I'm trying to extract data from where the watermark interferes with the Extract API in the same way. Unfortunately, I can't share those documents. Hopefully the example document is illustrative enough.

The logging I get is as follows:

INFO:adobe.pdfservices.operation.pdfops.extract_pdf_operation:All validations successfully done. Beginning ExtractPDF operation execution
INFO:adobe.pdfservices.operation.pdfops.extract_pdf_operation:Extract Operation Successful - Transaction ID: 3I6Y4FDws6sqf1xhkgvtFStaOQeVolt6
INFO:adobe.pdfservices.operation.internal.io.file_ref_impl:Moving file at /tmp/extractSdkResult/42afb4a4b44911ec952900155d058d1f.zip to target /home/lei/output/example.zip

I attached the pdf and the service output.
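
For anyone reproducing this, one way to confirm that no table was detected is to look inside the output zip rather than rely on the log, since the log reports success either way. The following is a minimal sketch, not an official tool. It assumes the zip contains a `structuredData.json` file with a top-level `elements` list whose entries carry a `Path` such as `//Document/Table` or `//Document/Table[2]`. The function name `find_table_paths` and the regular expression are my own and are hypothetical.

```python
import json
import re
import zipfile

# Assumption: a table element's Path ends in "Table" or "Table[n]",
# e.g. "//Document/Table" or "//Document/Table[2]". Nested paths such as
# "//Document/Table/TR/TD/P" are cells, not tables, so they are skipped.
TABLE_PATH = re.compile(r"/Table(\[\d+\])?$")


def find_table_paths(zip_path, json_name="structuredData.json"):
    """Return the Path of every table-level element in an Extract output zip."""
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(json_name) as handle:
            data = json.load(handle)
    return [
        element["Path"]
        for element in data.get("elements", [])
        if TABLE_PATH.search(element.get("Path", ""))
    ]
```

Calling `find_table_paths("/home/lei/output/example.zip")` on the output from the log above would return an empty list if the table was missed, under the assumptions stated.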

TOPICS
PDF Extract API

New Here, Apr 04, 2022


Is there any way to remove the watermark or get the Extract API to ignore it?

Adobe Employee, Apr 12, 2022


You are probably out of luck on getting the PDF Extract API to ignore the watermark.

The watermark in your PDF does not appear to be a properly generated watermark. Normally I would suggest opening the file in Acrobat and selecting "Remove Watermark", but that does not work here because the watermark is not written into the file the way a watermark should be, just as an ordinary object.

You can remove the watermark manually in Adobe Acrobat DC by clicking Edit PDF, selecting each of the characters, and deleting them.
