
How to programmatically identify PDFs containing black boxes or black objects (attempting to identify incorrect attempts at redacting text)

New Here,
Jan 19, 2017

We have identified a few PDFs where users thought they were redacting text but were not. They were doing things like "highlighting" text in black in other applications such as Microsoft Word and then converting the document to PDF. When viewed in Adobe Acrobat, the text appears redacted because you see a black box instead of text. However, you soon discover that you can copy and paste the black box area into Notepad and see the supposedly redacted text. You can also edit the object, select the black box, and delete it, which reveals the supposedly redacted text underneath.

We know how to fix the PDFs so that the text is truly redacted, and we know the steps to give users so they redact text correctly in the future. What I am researching now is a way to quickly identify all PDFs affected by this issue, instead of the more tedious route of opening each PDF and testing the blacked-out areas. Is it possible, based on how PDFs encode black objects on a page, to programmatically identify suspect PDFs?
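To illustrate the kind of check being asked about, here is a minimal Python sketch that works outside Acrobat. It assumes you already have a page's content stream as decoded (uncompressed) bytes, obtained with a PDF library of your choice, and it flags a page that both fills a rectangle in black and shows text. The function name `scan_content_stream` and the result keys are hypothetical, chosen only for this example. It is a rough heuristic and not a complete parser: it ignores nested parentheses in strings, inline images, form XObjects, colors set through `sc`/`scn`, and whether the rectangle actually covers the text.

```python
import re

# Rough tokenizer for a decoded (uncompressed) PDF content stream.
TOKEN = re.compile(
    rb"\((?:\\.|[^\\()])*\)"   # literal string (nested parentheses not handled)
    rb"|<[0-9A-Fa-f\s]*>"      # hex string
    rb"|/[^\s/\[\]()<>]*"      # name such as /F1
    rb"|[^\s/\[\]()<>]+",      # number or operator
    re.S,
)

FILL_OPS = {"f", "F", "f*", "B", "B*", "b", "b*"}   # operators that fill the current path
END_PATH_OPS = FILL_OPS | {"S", "s", "n"}           # operators that end the current path
TEXT_OPS = {"Tj", "TJ", "'", '"'}                   # operators that show text


def scan_content_stream(data: bytes) -> dict:
    """Hypothetical helper: report black-filled rectangles and text on one page."""
    fill_black = True      # the initial fill color in PDF is black
    saved = []             # fill state saved by q, restored by Q
    operands = []          # numeric operands seen since the last operator
    pending = []           # rectangles added to the current path by `re`
    black_rects = []
    has_text = False

    for match in TOKEN.finditer(data):
        tok = match.group().decode("latin-1")
        if tok[0] in "(</":
            continue                       # strings and names are not needed here
        try:
            operands.append(float(tok))
            continue
        except ValueError:
            op = tok                       # not a number, so treat it as an operator

        if op == "q":
            saved.append(fill_black)
        elif op == "Q" and saved:
            fill_black = saved.pop()
        elif op == "g":
            fill_black = operands[-1:] == [0.0]
        elif op == "rg":
            fill_black = operands[-3:] == [0.0, 0.0, 0.0]
        elif op == "k":
            fill_black = operands[-4:] == [0.0, 0.0, 0.0, 1.0]
        elif op in ("sc", "scn"):
            fill_black = False             # color space is not tracked in this sketch
        elif op == "re" and len(operands) >= 4:
            pending.append(tuple(operands[-4:]))
        elif op in TEXT_OPS:
            has_text = True

        if op in END_PATH_OPS:
            if op in FILL_OPS and fill_black:
                black_rects.extend(pending)
            pending = []
        operands = []

    return {
        "black_rects": black_rects,
        "has_text": has_text,
        "suspect": bool(black_rects) and has_text,
    }


# Example: a black rectangle followed by a text-showing operator.
page = b"q 0 0 0 rg 72 700 200 14 re f Q BT /F1 12 Tf 72 702 Td (secret) Tj ET"
print(scan_content_stream(page)["suspect"])
```

A check like this would probably over-report, since many legitimate pages contain both black rectangles and text. Comparing each rectangle's position with the text positions would narrow the results, and any flagged file would still need a manual look.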

TOPICS
Edit and convert PDFs
Community Expert,
Jan 19, 2017

Maybe with a plugin. A script can't do it.

Community Expert,
Jan 19, 2017

Yes, a plug-in can do that, but you may have a hard time finding one that already implements this feature. My first candidate would be Enfocus PitStop Pro, which may have features that allow you to create a profile that finds only black boxes. An evaluation version is available, but PitStop has a steep learning curve: PitStop Pro Software - Detect & Correct Errors in PDFs

If this does not work out, you may have to look into a custom plug-in. If that's the route you want to go, feel free to get in touch with me (my contact information is on my profile page).

New Here,
Jan 19, 2017

Thank you so much for this information. I am reviewing the documentation for PitStop right now and will post an update within the next day. Fingers crossed!
