
How to FIND PDF files that have been Redacted without opening every single one to look.

Community Beginner, Jun 09, 2023


I work with thousands of documents for court cases and need to search for and find all those that have been fully redacted. What does Adobe do to indicate that a file was redacted, aside from renaming it with "Redacted" at the end (which some individuals turn off), and how can we search for that property? JavaScript seems to find only files marked for redaction, not those fully redacted.

TOPICS
Edit and convert PDFs, How to, JavaScript, PDF, Standards and accessibility

Views

2.1K

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
LEGEND, Jun 09, 2023


So far as I know, a redacted PDF just has the changes made, and the original info deleted. There is no way to identify it, or search for redactions. Some people, indeed, may need to run workflows where the action of redaction cannot be detected.

Community Expert, Jun 11, 2023


The only thing I can think of is to search for files with large rectangular areas that are completely black.

This is of course very problematic, as files can contain large black images without being redacted, or the redaction can have another color, etc., but it's the only way to do it as there's no "this file has been redacted" tag or anything like that.

Community Beginner, Jun 12, 2023


Thank you. I expect you would know, as it is your code I've used to find marked and fully redacted files together. But how does one search for completely black rectangles without having to open every file? Is that possible?

Community Expert, Jun 12, 2023


This won't be possible using a script. Possibly with a plugin, or a stand-alone tool.

The latter can process files without displaying them (of course they have to be opened, at least at the memory level, to read their contents, though).

New Here, Oct 31, 2023


I am working on detecting black rectangles to find redacted files, using Python + OpenCV.

  1. Convert the PDF to an image file such as JPEG or PNG
  2. Read the image file as a grayscale image
  3. Convert it to a binary image
  4. Use dilation to erase text and noise
  5. Detect contours and, if any are found, highlight the black rectangles
  6. Save the image file

I'm still working on it, but I think it will work as I hope.

Community Expert, Jun 11, 2023


What do you mean by "fully Redacted"?

Community Beginner, Jun 12, 2023


Redactions made to text using the PDF Redact tool that have been "Applied", not simply marked for redaction, which creates the red box where one can still see the text to be redacted.
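
To make the distinction concrete: a mark that has not been applied is stored in the file as an annotation with `/Subtype /Redact`, which is why marked files can be found by script. The sketch below looks for that token in the raw bytes of each PDF in a folder, outside Acrobat. It is only a rough illustration. The function names are made up for this example, and it will miss marks stored inside compressed object streams, since it does not parse the PDF. Consistent with the replies in this thread, it finds only marked files, because applied redactions leave no such annotation behind.

```python
import re
from pathlib import Path

# Pending redaction marks are annotations whose dictionary contains
# /Subtype /Redact (whitespace between the two names is optional).
REDACT_MARK = re.compile(rb"/Subtype\s*/Redact\b")


def has_pending_redaction_marks(pdf_path):
    """True if the raw file bytes contain a Redact annotation subtype.

    Assumption: the annotation dictionary is stored uncompressed.
    """
    return REDACT_MARK.search(Path(pdf_path).read_bytes()) is not None


def files_with_pending_marks(folder):
    """Sorted list of PDFs under `folder` that still carry redaction marks."""
    return sorted(
        p for p in Path(folder).rglob("*.pdf") if has_pending_redaction_marks(p)
    )
```

Calling `files_with_pending_marks("some_folder")` (a placeholder path) returns the matching files without displaying any of them, although each file is still read into memory.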

Community Expert, Nov 01, 2023


You need to reverse your process.

Since it's impossible to detect an absence, you should look for a presence.

In other words, look for documents that have not been redacted.
I assume they must contain recurring information.

Community Expert, Nov 01, 2023


That only works if you know what was redacted in the first place...
