
Removing duplicate pages in PDFs?

Community Beginner, Mar 14, 2017

I often need to convert emails to PDFs in my job, because some emails contain information that must be redacted, such as student information (I work for a university). When the requested emails are pulled from several different accounts, the same email often appears in each account. Is there a way to remove the duplicates before I start reviewing and redacting, so I don't have to make the same redactions over and over?

Thanks for any help.

TOPICS
Acrobat SDK and JavaScript
Adobe Employee, Mar 15, 2017

Moving your discussion to see if anyone in the JavaScript area knows of a way to do this.

Community Expert, Mar 15, 2017

This might be possible. Here is how I would approach this:

JavaScript cannot access the raw PDF content of a page; all you can do is iterate over the "words" on a page, and what Acrobat considers a word may not match your interpretation in every case. You could create a "checksum" from the words on each page and then look for pages that produce the same checksum. Depending on how you create this checksum, two different pages could produce the same value, so you may still have to compare the pages word by word to make sure you are dealing with an exact duplicate. You would then mark each duplicate page for deletion and, in a final step, delete the marked pages working from the end of the document toward the front, so that the page numbers of the pages still to be deleted do not shift.
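Here is a minimal sketch of that approach, not a tested production script. It assumes the standard Acrobat JavaScript Doc members numPages, getPageNumWords, getPageNthWord and deletePages. The helper names (getPageWords, checksum, sameWords, findDuplicatePages, deleteDuplicatePages) and the simple 32-bit hash are my own illustrative choices. Pages with no extractable words (for example scanned images) are skipped rather than treated as duplicates of each other, which is also an assumption on my part.

```javascript
// Collect the words Acrobat reports for one page (0-based page index).
function getPageWords(doc, pageNum) {
    var words = [];
    var count = doc.getPageNumWords(pageNum);
    for (var i = 0; i < count; i++) {
        // Third argument (bStrip) = true strips surrounding punctuation and whitespace.
        words.push(doc.getPageNthWord(pageNum, i, true));
    }
    return words;
}

// Simple 32-bit hash over the page's words (illustrative, not cryptographic).
function checksum(words) {
    var text = words.join("\u0001");
    var hash = 0;
    for (var i = 0; i < text.length; i++) {
        hash = ((hash * 31) + text.charCodeAt(i)) | 0;
    }
    return hash;
}

// Word-by-word comparison, used to confirm a checksum match.
function sameWords(a, b) {
    if (a.length !== b.length) { return false; }
    for (var i = 0; i < a.length; i++) {
        if (a[i] !== b[i]) { return false; }
    }
    return true;
}

// Return the 0-based indexes of pages that repeat an earlier page.
function findDuplicatePages(doc) {
    var seen = {};        // checksum key -> word lists of pages kept so far
    var duplicates = [];
    for (var p = 0; p < doc.numPages; p++) {
        var words = getPageWords(doc, p);
        if (words.length === 0) { continue; }  // assumption: never flag wordless pages
        var key = "h" + checksum(words);
        var candidates = seen[key] || (seen[key] = []);
        var isDuplicate = false;
        for (var c = 0; c < candidates.length; c++) {
            if (sameWords(candidates[c], words)) { isDuplicate = true; break; }
        }
        if (isDuplicate) { duplicates.push(p); } else { candidates.push(words); }
    }
    return duplicates;
}

// Delete the marked pages from the end so earlier page indexes stay valid.
function deleteDuplicatePages(doc) {
    var duplicates = findDuplicatePages(doc);
    for (var i = duplicates.length - 1; i >= 0; i--) {
        doc.deletePages(duplicates[i], duplicates[i]);
    }
    return duplicates.length;
}
```

In the Acrobat JavaScript console you would call deleteDuplicatePages(this) on a copy of the file. Because only the words are compared, two pages with the same text but different images or layout would be treated as duplicates, so it is worth reviewing the list from findDuplicatePages(this) before deleting anything.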

If you need help with any of these steps, that's what I do for a living.
