Combining PDFs and sorting similar pages

Engaged, Feb 19, 2022

I have two PDF files whose contents partly overlap, i.e. quite a number of pages appear in both files.
I could combine both files into a third one.
But from that point on, how would I go about removing the duplicate pages?
Is there some way of sorting the pages so that duplicates are grouped together and can be deleted more conveniently, rather than scrolling back and forth through the entire PDF?

These are fairly large (150-200 pages) PDF files from scanned documents.
Acrobat Pro 2020

Thanks.

TOPICS
Windows

Correct answer

LEGEND, Feb 19, 2022

Detecting duplicates is a major challenge. I can't see a simple way to do it.

Also, they won't actually be duplicates: one scan might be offset by 0.1 mm and rotated by 0.5 degrees relative to the other, which would make the contents of the page completely different (as a graphic).
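
To illustrate the point about scans, here is a minimal Python sketch. It does not read a real PDF: the "page" is a toy grid of characters standing in for pixels, and `shift_right` and `digest` are hypothetical helper names invented for this example. It only shows that an exact fingerprint changes as soon as the content moves by one column.

```python
import hashlib

# Toy stand-in for a scanned page: each string is one row of pixels.
page = [
    "..........",
    "..####....",
    "..#..#....",
    "..####....",
    "..........",
]

def shift_right(rows, n=1):
    """Shift every row right by n columns, padding with blank pixels."""
    return ["." * n + row[:-n] for row in rows]

def digest(rows):
    """Exact fingerprint of the page content (SHA-256 of all rows)."""
    return hashlib.sha256("\n".join(rows).encode()).hexdigest()

rescanned = shift_right(page)

# The same page "scanned" one pixel to the right no longer matches exactly.
print(digest(page) == digest(rescanned))
```

Any exact comparison of the page images behaves the same way, which is why a byte-for-byte duplicate check is unlikely to help with scanned documents.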

Engaged, Feb 19, 2022

Thank you.

Bad luck then. I was afraid of this.
It was a long shot. I was hoping that, using OCR/page recognition and keywords found in the pages, the pages could be indexed and sorted by some sort of similarity.

Anyway, thanks again.
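
The keyword idea above can be sketched roughly as follows in Python. This is not an Acrobat feature. It assumes the OCR text of every page has already been exported into a list of strings, and the function names, the minimum word length, and the sample page texts are all invented for illustration.

```python
import re

def keywords(text, min_len=4):
    """Lowercased set of words with at least min_len letters."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if len(w) >= min_len}

def jaccard(a, b):
    """Overlap of two keyword sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

def order_by_similarity(page_texts):
    """Greedy ordering: start at page 0, then always move to the most
    similar page not yet placed. Returns original page indices."""
    sets = [keywords(t) for t in page_texts]
    if not sets:
        return []
    order = [0]
    remaining = set(range(1, len(sets)))
    while remaining:
        last = sets[order[-1]]
        # Ties are broken in favour of the lower page index.
        nxt = max(sorted(remaining), key=lambda i: jaccard(last, sets[i]))
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Made-up OCR output for four pages of a combined file.
pages = [
    "Invoice number 1001 for office chairs and desks",
    "Meeting minutes: budget review and staffing plan",
    "Invoice numher 1001 for office chairs and desks",  # re-scan with an OCR typo
    "Meeting minutes budget review and staffing plans",
]
print(order_by_similarity(pages))
```

With these sample texts the two invoice pages (indices 0 and 2) end up next to each other, as do the two meeting pages, so likely duplicates could be reviewed side by side. Pages with little or no recognized text would need to be handled separately.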

 

Community Expert, Feb 20, 2022

If the results of the OCR are relatively good then it should be possible.

Can you share a sample file with us?
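
As a rough sketch of what "possible" could look like, assuming the OCR text of each page in both files has already been extracted into Python lists (the extraction step is not shown), one could compare every page of file A with every page of file B and flag near-matches. The function names, the 0.9 threshold, and the sample texts are assumptions for illustration, not a tested workflow.

```python
from difflib import SequenceMatcher

def normalize(text):
    """Collapse whitespace and case so layout differences do not count."""
    return " ".join(text.lower().split())

def find_duplicate_candidates(pages_a, pages_b, threshold=0.9):
    """Return (index_a, index_b, ratio) for page pairs with nearly equal text.

    threshold is an arbitrary starting point, not a recommended value.
    """
    norm_a = [normalize(t) for t in pages_a]
    norm_b = [normalize(t) for t in pages_b]
    matches = []
    for i, a in enumerate(norm_a):
        for j, b in enumerate(norm_b):
            # autojunk=False avoids difflib's heuristic for long sequences.
            ratio = SequenceMatcher(None, a, b, autojunk=False).ratio()
            if ratio >= threshold:
                matches.append((i, j, ratio))
    return matches

# Made-up OCR output; the second file repeats one page with an OCR error.
file_a = ["Chapter 1  Introduction to the archive", "Chapter 2 Catalogue of letters"]
file_b = ["Chapter 2 Catalogue of 1etters", "Chapter 3 Index of photographs"]

for i, j, ratio in find_duplicate_candidates(file_a, file_b):
    print(f"A page {i + 1} ~ B page {j + 1} (similarity {ratio:.2f})")
```

Comparing character sequences rather than exact strings tolerates the occasional OCR misread. Every page pair is compared, so the cost grows with the product of the two page counts. The flagged pairs are only candidates and would still need a visual check before deleting anything.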
