
Which API would be best suited to check if two PDF files are the same?

New Here, Jul 17, 2024


I have a use case in which multiple PDF files are uploaded. I want to make sure no duplicate files get uploaded based on all the documents that have already been uploaded. What would be the best way to achieve this? Thank you.

Views

218

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Community Expert, Jul 18, 2024


We don't have an API for that. However, the safest way to detect duplicate PDF files is to convert each page to a high-resolution image and then compare the images page by page. There are many tools that will help you do that.
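As a rough illustration of the page-by-page comparison described above, here is a minimal Python sketch. It is not an Adobe API. The rendering step is deliberately left out: `render_pages` is a hypothetical callable you would supply, wrapping whatever rendering tool you choose, that returns the raw pixel bytes of each page in order. The `DuplicateIndex` class and `page_fingerprint` function are likewise names made up for this example. To avoid re-comparing every new upload against every stored file, the sketch hashes each rendered page and keeps one fingerprint per document.

```python
import hashlib
from typing import Callable, Dict, Iterable, Optional, Tuple

# Hypothetical renderer signature (an assumption, not a real library API):
# takes a PDF path and yields the raw pixel bytes of each page, in page order.
PageRenderer = Callable[[str], Iterable[bytes]]


def page_fingerprint(path: str, render_pages: PageRenderer) -> Tuple[str, ...]:
    """Return one SHA-256 hex digest per rendered page, in page order."""
    return tuple(
        hashlib.sha256(pixels).hexdigest() for pixels in render_pages(path)
    )


class DuplicateIndex:
    """Remembers fingerprints of uploaded PDFs and reports exact repeats."""

    def __init__(self, render_pages: PageRenderer) -> None:
        self._render_pages = render_pages
        # Maps a document fingerprint to the first path that produced it.
        self._seen: Dict[Tuple[str, ...], str] = {}

    def add(self, path: str) -> Optional[str]:
        """Register a file.

        Returns the path of an earlier upload whose rendered pages are
        pixel-for-pixel identical, or None if this file is new.
        """
        fingerprint = page_fingerprint(path, self._render_pages)
        earlier = self._seen.get(fingerprint)
        if earlier is None:
            self._seen[fingerprint] = path
        return earlier
```

Usage would look like `index = DuplicateIndex(my_renderer)` followed by `index.add(uploaded_path)` for each upload. Note that this only matches pages that render to exactly the same pixels, so every file must be rendered with the same tool, resolution, and settings. Tolerating small rendering differences would need a fuzzier image comparison than a hash.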

New Here, Jul 22, 2024


That sounds quite computationally expensive. Are there any other approaches that you might recommend? Also, for the above approach, are there any tools that you recommend? Thank you @Joel_Geraci

New Here, Jul 29, 2024


Just wanted to check in again @Joel_Geraci. Thanks
