Acrobat Pro DC comparison issue

Community Beginner, Aug 17, 2020

Hi, 

I am having trouble comparing PDF files.

 

First, some details about my OS and Acrobat software:-

- Windows 10 Pro, i5, 16GB RAM, 64-bit

- Adobe Acrobat Pro DC, version 2020.012.20041

 

I am trying to compare the first and second versions of a legal contract.  The only differences between the two versions are some added and deleted words and paragraphs.  After I ran the comparison, the results showed only "whole page deleted" and "whole page inserted", which cannot be right, as only minor changes were made to the text.

 

Some information about the first and second versions of the document:-

 

- First Version:  this document was sent to me by e-mail by ABC Ltd.  Under properties, it shows that the PDF producer is "Adobe Acrobat Pro DC 19.21.20049".  When I open this first version, I am able to select and copy text.

 

- Second Version:  ABC Ltd printed this document, signed it, and delivered the hardcopy to me.  I scanned this into my PC using my office scanner (RICOH).  Under properties, it shows that the PDF producer is "Adobe PSL 1.3e for Canon".  When I scanned this document, I adjusted the settings so that the scanned document would be OCR'd.  When I open this second version, I am able to select and copy text.

 

Further information:-

- After I pressed the "Compare Files" button and selected both the Old and New Files, a yellow question mark appeared under the New File, and when I put my mouse cursor over it, the following message appeared:  "Selected document is a scanned PDF and contains no text.  Acrobat will perform image to image comparison only."

- I have tried printing this Second Version from Acrobat by selecting "Adobe PDF" as the printer.  Let's call this printout the "new Second Version".  Then I tried running a comparison between the First Version and the new Second Version, but I still have the same problem mentioned above.

 

Please help.  Checking and comparing documents is a big part of my daily job, and this is the main reason I purchased Acrobat Pro DC.  Thank you very much.

TOPICS
How to, Scan documents and OCR

Community Beginner, Aug 17, 2020

Apparently, I was approached by a scammer after making the above post, asking me to contact the e-mail address below for Adobe customer support:-

 

Adobecare.Experts@protonmail.com

Community Expert, Aug 17, 2020

What OCR settings did you use?

Community Beginner, Aug 17, 2020

I can't remember.  I can check tomorrow to confirm.  But I had two options:  (1) OCR fast, and (2) OCR precision.  I chose OCR precision, and it took about 25 minutes for the scan to complete.  The document had 50 pages.

Community Beginner, Aug 19, 2020

Hi Bernd Alheit, I have checked my office scanner.  I used the following OCR settings:-

 

PDF OCR (prioritize precision)

Colour: black & white to grayscale

resolution: 300 x 300 dpi

copy ratio: 100%

LEGEND, Aug 19, 2020

Good job spotting the scammer. The message "Selected document is a scanned PDF and contains no text." seems clear: there IS no text, so OCR was not done on AT LEAST ONE of the documents. If you think the message is wrong, please check by trying to select and copy text in BOTH files. OCR must be done, well and correctly, if you want to compare text.

Community Beginner, Aug 19, 2020

Thank you for your reply.

 

I am able to select and copy text in both files.  I believe the problem is with the New / Second File (which was scanned).  But I believe that the scan was done properly.  I set OCR to precision mode, and it took 25 minutes for the scan / OCR to complete.

 

I have even tried running (1) enhance scan, and (2) text recognition on both files, and then running the comparison again.  But I still have the same problem.  The comparison couldn't detect any text.

Community Expert, Aug 19, 2020

In Acrobat try the OCR option "Editable Text and Images" 

Bild1.jpg

Community Beginner, Aug 19, 2020

Hi Bernd Alheit,

 

I have tried your settings above as well, and it still doesn't work.  I cannot share the whole file (as it contains confidential information).  But I can share one page of the New File and Old File.  Can you try comparing these two files from your side?  Thanks.

 

1. Old File

 

2. New File 

 

Thank you.

Adobe Employee, Aug 19, 2020

Hi,

Could you please try comparing the files using the option shown in the screenshot below?

Let us know if you're facing the same issue.

aakash4_0-1597835509187.png

Regards

Adobe Acrobat DC Team

Community Beginner, Aug 19, 2020

Hi aakash4,

 

I have tried your settings, and it still doesn't work.  I cannot share the whole file (as it contains confidential information).  But I can share one page of the New File and Old File:-

 

1. Old File

 

2. New File 

 

Thank you.

Adobe Employee, Aug 19, 2020

Hi,

 

Have you tried comparing the files using 'Compare Text only' mode?

 

Regards

Adobe Acrobat DC Team

Community Beginner, Aug 20, 2020

Update:

 

After numerous attempts and hours of research and asking around, I have been getting better comparison results by further adjusting the scan settings.

 

Just one more question:  The New File is a scanned document with binding / punched holes, whereas the Old File has no binding / punched holes.  So, when I run a comparison between the old and new file, the holes are detected as text changes.  This is confusing as there are 21 holes on each page, and there are 50 pages, and the comparison would show there are 1,050 text changes.

 

Thank you.

Adobe Employee, Aug 20, 2020

Hi,

 

Are you getting the punch holes as differences in 'Compare Text only' mode as well?

Using the 'Filter' option on the Compare app toolbar might help here by filtering out or ignoring particular types of differences.

 

PS: The better the quality of the scanned file, the better the OCR output will be, and that should improve the quality of the comparison results as well.

 

Regards

Adobe Acrobat DC Team

Community Beginner, Aug 20, 2020

Hi, 

 

Yes, I selected "Compare Text only".  The compare function treats the binding / punched holes as letters "I", "L", "t", "i".

 

By "Filter", do you mean the "Settings" where I check various boxes under "Show in Report"?  If that's what you are referring to, I checked the box "text" only.  Further, I don't see any filter for ignoring punched holes.

 

Please advise.  Thanks.
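
As a rough illustration of the punched-hole problem described above, here is a minimal Python sketch that works outside Acrobat on text copied out of the scanned PDF. It assumes the holes sit in the binding margin and therefore show up as stray tokens at the start of each line; the function names and the default token set (the four letters mentioned above) are hypothetical, and a line that genuinely begins with the word "I" would lose it, so treat this as a starting point rather than a fix.

```python
# Sketch only: strips hole-like tokens from the start of each copied line.
# HOLE_LOOKALIKES is an assumption based on the letters reported in this thread.
HOLE_LOOKALIKES = frozenset({"I", "L", "t", "i"})


def strip_margin_tokens(line: str, lookalikes=HOLE_LOOKALIKES) -> str:
    """Drop leading tokens that are probably OCR'd punch holes."""
    tokens = line.split()
    start = 0
    # Skip every leading token that matches a hole lookalike exactly.
    while start < len(tokens) and tokens[start] in lookalikes:
        start += 1
    return " ".join(tokens[start:])


def clean_page(text: str) -> str:
    """Apply strip_margin_tokens to every line of a page of copied text."""
    return "\n".join(strip_margin_tokens(line) for line in text.splitlines())
```

For example, `clean_page("i Clause 1\nL t Clause 2")` keeps only the clause text on each line. Tokens that appear anywhere other than the start of a line are left alone.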

Adobe Employee, Aug 23, 2020

Hi,

 

I'm talking about the 'Filter' menu on the Compare toolbar. Please refer to the attached screenshot below:

AakashDeep_0-1598247608993.png

Also, could you please share some sample files with me so that I can investigate further?

 

Please follow these steps to share the file using Adobe Send - https://cloud.acrobat.com/send

  • Open this link.
  • Click on "Select files to Send".
  • Click the link "Choose file from my computer" and select the file.
  • Click on "Add Files" if you have more files to add.
  • Click on "Create Link".
  • Share this link with us.

New Here, Nov 18, 2020

Was this ever resolved? I am going through the same challenge. I have an original lease document, and a scanned and signed (signed with ink) copy that was sent back to me. I need to compare 45 pages but it just shows every page as deleted or inserted.

Community Expert, Nov 18, 2020

Yes, it was resolved.  For PDF there are different types of comparison. The current comparison is telling you that you have two completely different documents, which is correct. What you want is to compare text, and only text. To do that, the scanned document must be OCR'd.  You'll need to check the text after OCR to make sure it's really legible text. Do this by copying and pasting some text into a different document.

Thom Parker - Software Developer at PDFScripting
Use the Acrobat JavaScript Reference early and often
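
The copy-and-paste check described above can also be done a little more systematically. The following is a minimal Python sketch, not an Acrobat feature: it takes a sample of text pasted from the OCR'd PDF and guesses whether it looks like real words. The function name and both thresholds are assumptions chosen for illustration.

```python
import re

# A token is "word-like" if it is letters/digits, optionally wrapped in
# punctuation, e.g. "contract," or "(2020)".
_WORDLIKE = re.compile(r"^\W*\w[\w'’\-]*\W*$")


def looks_like_real_text(sample: str, min_words: int = 5,
                         min_ratio: float = 0.6) -> bool:
    """Heuristic: True if most tokens in the pasted sample look like words."""
    tokens = sample.split()
    if len(tokens) < min_words:
        # Too little text to judge; treat as "no usable text layer".
        return False
    good = sum(1 for token in tokens if _WORDLIKE.match(token))
    return good / len(tokens) >= min_ratio
```

An empty paste or a run of symbols returns `False`, which suggests the OCR step needs to be redone. Note that this only checks that the tokens are word-shaped; it cannot tell whether the OCR recognised the right words.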

Community Beginner, Nov 18, 2020

Hi, I have found a workaround.  It's not the best solution, but we need to wait until Adobe improves the software further.

 

For the scanned and signed copy, make sure that you scan the document at the highest resolution possible.  For my scanner, the highest was 600 x 600 dpi.  If your scanner has an OCR function, do not use it.  Given the large file size (as you used the highest resolution), the PDF will come out as separate files, so you need to combine / merge these PDF files first.  After merging, run OCR in Acrobat.  Then do the comparison.  You should be able to compare the files and find the differences.  I found that the result was about 95% accurate.  If the PDF file has handwritten words / diagrams / pictures, the accuracy drops to about 90%.  If the PDF scan has punched holes, the accuracy drops dramatically.  It is a very tedious process involving a few extra steps, but better than comparing the whole document manually.

 

There are actually a few articles (that you can google) discussing in great detail how (1) the scan resolution, and (2) the font style of a PDF document affect the comparison result.  It would be nice if Adobe could include this information in the forums / FAQ.

 

Hope this helps, and it would be great if you could share a better method!
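
As a cross-check on a text-only comparison like the one described above, here is a minimal Python sketch using the standard library's `difflib`. It assumes you have already copied the text out of both PDFs into plain strings; it is not how Acrobat's Compare tool works internally, and the function name `word_diff` is hypothetical.

```python
import difflib


def word_diff(old_text: str, new_text: str):
    """Return (changes, similarity) for a word-by-word comparison.

    changes is a list of (tag, old_words, new_words) tuples, where tag is
    'replace', 'delete', or 'insert'; similarity is a float from 0.0 to 1.0.
    """
    old_words, new_words = old_text.split(), new_text.split()
    # autojunk=False keeps frequent words such as "the" from being ignored
    # in long documents.
    matcher = difflib.SequenceMatcher(None, old_words, new_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append(
                (tag, " ".join(old_words[i1:i2]), " ".join(new_words[j1:j2]))
            )
    return changes, matcher.ratio()
```

Splitting on whitespace means that line breaks and page layout are ignored, so differences in how the scan wraps lines do not show up as changes. OCR misreads will still appear as replaced words.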

New Here, Nov 18, 2020

Thanks for the reply, I will give this a try. Given how much time this takes, it always seems worth it to just read the whole document page by page instead!
