Participating Frequently
August 17, 2020
Question

Acrobat Pro DC comparison issue

  • 5 replies
  • 9124 views

Hi, 

I am having trouble in comparing PDF files.

 

First, some details about my OS and Acrobat software:-

- Windows 10 Pro, i5, 16GB RAM, 64-bit

- Adobe Acrobat Pro DC, version 2020.012.20041

 

I am trying to compare the first and second versions of a legal contract.  The only differences between the two versions are some added / deleted words and paragraphs.  After I ran the comparison, the results only showed "whole page deleted" and "whole page inserted", which can't be right, as only minor changes were made to the text.

 

Some information about the first and second versions of the document:-

 

- First Version:  this document was sent to me by e-mail by ABC Ltd.  Under properties, it shows that the PDF producer is "Adobe Acrobat Pro DC 19.21.20049".  When I open this first version, I am able to select and copy text.

 

- Second Version:  ABC Ltd printed this document, signed it, and delivered the hardcopy to me.  I scanned this into my PC using my office scanner (RICOH).  Under properties, it shows that the PDF producer is "Adobe PSL 1.3e for Canon".  When I scanned this document, I adjusted the settings so that OCR would be applied to the scan.  When I open this second version, I am able to select and copy text.

 

Further information:-

- After I pressed the "Compare Files" button and selected both the Old and New Files, a yellow question mark appeared under the New File. When I hovered my mouse cursor over it, the following message appeared:  "Selected document is a scanned PDF and contains no text.  Acrobat will perform image to image comparison only."

- I have tried printing this Second Version from Acrobat by selecting "Adobe PDF" as the printer.  Let's call this printout the "new Second Version".  I then ran a comparison between the First Version and the new Second Version, but still got the same problem described above.

 

Please help.  Checking and comparing documents is a big part of my daily job, and this is the main reason I purchased Acrobat Pro DC.  Thank you very much.


5 replies

Participating Frequently
August 21, 2020

Update:

 

After numerous rounds of attempts and hours of research and asking around, I have been getting better comparison results by further adjusting the scan settings.

 

Just one more question:  The New File is a scanned document with binding / punched holes, whereas the Old File has none.  So, when I run a comparison between the old and new files, the holes are detected as text changes.  This is confusing: with 21 holes on each page and 50 pages, the comparison shows 1,050 text changes.
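For anyone who wants to cross-check outside Acrobat, here is a minimal Python sketch of the idea, not anything Acrobat itself does. It assumes the text of both files has already been exported to plain strings, and it assumes (this is a guess, not something confirmed in this thread) that OCR renders each punched hole as an isolated one-character token such as "o" or "●". The names `HOLE_TOKENS`, `strip_hole_tokens` and `count_changes` are made up for illustration. It drops those tokens before running a word-level diff, so 21 holes x 50 pages = 1,050 spurious hits would not be counted as changes.

```python
import difflib

# Assumption: OCR turns punched holes into isolated one-character tokens like these.
HOLE_TOKENS = {"o", "O", "0", "•", "●", "○"}

def strip_hole_tokens(text):
    """Split text into words and drop tokens assumed to be punch-hole artefacts."""
    return [word for word in text.split() if word not in HOLE_TOKENS]

def count_changes(old_text, new_text):
    """Count the changed regions in a word-level diff after filtering hole tokens."""
    old_words = strip_hole_tokens(old_text)
    new_words = strip_hole_tokens(new_text)
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    return sum(1 for tag, *_ in matcher.get_opcodes() if tag != "equal")

old = "The tenant shall pay rent monthly"
new = "o The tenant shall o pay rent quarterly ●"
print(count_changes(old, new))
```

Note that this filter would also drop a genuine standalone "0" or "o" in the contract text, so the token set would need adjusting to whatever the OCR actually produces for the holes.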

 

Thank you.

Adobe Employee
August 21, 2020

Hi,

 

Are you getting the punch holes as differences in 'Compare Text only' mode as well?

Using the 'Filter' option on the Compare app toolbar might help you in this regard in filtering/ignoring the various types of differences. 

 

PS: The better the quality of the scanned file, the better will be the OCR output. And that should improve the quality of results as well.

 

Regards

Adobe Acrobat DC Team

Participating Frequently
November 18, 2020

Was this ever resolved? I am going through the same challenge. I have an original lease document, and a scanned and signed (signed with ink) copy that was sent back to me. I need to compare 45 pages but it just shows every page as deleted or inserted.


Hi, I have found a workaround.  It's not the best solution, but it will have to do until Adobe makes further improvements to the software.

 

For the scanned and signed copy, make sure that you scan the document at the highest resolution possible.  For my scanner, the highest was 600 x 600 dpi.  If your scanner has an OCR function, do not use it.  Because the file is large at the highest resolution, the scan will come out as several separate PDF files, so you need to combine / merge these PDF files first.  After merging, run OCR using Adobe Acrobat.  Then do the comparison.  You should be able to compare and find the differences.  I found that the result was about 95% accurate.  If the PDF file has handwritten words / diagrams / pictures, the accuracy drops to about 90%.  If the PDF scan has punched holes, the accuracy drops dramatically.  It is a very tedious process involving a few extra steps, but better than comparing the whole document manually.
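As a rough way to sanity-check the OCR result before trusting the comparison, a small Python sketch like the one below could flag pages where the recognized text looks nothing like the original. It is only an illustration: it assumes you have already exported the text of each page of both files into two lists of strings, the function names are hypothetical, and the 0.9 threshold is an arbitrary choice rather than a value taken from Acrobat.

```python
import difflib

def page_similarity(old_pages, new_pages):
    """Return a word-level similarity ratio (0.0 to 1.0) for each pair of pages."""
    ratios = []
    for old_text, new_text in zip(old_pages, new_pages):
        matcher = difflib.SequenceMatcher(
            a=old_text.split(), b=new_text.split(), autojunk=False
        )
        ratios.append(matcher.ratio())
    return ratios

def pages_to_review(old_pages, new_pages, threshold=0.9):
    """List 1-based page numbers whose similarity is below the (assumed) threshold."""
    ratios = page_similarity(old_pages, new_pages)
    return [i + 1 for i, ratio in enumerate(ratios) if ratio < threshold]

old_pages = ["the tenant shall pay rent monthly", "this lease runs for five years"]
new_pages = ["the tenant shall pay rent monthly", "th1s 1ease rnns f0r f1ve yeors"]
print(pages_to_review(old_pages, new_pages))
```

A page that scores very low is more likely a bad scan or failed OCR than a genuinely rewritten page, so it is probably worth rescanning that page before comparing again.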

 

There are actually a few articles (that you can google) discussing in great detail how (1) the scan resolution and (2) the font style of a PDF document affect the comparison result.  It would be nice if Adobe could include this information in the forums / FAQ.

 

Hope this helps, and it would be great if you could share a better method!

Adobe Employee
August 19, 2020

Hi,

Could you please try comparing the files using the option shown in the screenshot below?

Let us know if you're facing the same issue.

Regards

Adobe Acrobat DC Team

Participating Frequently
August 19, 2020

Hi aakash4,

 

I have tried your settings, and it still doesn't work.  I cannot share the whole file (as it contains confidential information).  But I can share one page of the New File and Old File:-

 

1. Old File

 

2. New File 

 

Thank you.

Adobe Employee
August 20, 2020

Hi,

 

Have you tried comparing the files using 'Compare Text only' mode?

 

Regards

Adobe Acrobat DC Team

Legend
August 19, 2020

Good job spotting the scammer. The message "Selected document is a scanned PDF and contains no text." seems clear. There IS no text: OCR was not done on AT LEAST ONE of the documents. If you think the message is wrong, please check by trying to select and copy text in BOTH files. OCR must be done, well and correctly, if you want to compare text.

Participating Frequently
August 19, 2020

Thank you for your reply.

 

I am able to select and copy text in both files.  I believe the problem is with the New / Second File (which was scanned).  But I believe that the scan was done properly.  I set OCR to precision mode, and it took 25 minutes for the scan / OCR to complete.

 

I have even tried running (1) enhance scan and (2) text recognition on both files, and then ran the comparison again.  But I still have the same problem: the comparison couldn't detect any text.

Bernd Alheit
Community Expert
August 19, 2020

In Acrobat try the OCR option "Editable Text and Images" 

Bernd Alheit
Community Expert
August 18, 2020

What OCR settings did you use?

Participating Frequently
August 18, 2020

I can't remember.  I can check tomorrow to confirm.  But I had two options:  (1) OCR fast, and (2) OCR precision.  I chose OCR precision, and it took about 25 minutes for the scan to complete.  The document had 50 pages.

Participating Frequently
August 18, 2020

Apparently, I was approached by a scammer after making the above post, asking me to contact the e-mail address below for Adobe customer support:-

 

Adobecare.Experts@protonmail.com