Participant
September 7, 2018
Question

OCR distorting graphs


Is there any way to OCR a selection on a page [as opposed to the whole page]? OCR is distorting some graphs and figures that it sees as text.


2 replies

Legend
September 10, 2018

I suggest you use OCR with hidden text.

gary_sc
Community Expert
September 7, 2018

Hi Cherylm,

Off the top of my head, no. But I've never seen a situation like you describe either.

But please keep in mind that Acrobat itself cannot scan a page; it must rely on your scanner's software, or on some other software on your computer, to do the scanning. After an image has been created, Acrobat then runs OCR on the page to turn it into searchable text.

For text that Acrobat cannot process through OCR, it will leave that material as an image.

So to work out what is causing what, I suggest bypassing Acrobat for a moment: locate your scanner's software and try to scan just the image itself. If that image is distorted, then the problem is not Acrobat. If it's just fine, then we can start to look at Acrobat.

And, to save us an extra trip to these forums, what is your OS (and version) and which version of Acrobat are you using?

Thanks,

Participant
September 7, 2018

Gary,

Thank you so much for your reply.  To answer your questions, Windows 10 and Acrobat DC.

If this helps explain the problem a bit more, here is my workaround, which I may have to continue to do, but I thought I would at least check here to see if there is a way I can OCR just selected areas of a page.

I can scan the page and do a Save As to generate a second copy of the scan. I OCR one of the copies, then delete the [many] boxes Acrobat has generated to represent the [now distorted] graph. I go back to the copy that still remains an image, crop the page down to just the graph, and export it to PNG format. I go back into the OCR'd version and insert the PNG image where I removed the OCR'd graph/figure/chart. Time consuming to say the least, but again, it may be my only option. This doesn't happen with every graph, but every graph now has to be checked to see if it was distorted, because the distortion completely changes what the data should show.
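If the crop step ever gets done outside Acrobat, the arithmetic behind "crop the page down to just the graph" can be sketched as below. This is only an illustration, not something Acrobat provides: the function name `pdf_rect_to_pixel_box`, the example rectangle, and the 300 dpi scan resolution are all assumptions. It converts a rectangle measured in PDF points (1/72 inch, origin at the bottom-left of the page) into a pixel box on the scanned image (origin at the top-left).

```python
from typing import NamedTuple, Tuple

POINTS_PER_INCH = 72  # a PDF point is 1/72 inch


class Box(NamedTuple):
    left: int
    top: int
    right: int
    bottom: int


def pdf_rect_to_pixel_box(
    rect_pts: Tuple[float, float, float, float],
    page_height_pts: float,
    dpi: float,
) -> Box:
    """Convert (x0, y0, x1, y1) in PDF points to a pixel crop box.

    PDF coordinates grow upward from the bottom-left corner, while image
    pixels grow downward from the top-left, so the y values are flipped.
    """
    x0, y0, x1, y1 = rect_pts
    scale = dpi / POINTS_PER_INCH  # pixels per point
    left = round(min(x0, x1) * scale)
    right = round(max(x0, x1) * scale)
    top = round((page_height_pts - max(y0, y1)) * scale)
    bottom = round((page_height_pts - min(y0, y1)) * scale)
    return Box(left, top, right, bottom)


# Hypothetical example: a graph on the upper part of a US Letter page
# (612 x 792 points) that was scanned at 300 dpi.
graph_box = pdf_rect_to_pixel_box((72, 396, 540, 720), page_height_pts=792, dpi=300)
print(graph_box)
```

The result is in (left, top, right, bottom) order, which is the order an image library such as Pillow expects for its `Image.crop` box, so the same numbers could be reused to cut the graph out of the untouched scan.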

Thank you again for all your help!

gary_sc
Community Expert
September 7, 2018

Hi Cherylm,

There are two ways to solve a computer problem/issue: (1) do what you have to do to get the job done, or (2) figure out the problem, fix what needs to be fixed, and then get the job done.

Obviously you've found a #1 method to get the job done, but at this point I am curious as to what's going on there.

Is there any chance you could post a scan of a page in question, and also one of the evil-OCR samples of that page?

At a minimum, if I run the original scan to create the OCR version and do not get the problem, it means there is something wrong with your copy of Acrobat or with the way that Acrobat is running on your system.

In an ideal world, one should not have to do what you are doing. But since we are working with computers, all bets are off. ;>)