OCR distorting graphs

Community Beginner, Sep 07, 2018

Is there any way to OCR a selection on a page (as opposed to the whole page)? OCR is distorting some graphs and figures that it treats as text.

TOPICS
Scan documents and OCR

Community Expert, Sep 07, 2018

Hi Cherylm,

Off the top of my head, no. But I've never seen a situation like you describe either.

Keep in mind, though, that Acrobat itself cannot scan a page. It relies on your scanner's software, or other software on your computer, to do the scanning. Once an image has been created, Acrobat runs OCR on the page to turn it into searchable text.

For material that Acrobat cannot process through OCR, it will leave that material as an image.

So, to work out what is causing what, I suggest bypassing Acrobat for a moment: locate your scanner's software and scan the page as an image only. If that image is distorted, then the problem is not Acrobat. If it's just fine, then we can start to look at Acrobat.

And, to save us an extra trip to these forums, what is your OS (and version) and which version of Acrobat are you using?

Thanks,

Community Beginner, Sep 07, 2018

Gary,

Thank you so much for your reply.  To answer your questions, Windows 10 and Acrobat DC.

If this helps explain the problem a bit more, here is my workaround, which I may have to continue to do, but I thought I would at least check here to see if there is a way I can OCR just selected areas of a page.

I scan the page and do a Save As to generate a second copy of the scan. I OCR one of the copies, then delete the [many] boxes Acrobat has generated to represent the [now distorted] graph. I go back to the copy that still remains an image, crop the page down to just the graph, and export it to PNG format. I go back into the OCR'd version and insert the PNG image where I removed the OCR'd graph/figure/chart. Time consuming to say the least, but again, it may be my only option. This doesn't happen with every graph, but every graph now has to be checked to see if it was distorted, because the distortion completely changes what the data should show.

Thank you again for all your help!

Community Expert, Sep 07, 2018

Hi Cherylm,

There are two ways to solve a computer problem/issue: (1) do what you have to do to get the job done, or (2) figure out the problem, fix what needs to be fixed, and then get the job done.

Obviously you've found a #1 method to get the job done but at this point I am curious as to what's going on there.

Is there any chance you could post a scan of a page in question, along with one of the evil-OCR samples of that page?

At a minimum, if I run OCR on the original scan and do not get the problem, it means there is something wrong with your copy of Acrobat or with the way Acrobat is running on your system.

In an ideal world, one should not have to do what you are doing. But since we are working with computers, all bets are off. ;>)

Community Beginner, Sep 10, 2018

Good Morning Gary,

I'm not entirely sure how to post a PDF here, so I am inserting an example page, converted to JPEG, both pre-OCR and post-OCR. You can see how badly the graph was distorted by running OCR on the page. I'm happy to post a PDF if you tell me how. And thank you again for all your help with this. I'm hoping for a solution of type 2: figure out the problem, fix it, and get the job done. It will certainly save me a lot of work! Cheryl

[Attachments: Page_17_Image.jpg, Page_17_OCR.jpg]

Community Expert, Sep 10, 2018

You can use Send & Track and post the link.

Community Beginner, Sep 10, 2018

Thank you Bernd, I'd never used that feature before.  If I did it correctly, which is most definitely an if, the first file should be the page image before OCR.  The second is the page after OCR.

Thank you again!

Cheryl

Shared Files - Acrobat.com

Shared Files - Acrobat.com

Community Expert, Sep 10, 2018

Hi Cheryl,

Well good news, bad news: When I ran your PDF I did not get the problem you clearly showed. As such I cannot verify what the problem may be.

One thing you can try is to check your settings:

With the Enhance Scans tool selected, click Enhance and then open Settings (to the right of Recognize Text).

2018-09-10_08-59-58.png

From here I'd set the parameters as follows:

Go down to the bottom and click Edit for Text Recognition. Then, in the drop-down, select "Editable Text And Images."

See if that works.

Below this text is a screenshot showing that I've got searchable text and your graphic is not smudged.

Acrobat Pro DCSc-002.png

2018-09-10_08-27-38.png

Community Beginner, Sep 10, 2018

Hi Gary,

Thank you for this. I tried it and it didn't work for me, but the process did lead me to what I believe is the resolution. I realized that Acrobat began running text recognition as soon as I opened the Edit PDF tool, and it was at this stage that the distortion was occurring. When I told the program to revert to image, and then ran text recognition using the Enhance Scans tool after I finished with my cropping, I got a clean OCR with no distortion. Interestingly enough, it only worked with the Recognize Text option; if I chose the Edit Text and Pictures setting, the graphs would distort again.

A big thank you to everyone who responded. While I may have gotten my answer a bit circuitously, I have no doubt that without your help I would still be pasting in images. Your collective expertise (and promptness!) is greatly appreciated.

Best, Cheryl

LEGEND, Sep 10, 2018

I suggest you use OCR with hidden text.
