
Why so many OCR errors

Explorer, Oct 31, 2016

Even in a very clear image of English text in a sans-serif font such as Helvetica, OCR produces numerous artifacts and recognition errors. And many of those don't show up as 'suspects' and may not be visible in Edit mode — only when exported as text. Results are way below what I could get from separate OCR 10 years ago.

How can I get more usable text recognition?

TOPICS
Acrobat SDK and JavaScript

Views

981

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more

Adobe Employee, Nov 02, 2016

We apologize for the issue you are facing. Could you please share the following information so we can identify and resolve it as soon as possible:

- Acrobat version you are using
- Operating system
- OCR method
- One sample PDF file that shows the issue (you can use https://cloud.acrobat.com/send to share it)

Thanks.

Explorer, Nov 02, 2016

Uploaded a couple of samples to https://files.acrobat.com/a/preview/e18a6014-1494-48b5-8322-366d0571c5b5

Explorer, Nov 02, 2016

Acrobat Pro DC
Architecture: x86_64
Build: 15.20.20039.203716
AGM: 4.30.66
CoolType: 5.14.5
JP2K: 1.2.2.37137

Currently running Mac OS Sierra 10.12.2 Beta (16C32e); problem first noticed on Sierra 10.12.1

My original method was just Enhance Scan/Recognize Text, but it was difficult to capture the recognized text. I then tried simply opening the PDF and using File/Export to RTF, which gives better results.

I discarded the earlier problem files but will post a couple of less-extreme examples.
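
A rough way to check whether one export route really gives better text than another is to type out a short passage by hand and score each exported text against it. The sketch below uses only Python's standard library. `char_error_rate` is a hypothetical helper, not part of Acrobat, and the sample strings are made up for illustration. The score is an approximation based on `difflib` matching, not a true edit distance.

```python
from difflib import SequenceMatcher


def char_error_rate(reference: str, ocr_text: str) -> float:
    """Return the share of reference characters with no match in ocr_text."""
    if not reference:
        # Nothing to compare against: perfect only if OCR output is empty too.
        return 0.0 if not ocr_text else 1.0
    matcher = SequenceMatcher(None, reference, ocr_text, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return 1.0 - matched / len(reference)


# Made-up sample data: a hand-typed reference and two exported versions.
reference = "Results are way below what I expected."
export_a = "Resu1ts are way be1ow what I expccted."
export_b = "Results are way below what I expected."

for name, text in [("export_a", export_a), ("export_b", export_b)]:
    rate = char_error_rate(reference, text)
    print(f"{name}: {rate:.1%} of reference characters unmatched")
```

A lower score means the exported text is closer to the reference. Because errors that are not flagged as suspects still lower the match count, this also catches problems that only show up in the exported text.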

Adobe Employee, Nov 04, 2016

Thanks for sharing the files. If I am not mistaken, the issue you are describing is two words overlapping after text recognition.

Enhance Scan/Recognize Text > Correct Recognized Text > Review Recognized Text.

[Attachment: Overlapped text.png]

Please try the 'Editable text & Images' option once. Also, please specify the settings you are using.

Thanks.

Explorer, Nov 04, 2016

Sometimes. It also involves overlapping characters within a word and misrecognized characters.

When I open 'Overlapped text.png' and try to review the text, it says there are no suspects. Which settings do you want to know?

Adobe Employee, Nov 08, 2016 (Correct answer)

If you run OCR using 'Editable text & Images', it won't show any suspects. Instead, you can open the Edit PDF tool and change any word.

Otherwise, after running OCR, select the 'Review recognized text' checkbox (Enhance Scan/Recognize Text > Correct Recognized Text > Review Recognized Text). You can then mark any word as a suspect by double-clicking it.

Thanks.

Explorer, Nov 08, 2016

Thank you — this is very helpful!

New Here, Jul 02, 2024

Hello, Community! Recently, I started having difficulties with the Read Out Loud function after processing my PDF file with OCR. I am still using Acrobat Pro v. 9. For a given block of recognized text, I hear a line read correctly, and then the same words repeated individually at a much slower speed, with occasional gibberish thrown in for good measure!

What in the world has happened to the OCR process in Acrobat Pro v. 9? Curious minds would like to know!

New Here, Jun 30, 2024

image *of* English text? what is that supposed to mean? idk about you, but I'm just trying to convert an image of this weird duck lookin thing to text, it doesn't have any text because why would it have text?
