Why so many OCR errors

Question

Even in a very clear image of English text in a sans-serif font such as Helvetica, OCR produces numerous artifacts and recognition errors. And many of those don't show up as 'suspects' and may not be visible in Edit mode — only when exported as text. Results are way below what I could get from separate OCR 10 years ago.

How can I get more usable text recognition?

Lovekesh Garg · Accepted Answer

Sometimes. It also involves overlapping characters within a word and misrecognized characters.

When I run OCR on 'overlapped text.png' and then review the text, it says there are no suspects. What settings do you want to know?

If you run OCR with the 'Editable Text & Images' output, it won't show any suspects. You can go to the Edit PDF tool and change any word.

Otherwise, after running OCR, select the 'Review recognized text' checkbox. You can then mark any word as a suspect by double-clicking it. The path is Enhance Scans > Recognize Text > Correct Recognized Text > Review recognized text.

Thanks.

IROP · Answer

image *of* English text? what is that supposed to mean? idk about you, but I'm just trying to convert an image of this weird duck lookin thing to text, it doesn't have any text because why would it have text?
