I've scanned a lengthy document that is mostly English but contains a lot of Czech names. I'm amazed at how accurate the OCR is. Mostly there are only missing letters (particularly 'N' and 's'), but there are also cases where it inserted a space or the OCR failed, and I want to retype a word here and there.
However, sometimes when I type 's' it immediately changes to 'š'. Likewise, typing 'e' gives 'é', and 'a' gives 'á'. Copy and paste has the same issue.
I'm able to insert missing diacritics in Czech words and names, but I cannot seem to remove the ones that Acrobat has 'decided' belong there. I was able to fix one of these somehow (maybe by saving the document and restarting?), but it keeps happening randomly.
I've got a lot more editing to do, so for now I am just noting where this happens so I can come back and try to fix them later.
Just wondering if anyone has seen this or has suggestions.
It looks like the best way to deal with this might be to re-OCR the problem pages, fix the mistakes, and replace the pages that are giving me editing problems. At least on my first attempt, there was a different set of errors to correct (it did a better job overall with the much smaller task), and I didn't have any trouble correcting them.
But I'm still having random issues even with smaller documents: it adds diacritics and changes the style, and I have not yet discovered a way to avoid this.
Could you post a sample file? I know that Acrobat doesn't handle multi-language documents well. At least, OCR wants you to settle on one language, probably because of the dictionary used to correct the OCR. But I have never seen your issue. (I work with mixed-language documents in French, German, and English, sometimes with a fourth language.)