Unwanted diacritics when editing text

Community Beginner, Mar 23, 2024

I've scanned a lengthy document that is mostly English but contains a lot of Czech names. I'm amazed how accurate the OCR is. Mostly there are only missing letters (particularly 'N' and 's'), but there are also cases where it inserted a space or the OCR failed, and I want to retype a word here and there.

However, sometimes when I type 's' it immediately changes to 'š'. Likewise, typing 'e' produces 'é', and 'a' becomes 'á'. Copy and paste has the same issue.

I'm able to insert missing diacritics in Czech words and names, but I cannot seem to remove the ones that Acrobat has 'decided' belong there. I was able to fix one of these somehow (maybe by saving the document and restarting?), but it keeps happening randomly.

I've got a lot more editing to do, so for now I am just noting where this happens so I can come back and try to fix them later.
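For keeping track of these spots outside Acrobat, here is a rough sketch rather than a tested workflow. It assumes the document text has been exported to plain text first, and it lists every word containing an accented letter that is not on a list of names expected to have one. The function names and the `expected` allowlist are hypothetical, made up for illustration.

```python
import unicodedata


def has_diacritic(word: str) -> bool:
    # NFD splits an accented letter into base letter + combining mark;
    # a nonzero combining class means a diacritic is attached.
    decomposed = unicodedata.normalize("NFD", word)
    return any(unicodedata.combining(ch) for ch in decomposed)


def flag_accented_words(text: str, expected=frozenset()):
    """Return (line_number, word) pairs for accented words not in `expected`."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for raw in line.split():
            # Drop surrounding punctuation so "houše." matches "houše".
            word = raw.strip(".,;:!?()[]\"'")
            if word and word not in expected and has_diacritic(word):
                hits.append((line_no, word))
    return hits


# Hypothetical sample: one legitimate Czech name, two unwanted diacritics.
sample = "Mr. Novák visited the houše.\nThe némes were listed."
for line_no, word in flag_accented_words(sample, expected={"Novák"}):
    print(line_no, word)
```

This only finds candidates to review. It does not change anything in the PDF, and the corrections themselves would still have to be made in Acrobat.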

Just wondering if anyone has seen this or has suggestions.

TOPICS
PDF, Scan documents and OCR

Community Beginner, Mar 23, 2024

It looks like the best way to deal with this might be to re-OCR the problem pages, fix the mistakes, and replace the pages that are giving me editing problems. At least in my first go at this, there was a different set of errors to correct (it did a better job overall with the much smaller task), and I didn't have any trouble correcting them.

Community Beginner, Mar 23, 2024

But I'm still having random issues even with smaller documents: it adds diacritics and changes the style, and I have not yet discovered a way to avoid this.

Community Expert, Mar 24, 2024

Could you post a sample file? I know that Acrobat doesn't handle multi-language documents well. At least, OCR wants you to settle on a single language, probably because of the dictionary used to correct the OCR. But I have never seen your issue. (I work with mixed-language documents in French, German, and English, sometimes with a fourth language.)

ABAMBO | Hard- and Software Engineer | Photographer
