Unwanted diacritics when editing text

Community Beginner, Mar 23, 2024

I've scanned a lengthy document that is mostly English but contains a lot of Czech names. I'm amazed at how accurate the OCR is. Mostly there are only missing letters (particularly 'N' and 's'), but there are also cases where it inserted a space or the OCR failed, and I want to retype a word here and there.

However, sometimes when I type 's' it immediately changes to 'š'. Likewise, typing 'e' becomes 'é', or 'a' becomes 'á'. Copy and paste has the same issue.

I'm able to insert missing diacritics in Czech words and names, but I cannot seem to remove the ones that Acrobat has 'decided' belong there. I was able to fix one of these somehow (maybe by saving the document and restarting?), but it keeps happening randomly.

I've got a lot more editing to do, so for now I am just noting where this happens so I can come back and try to fix them later.
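For the note-taking part, here is a rough sketch of how the suspect words could be listed automatically. It assumes the text has first been exported from the PDF to plain text (that step is not shown), and the function names `flag_diacritics` and `strip_diacritics` are made up for illustration. It only finds words that contain accented letters; it cannot tell a wanted Czech diacritic from an unwanted one, and it does not change anything in Acrobat itself.

```python
import unicodedata
from typing import List, Tuple


def has_diacritic(word: str) -> bool:
    """Return True if any character in the word carries a combining mark."""
    # NFD splits an accented letter into base letter + combining mark(s).
    decomposed = unicodedata.normalize("NFD", word)
    return any(unicodedata.combining(ch) for ch in decomposed)


def flag_diacritics(text: str) -> List[Tuple[int, str]]:
    """List (line number, word) pairs for every word containing a diacritic."""
    flagged = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for word in line.split():
            if has_diacritic(word):
                flagged.append((line_no, word))
    return flagged


def strip_diacritics(word: str) -> str:
    """Return the word with combining marks removed (for comparison only)."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    # Hypothetical sample text standing in for an exported page.
    sample = "The šmall house\nJan Novák lived there"
    for line_no, word in flag_diacritics(sample):
        print(line_no, word, "->", strip_diacritics(word))
```

The flagged list still has to be checked by hand, since legitimate names such as 'Novák' show up alongside the mistakes.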

Just wondering if anyone has seen this or has suggestions.

TOPICS
PDF, Scan documents and OCR
Community Beginner, Mar 23, 2024

It looks like the best way to deal with this might be to re-OCR the problem pages, fix the mistakes, and replace the pages that are giving me editing problems. At least in my first go at this, there was a different set of errors to correct (it did a better job overall with the much smaller task), and I didn't have any trouble correcting them.

Community Beginner, Mar 23, 2024

But I'm still having random issues even with smaller documents: it adds diacritics and changes the style, and I have not yet discovered a way to avoid this.

Community Expert, Mar 24, 2024

Could you post a sample file? I know that Acrobat doesn't handle multi-language documents well. At least, OCR wants you to commit to one language, probably because of the dictionary used to correct the OCR. But I have never seen your issue. (I work with mixed-language documents in French, German, and English, sometimes with a fourth language.)

ABAMBO | Hard- and Software Engineer | Photographer