Can you teach Acrobat OCR how to recognize certain symbols?

Community Beginner, Jun 29, 2021

Some texts I need to scan include old letters and symbols that Acrobat cannot recognize. Is it possible to teach the OCR software how to interpret certain symbols? It does detect them as separate characters (you can highlight and copy them), but it does not know which letter each one corresponds to in the recognition language.

TOPICS
Scan documents and OCR

Views

108

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Community Expert, Jun 29, 2021

Maybe with the "Find First Suspect" or "Find All Suspects" commands...

Community Beginner, Jun 29, 2021

That is very manual. I was hoping for something more automated. 

Community Expert, Jun 29, 2021

Not sure I understand. What kind of symbols are these, and what would you like them converted to?

For example: let's say the symbols were astrological symbols. Would you want/expect the symbol for Mars to be converted into the word "Mars"?

Can you provide a screenshot of one or some of these symbols?

Community Beginner, Jun 29, 2021

Here are some examples:

ы
ѫ
Ѣ

Those are letters that are no longer part of the Bulgarian alphabet. I'd like to be able to tell Acrobat which Unicode character/code point each one should correspond to during recognition, so I don't have to manually replace the random combination of characters the software puts in its place.

Is there such an option? Acrobat does recognise them visually as separate symbols, but it doesn't know what to do with them.
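
Nothing in this thread points to a built-in Acrobat setting for this. One possible workaround, sketched here under assumptions, is to correct the text after recognition: export the OCR text and apply a substitution table that maps each wrong character sequence to the intended Unicode letter. The keys in `CORRECTIONS` are hypothetical placeholders, not real Acrobat output, and `fix_text` is an illustrative helper name; the table would need to be filled in with whatever the software actually produces for each letter.

```python
import re

# Hypothetical table: misrecognized sequence -> intended Unicode letter.
# The keys are made-up placeholders, NOT real Acrobat output.
CORRECTIONS = {
    "bI": "\u044b",  # CYRILLIC SMALL LETTER YERU
    "}K": "\u046b",  # CYRILLIC SMALL LETTER BIG YUS
    "'B": "\u0462",  # CYRILLIC CAPITAL LETTER YAT
}


def build_pattern(table):
    # Try longer keys first so a long sequence wins over its own prefix.
    keys = sorted(table, key=len, reverse=True)
    return re.compile("|".join(re.escape(k) for k in keys))


def fix_text(text, table=CORRECTIONS):
    # Replace every occurrence of a known wrong sequence in one pass.
    pattern = build_pattern(table)
    return pattern.sub(lambda m: table[m.group(0)], text)
```

A blind replacement like this will also rewrite any place where one of the sequences legitimately occurs, so it is safest with sequences that never appear in correct text.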

 

Community Beginner, Feb 14, 2022

What about things like the suit symbols? I frequently scan documents that have those in them; it would be a big time saver and improve accuracy if I didn't have to manually edit the document to insert those.

New Here, Aug 11, 2022

Did you ever get an answer? I'm stuck on the same problem, but with the Unicode characters ˤ / ġ / ḫ / ḥ / š / ṣ / ṯ / ṭ and ẓ in transcribed Ugaritic. These unrecognized letters are so frequent that even Adobe's OCR is almost worthless; I have to make so many corrections...
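
When putting together a correction list for letters like these, it may help to pin down exactly which code points are intended, since several of them can also be encoded as a base letter plus a combining mark. A minimal Python sketch, assuming the precomposed (NFC) forms are the target; `describe` is an illustrative helper, not part of any OCR tool:

```python
import unicodedata

# The letters from the post, written as escapes and assumed to be
# the precomposed forms.
LETTERS = "\u02e4\u0121\u1e2b\u1e25\u0161\u1e63\u1e6f\u1e6d\u1e93"


def describe(text):
    # Normalize to NFC so "base letter + combining mark" collapses to the
    # precomposed character where one exists, then list each code point.
    text = unicodedata.normalize("NFC", text)
    return [(f"U+{ord(c):04X}", unicodedata.name(c, "<unnamed>")) for c in text]


for letter in LETTERS:
    print(describe(letter))
```

Running `describe` on text copied out of the OCR result shows what was actually stored, which makes it easier to see where it differs from the intended letter.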

 
