I need to scan some texts that include old letters and symbols that Acrobat cannot recognize. Is it possible to teach the OCR software how to interpret certain symbols? It does recognize them as separate letters, and you can highlight and copy them, but it does not know which letters they correspond to in the language you are scanning in.
Maybe with the "Find First Suspect" or "Find All Suspects" commands...
That is very manual. I was hoping for something more automated.
Not sure I understand. What kind of symbols are these, and what would you like them converted to?
For example, let's say the symbols were signs of the zodiac. Would you want/expect the symbol of Mars to be translated into the word Mars?
Can you provide a screenshot of one or some of these symbols?
Here are some examples:
ы ѫ Ѣ
Those are letters that are no longer part of the Bulgarian language. I'd like to be able to tell Acrobat which Unicode character/code each one should correspond to during recognition, so I don't have to manually replace the random weird combination of characters the software inserts in its place.
Is there such an option? Acrobat does recognise them visually as separate symbols, but it doesn't know what to do with them.
What about things like the suit symbols? I frequently scan documents that have those in them; it would be a big time saver and improve accuracy if I didn't have to manually edit the document to insert those.
Did you ever get an answer? I'm stuck on the same problem, but with the Unicode symbols ˤ / ġ / ḫ / ḥ / š / ṣ / ṯ / ṭ and ẓ in transcribed Ugaritic. These unrecognized letters are so frequent that even Adobe's OCR is almost worthless; I have to make so many corrections...
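Not an answer to the training question, but one possible workaround, if the software substitutes the same wrong characters consistently for each symbol, is to export the recognized text and fix it in a single scripted pass instead of by hand. Below is a minimal Python sketch of that idea. The function name `fix_ocr_text` and the keys in `SUBSTITUTIONS` are made up for illustration; you would have to fill in whatever sequences your own scans actually produce. It only corrects exported text and does not change the text layer inside the PDF.

```python
import re
import unicodedata

# Hypothetical table: each key is a wrong sequence the OCR produced
# (placeholders, not real Acrobat output); each value is the intended character.
SUBSTITUTIONS = {
    "bI": "\u044b",  # ы
    "X<": "\u046b",  # ѫ
    "'B": "\u0462",  # Ѣ
}

def fix_ocr_text(text, substitutions):
    """Replace every known wrong sequence with its intended character."""
    if not substitutions:
        return text
    # Try longer keys first so a short key cannot split a longer one.
    keys = sorted(substitutions, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    # Single pass: text produced by a replacement is never rescanned.
    fixed = pattern.sub(lambda m: substitutions[m.group(0)], text)
    # Normalize so letters with combining marks (e.g. ḫ, ṣ) compare consistently.
    return unicodedata.normalize("NFC", fixed)

if __name__ == "__main__":
    print(fix_ocr_text("cbIn", SUBSTITUTIONS))
```

The obvious risk is that a wrong sequence also occurs legitimately somewhere in the text, so it is worth checking the result on a page or two before running it on everything.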