Some texts that I need to scan include old letters and symbols which Acrobat cannot recognize. Is it possible to teach the OCR software how to interpret certain symbols? It does recognize them as separate letters, and you can highlight and copy them, but it does not know which letter each one corresponds to in the language you are scanning.
Maybe with the "Find First Suspect" or "Find All Suspects" commands...
That is very manual. I was hoping for something more automated.
Not sure I understand. What kind of symbols are these, and what would you like them converted to?
For example: let's say the symbols were signs of the zodiac. Would you want/expect the symbol of Mars to be translated into the word Mars?
Can you provide a screenshot of one or some of these symbols?
Here are some examples:
ы
ѫ
Ѣ
Those are letters that are no longer part of the Bulgarian language. I'd like to be able to tell Acrobat which Unicode character/code each one should correspond to during recognition, so I don't have to manually replace the random, weird combinations of characters the software inputs in their place.
Is there such an option? Acrobat does recognise them visually as separate symbols, but it doesn't know what to do with them.
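Failing a built-in option, one possible workaround is to post-process the exported text outside Acrobat. The sketch below is only an illustration under assumptions: the garbled sequences used as keys (`b|`, `X<`, `'B`) are hypothetical placeholders, not what Acrobat actually produces, and the names `build_fixer`, `OCR_FIXES` and `fix_ocr_text` are made up for this example. You would fill the table with whatever character combinations show up in your own output.

```python
import re

def build_fixer(mapping):
    """Return a function that replaces every key of `mapping` with its value."""
    if not mapping:
        # Nothing to replace: return the text unchanged.
        return lambda text: text
    # Try longer keys first so a long garbled sequence wins over a shorter prefix.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in keys))
    return lambda text: pattern.sub(lambda m: mapping[m.group(0)], text)

# ASSUMPTION: the keys are hypothetical examples of garbled OCR output.
# Replace them with the sequences that actually appear in your exported text.
OCR_FIXES = {
    "b|": "\u044b",  # ы  CYRILLIC SMALL LETTER YERU
    "X<": "\u046b",  # ѫ  CYRILLIC SMALL LETTER BIG YUS
    "'B": "\u0462",  # Ѣ  CYRILLIC CAPITAL LETTER YAT
}

fix_ocr_text = build_fixer(OCR_FIXES)
```

Calling `fix_ocr_text(exported_text)` would then apply every substitution in one pass. Note that this is a blind search-and-replace: if a garbled sequence can also occur in correctly recognised text, it will be replaced there too, so the table needs to be chosen with care.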
What about things like the suit symbols? I frequently scan documents that have those in them; it would be a big time saver and improve accuracy if I didn't have to manually edit the document to insert those.
Did you ever get an answer? I'm stuck on the same problem, but with Unicode symbols for ˤ / ġ / ḫ / ḥ / š / ṣ / ṯ / Ṡ and ẓ in transcribed Ugaritic. The frequency of these unrecognized letters makes even Adobe's OCR almost worthless; I have to issue so many corrections...
More commonly, what about scientific and mathematical symbols? Can you train or set Adobe to OCR these characters properly and automatically?