Obtain a text file without using a language
I want to obtain a text file from the attached file using OCR, without selecting a specific language.
Convert to PDF (using Acrobat, not Reader), run Text Recognition, save as a text file.
The recognition does not identify spaces correctly, as shown in the attached Word file. The spaces appear to be geometrically identical, but the Acrobat OCR does not use that when producing the text file!
The puzzle you are trying to convert to text comes from copyrighted material.
You need to get permission from the magazine to copy their puzzle. We cannot assist.
I understand. I am a subscriber to this magazine, so I believe that I can use their content to solve their cryptogram using a program I developed. I can manually type the text shown in the uploaded .pdf file to a word processing program, but I want to avoid this by having a stupid OCR program (one that doesn't use a language) do it for me.
This source material is very challenging for OCR because the letters are very widely spaced. I don't think you can prevent the gaps between letters from being recognised as separate space characters, especially since there are almost no known words in this text to be recognised.
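To illustrate the geometric idea raised above, here is a minimal sketch of how spaces could be placed from character positions alone, with no dictionary. It is not how Acrobat or any particular OCR engine works. It assumes you already have one bounding box per recognised character on a line, and that the text is set at a fixed pitch (every letter sits in an equally wide cell). The function name `place_spaces` and the `(char, x_left, x_right)` tuple layout are hypothetical, chosen only for this example.

```python
from statistics import median


def place_spaces(boxes):
    """Rebuild one line of text from character boxes using geometry only.

    boxes: list of (char, x_left, x_right) tuples, sorted left to right.
    Assumes fixed-pitch text: the typical distance between neighbouring
    character centres is one cell, and a word gap skips whole cells.
    """
    if len(boxes) < 2:
        return "".join(ch for ch, _, _ in boxes)

    # Centre of each character box.
    centres = [(left + right) / 2 for _, left, right in boxes]
    # Distance from each character to the next one.
    steps = [b - a for a, b in zip(centres, centres[1:])]
    # The most common step is taken to be the letter pitch.
    pitch = median(steps)
    if pitch <= 0:
        # Degenerate input; fall back to no spaces at all.
        return "".join(ch for ch, _, _ in boxes)

    out = [boxes[0][0]]
    for step, (ch, _, _) in zip(steps, boxes[1:]):
        # A step of about two pitches means one empty cell, and so on.
        empty_cells = max(0, round(step / pitch) - 1)
        out.append(" " * empty_cells)
        out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    line = [("A", 0, 4), ("B", 10, 14), ("C", 20, 24), ("D", 40, 44), ("E", 50, 54)]
    print(place_spaces(line))
```

Because the pitch is estimated from the line itself, widely spaced letters are not a problem as long as word gaps are a whole number of cells. Proportionally spaced text would need a different rule.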
OCR needs a language because it has to correct errors: having divided the text into words, it checks them against a dictionary. OCR of separate letters rarely works.
I was using VMware Fusion to access Windows XP. XP had the app TextBridge Pro Millennium V9.5. I used this app successfully to read the cryptogram pages and produce a text file for Word that had very few OCR errors. One of TextBridge's options was to have it "learn" a language from its scan. I didn't choose a language, so it just gave me the encrypted text with accurately determined spaces.
I was using an Epson scanner with this process.
I recently upgraded my macOS from 10.14 to 10.15 to utilize 64-bit code. I cannot get the XP program to recognize the Epson scanner now.
Nor can I find a Macintosh OCR program that is as functional as TextBridge Pro.

