What is the difference between 'Recognize text' and Export (to Word)? I want to convert a TIFF file of a newspaper article in Hungarian into text that I can run through a translator.
If I open the TIFF in Acrobat, Enhance text and 'Recognize text', all I get is the same view. What do I do then?
If I export to Word, all I get is an image of the text in a Word file.
What is the appropriate method please?
The "Recognize text" function adds text to your PDF document that you can search, select, copy and sometimes even edit. It does not create any other document or document format along the way, whereas export to Word (or Excel or any of the other options that Acrobat provides) takes your PDF document and converts it, as best as possible, to another file format.
In your case, it depends on what input format your translation application accepts. If you can copy and paste into the application, then it is sufficient to recognize text, select e.g. the contents of one page in Acrobat, copy it, and paste it into your application. If you can import a Word document or a text file, then exporting would probably be a better option. Keep in mind that, depending on the version of Acrobat you are running, you may still have to recognize text first before you export. Otherwise you may end up with a Word document that only contains your scanned images. Acrobat DC lets you choose to run OCR at export time when necessary.
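For what it's worth, the decision described above can be written down as a small Python sketch. This is only an illustration of the reasoning, not anything Acrobat provides: the function name, its parameters and the step labels are all hypothetical, and it assumes that file import is preferred whenever the translator supports it.

```python
from typing import List


def choose_workflow(translator_accepts_paste: bool,
                    translator_imports_files: bool,
                    acrobat_runs_ocr_on_export: bool) -> List[str]:
    """Return the ordered steps suggested in the reply above.

    Hypothetical helper for illustration only; this is not an Acrobat API.
    """
    if translator_imports_files:
        steps = []
        # Older Acrobat versions may export only the scanned image
        # unless text recognition has been run beforehand.
        if not acrobat_runs_ocr_on_export:
            steps.append("recognize text")
        steps.append("export to Word or text")
        return steps
    if translator_accepts_paste:
        return ["recognize text",
                "select and copy page text",
                "paste into translator"]
    # Neither input route is available, so there is nothing to suggest.
    return []
```

For example, `choose_workflow(False, True, False)` puts "recognize text" before the export step, which is the case where skipping OCR would leave you with a Word document containing only the scanned image.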
Thanks Karl. It is easy for me to print the TIFF file, scan it with OCR and work from that. I have done this for a long time.
However, if I have read the claims for it correctly, I should be able to open a high quality TIFF file of a newspaper article in Acrobat and then produce text that I can do more or less what I like with. Particularly important here would be taking advantage of the high quality of the information in the TIFF to get fewer errors when converting to text for translation.
Anyway, carefully following Adobe's instructions ("Turn paper documents into searchable PDFs"), I cannot get my TIFF file, when opened in Acrobat DC (it actually calls itself Acrobat Pro DC), to do anything other than be a PDF file with an image of text. After enhancing the scan (or photograph?) and recognizing text, I don't have what I would call editable text at all. And when I export it to MS Word, all I get is a picture inserted into a Word document. Indeed, when I try to 'find', I get the alert 'This page contains only an image. Would you like to run text recognition ....?' Answering 'yes', I then get 'Adobe Acrobat has finished searching the document. No matches were found', which is nonsense because the word A is right there.
So I am not getting anywhere and if you have any ideas they would be welcome. I tried several versions of Adobe support but they have been no use.
PS I also sometimes get a message like "Some text was not recognized"
It works for us on a Mac. Same file on a PC - no way!
Please try the following steps: Enhance Scan > Recognize text > In this file > Recognize text
Before running text recognition you can change settings such as the language and the OCR type ("Searchable Image" or "Editable Text & Images").
If you still face any issues, please share some sample files and the Acrobat version you are using.
Thanks.