OCR does not recognize language correctly
We downloaded a trial version of Acrobat DC to see if we could use it to convert docs to PDF/A for records management and archiving purposes.
Some of the documents are just scans and need OCR first. As we want to add metadata based on content, we have to open and process each document individually. Acrobat DC immediately starts OCR conversion without asking, on the assumption that:
- we want the document to be OCR'd (correct)
- it can identify the language of the document by itself.
Well, that second assumption is wrong: all the documents we have tested are identified as being written in Dutch, whereas some are actually in French and some even in English (incredible but true). So for every document we have to wait for the first OCR pass to complete and then rerun it with corrected language settings, which is extremely time-consuming.
Is there a way to prevent OCR conversion from starting automatically, so that it runs only after we have specified the language of the document ourselves?
