Can anyone explain why the OCR would capture the word "Park," and the comma, but ignore every other word on the page? I find this all the more perplexing in that it read the barely legible reception number on the next page (it didn't read it right, but I can still go edit it to be correct) but failed to capture any other words. Sometimes this is very frustrating in that the word I am searching for may literally be right next to a word that got OCR'd with no problem. Forgive me if this topic has already been tackled. It just seems that if resolution or DPI were the issue, I wouldn't get much of anything. Is there a way to manually add OCR where Adobe misses it? Also, why would it get one "Park," but not all of them? They are practically identical. Help!
Hi Jesse,
Acrobat should have recognized the other words as well. The red boxes mark words where Acrobat is not sure whether it recognized the word correctly, so those words are flagged as suspects. You can correct a suspect manually as follows:
1. After running OCR, go to Enhance Scans > Recognize Text > "Correct Recognized Text".
2. Select any red box. The word then appears in the toolbar, showing the original "Image" and the "Recognized as" text.
3. If Acrobat recognized the word incorrectly, enter the correct text and click Accept.
You can also see everything Acrobat recognized by selecting the "Review Recognized text" checkbox.
Here is a sample image where "be offered" is recognized as "b~offered". We can manually change it and accept.
The "Review Recognized text" option is in the top-left corner of the toolbar and shows all recognized text.
I hope this resolves your issue. Please let us know if you still face any problems.
Thanks.
Hi,
That doesn't really answer my question. In my job, some days I just correct the words that were not recognized correctly by Adobe. That tends to be the majority due to the age of the documents I am working with. So I have literally done exactly what you have given me instructions for, tens of thousands of times. Maybe hundreds of thousands, I don't keep track. What I would like to be able to do is, at the very least, add recognition to words on the page that aren't even red-boxed as suspects. Words that were missed or ignored entirely. Can anyone help with that? What I really want to know is why it doesn't red-box every word on the page. Why does it get one "park" but not the other six? Most importantly, can I add words that were missed, or is this as good as it gets?
Sometimes Acrobat cannot recognize text in certain fonts, on dark backgrounds, in very large type, or in low-DPI images.
Could you please share a sample document where you see this issue?
You can use https://cloud.acrobat.com/send to share the file.
Thanks.
I shared one in my original post, though it is only a portion of the document because the rest contains private, confidential information. I cannot share confidential information in this forum. Also, this isn't just one document. I have scanned close to 60,000 documents and this problem manifests in all of them. The goal is to be able to let the software search them without looking at every document. If I want to search for a property on lot 1 block 3 of whatever subdivision, I want to be able to type "lot 1" and have all documents that contain "lot 1" be found. Since all these documents are of the same quality and were scanned on the same machine, it makes no sense that it would capture only 5%-10% of the "lot 1" documents. I tried scanning them at higher resolution and surprisingly had the same or, more often, worse results. I could find some documents that are shareable, scan them at various resolutions, run the same optimization and OCR, and show you that it makes no difference and that there must be some other issue at play, but that would be a huge waste of my time considering that I have already experimented with this. I read every available forum and troubleshooting blog to try to find the answer. Only after exhausting all possibilities did I resort to actually registering and asking the specific question. If the answer is simply that Adobe isn't all that good at this task, fine, I can believe that, planned obsolescence and all. Maybe the question I should be asking is: at what point will they release a version that actually works?
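For anyone wanting to put a number on the "5%-10%" figure, here is a minimal sketch of the search goal described above. It assumes the OCR text of each PDF has already been extracted into plain strings by some other tool; the file names and sample text are made up for illustration, and nothing here is an Acrobat API.

```python
import re


def normalize(text):
    """Lowercase the text and collapse every run of whitespace to one space."""
    return re.sub(r"\s+", " ", text).strip().lower()


def documents_containing(phrase, texts_by_name):
    """Return the sorted names of documents whose text contains the phrase."""
    needle = normalize(phrase)
    return sorted(
        name for name, text in texts_by_name.items()
        if needle in normalize(text)
    )


# Hypothetical extracted text: in deed_002 the OCR caught only one word.
extracted = {
    "deed_001.pdf": "LOT 1\nBLOCK 3 of Park, Subdivision",
    "deed_002.pdf": "Park,",
    "deed_003.pdf": "Lot  1 Block 7",
}

hits = documents_containing("lot 1", extracted)
print(hits)
print(f"{len(hits)} of {len(extracted)} documents matched")
```

Note that a plain substring test like this also matches "lot 10", and it can only find words the OCR actually captured, so pages where the text was missed entirely stay invisible to it.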
What I would like to be able to do is, at the very least, add recognition to words on the page that aren't even red-boxed as suspects. Words that were missed or ignored entirely. Can anyone help with that? What I really want to know is why it doesn't red-box every word on the page. Why does it get one "park" but not the other six? Most importantly, can I add words that were missed, or is this as good as it gets?
I just realized why my question cannot and will not be answered. It's a computer, isn't it? All the answers are regurgitated from a database based on keywords in the question. Tell me I'm wrong.
No, these are not auto-generated messages. We are already looking into this issue, but to reproduce and resolve it we need a sample document.
It would be great if you could share a single-page document (without OCR) where you see this issue. You can use https://cloud.acrobat.com/send to share the file privately, or send the document directly to lgarg@adobe.com
Please share the document (without performing OCR) where you see this issue. It will really help us resolve it.
Thanks.
For the screenshot you shared, we are able to recognize the text correctly.
"Park," is red-boxed only once because Acrobat was not sure whether it had recognized that one occurrence correctly. It did not mark the other occurrences as suspects because it recognized them correctly.
You can check all the text Acrobat recognized by selecting the "Review Recognized text" checkbox.
If any word is recognized incorrectly, just double-click it and it will be marked as a suspect.
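As a rough illustration of why only one "Park," gets a red box, the suspect marking described above can be modeled as a confidence cutoff. This is only a sketch under assumptions: the threshold, the `OcrWord` type, and the confidence values are hypothetical and are not Acrobat's actual internals.

```python
from dataclasses import dataclass

# Hypothetical cutoff; Acrobat's real threshold is not stated in this thread.
SUSPECT_THRESHOLD = 0.80


@dataclass
class OcrWord:
    text: str
    confidence: float  # 0.0 (no idea) to 1.0 (certain); sample values are made up


def split_suspects(words, threshold=SUSPECT_THRESHOLD):
    """Split words into (accepted, suspects) by comparing confidence to the cutoff."""
    accepted = [w for w in words if w.confidence >= threshold]
    suspects = [w for w in words if w.confidence < threshold]
    return accepted, suspects


# Two occurrences of the same word can score differently on a scanned page.
page = [
    OcrWord("Park,", 0.62),      # smudged occurrence: flagged (red box)
    OcrWord("Park,", 0.93),      # clean occurrence: accepted, no red box
    OcrWord("Lot", 0.97),
    OcrWord("b~offered", 0.41),  # misread word: flagged for correction
]

accepted, suspects = split_suspects(page)
print([w.text for w in suspects])
```

In this model, a word with no red box was either recognized with high confidence or not recognized at all, which is why the "Review Recognized text" view is the way to tell the two cases apart.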
If you still face any issues, please share the file (before running OCR).
Thanks.