Please help me work out the best way to do this.
After I scan a previously scratched lottery ticket into PDF format, how can I:
1. Have the software recognize the numbers only in a specified area, ideally just a small box?
2. Have the software identify matching numbers?
Thanks,
Blind Lottery Player.
Hello,
By default, Acrobat has no option or feature that can help you with this.
I am moving this thread to the JavaScript forum so the experts there can comment. Hopefully they will have some suggestions for your query.
-Tariq Dar
If you want to do this in Acrobat's JavaScript, be prepared for some serious scripting - and even with that, it may not be possible to solve this problem at all, depending on the exact requirements.
First, after scanning (or as part of the scanning process), you need to run OCR (optical character recognition) so that Acrobat can actually extract (or find) text in the document.
Once you have text that can be extracted or searched for, you can then run a JavaScript that does the following:
- Crop the page to the area in which you expect the number to be.
- Use the JavaScript function Doc.getPageNthWord() (Acrobat DC SDK Documentation) to loop over all "words" in the crop area. This will very likely give you the number you are looking for.
- Reverse the crop from the first step to have the whole page again. This is where it gets tricky, because you actually need to store the difference you applied in the first step, and reverse that exactly.
- Repeat this for any other area you are interested in.
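The word loop in the second step could look roughly like the sketch below. It only covers reading the words and keeping the ones that consist of digits; the crop and un-crop steps are left out. The function name collectNumbers is made up for illustration, and doc stands for an Acrobat Doc object (for example "this" in a document-level script) on a page that has already been through OCR.

```javascript
// Collect every all-digit "word" on one page of an OCR'd document.
// collectNumbers is a hypothetical helper name, not part of the Acrobat API.
function collectNumbers(doc, pageNum) {
    var numbers = [];
    // Number of words Acrobat can see on this page (0-based page index).
    var wordCount = doc.getPageNumWords(pageNum);
    for (var i = 0; i < wordCount; i++) {
        // Third argument true asks Acrobat to strip punctuation and white space.
        var word = doc.getPageNthWord(pageNum, i, true);
        // Keep only tokens made up entirely of digits; keep them as strings
        // so that leading zeros such as "07" are not lost.
        if (/^\d+$/.test(word)) {
            numbers.push(word);
        }
    }
    return numbers;
}
```

In the Acrobat JavaScript console this might be called as collectNumbers(this, 0) for the first page.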
When you look through the archives of this forum (and over at http://acrobatusers.com - e.g. this one for how to reverse the crop box back to the whole page: Reverse Crop With Javascript (JavaScript) ), you should be able to find all the parts you need to create such a solution.
Keep in mind that this all depends on the quality of the scan. If Acrobat cannot recognize text because of low scan quality, then there is nothing you can do. Also, one problem with scanned documents is that there may be too much variation in the location of the text objects you are interested in. There is another API function that gives you the location of a "word" (Doc.getPageNthWordQuads - Acrobat DC SDK Documentation) - you could use that to search for known text on the scanned page to calibrate your text finding algorithm.
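As a rough illustration of using word locations, the sketch below picks out the words whose center falls inside a given rectangle, using Doc.getPageNthWordQuads. The helper names quadsBounds and wordsInRegion, and the region object with left/bottom/right/top fields, are assumptions made for this example. It assumes each quad is an array of eight numbers (four x/y corner pairs) in the same coordinate space as the region.

```javascript
// Bounding box of all quads belonging to one word.
// Each quad is assumed to be [x1, y1, x2, y2, x3, y3, x4, y4].
function quadsBounds(quads) {
    var xs = [], ys = [];
    for (var q = 0; q < quads.length; q++) {
        for (var k = 0; k < quads[q].length; k += 2) {
            xs.push(quads[q][k]);
            ys.push(quads[q][k + 1]);
        }
    }
    return {
        left: Math.min.apply(null, xs),
        right: Math.max.apply(null, xs),
        bottom: Math.min.apply(null, ys),
        top: Math.max.apply(null, ys)
    };
}

// Return the words on a page whose center point lies inside `region`.
// `region` is a hypothetical {left, bottom, right, top} object.
function wordsInRegion(doc, pageNum, region) {
    var found = [];
    var wordCount = doc.getPageNumWords(pageNum);
    for (var i = 0; i < wordCount; i++) {
        var quads = doc.getPageNthWordQuads(pageNum, i);
        if (!quads || quads.length === 0) {
            continue; // no location information for this word
        }
        var b = quadsBounds(quads);
        var cx = (b.left + b.right) / 2;
        var cy = (b.bottom + b.top) / 2;
        if (cx >= region.left && cx <= region.right &&
            cy >= region.bottom && cy <= region.top) {
            found.push(doc.getPageNthWord(pageNum, i, true));
        }
    }
    return found;
}
```

For calibration, the same quadsBounds helper could be applied to a known piece of printed text on the ticket, and the region then expressed as an offset from that word's position rather than as fixed page coordinates.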
As mentioned, this might be possible using a complex script, but since it needs to rely on the results of an OCR process I would think twice before doing it. Such processes often produce imperfect results, and in this case a mistake could lead to a false positive or (even worse) a false negative. For example, if the OCR process recognizes an 8 as a 0 (not an uncommon mistake), it could mean you actually have a winning ticket but would not find out about it...
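One way to soften that risk is to have the matching step report near misses as well as exact matches, so that a doubtful number can be checked by other means. The sketch below is only an illustration: checkTicket, couldBeMisread and the CONFUSABLE_PAIRS list are made-up names, and the list contains just the 8/0 pair mentioned above (a real list would need more pairs, chosen from experience with the actual scans).

```javascript
// Digit pairs that OCR might mix up. Only 8/0 is taken from the discussion;
// extend this hypothetical list as needed.
var CONFUSABLE_PAIRS = [["8", "0"]];

function isConfusable(a, b) {
    for (var i = 0; i < CONFUSABLE_PAIRS.length; i++) {
        var p = CONFUSABLE_PAIRS[i];
        if ((a === p[0] && b === p[1]) || (a === p[1] && b === p[0])) {
            return true;
        }
    }
    return false;
}

// True if `read` differs from `target` only in digits that OCR may confuse.
function couldBeMisread(read, target) {
    if (read.length !== target.length || read === target) {
        return false;
    }
    for (var i = 0; i < read.length; i++) {
        if (read[i] !== target[i] && !isConfusable(read[i], target[i])) {
            return false;
        }
    }
    return true;
}

// Compare the numbers read from the ticket (as strings) with the winning numbers.
function checkTicket(ticketNumbers, winningNumbers) {
    var result = { matches: [], uncertain: [] };
    for (var i = 0; i < ticketNumbers.length; i++) {
        var read = ticketNumbers[i];
        if (winningNumbers.indexOf(read) !== -1) {
            result.matches.push(read);
            continue;
        }
        for (var j = 0; j < winningNumbers.length; j++) {
            if (couldBeMisread(read, winningNumbers[j])) {
                result.uncertain.push({ read: read, possible: winningNumbers[j] });
            }
        }
    }
    return result;
}
```

For example, checkTicket(["18", "23", "40"], ["10", "23", "55"]) reports "23" as a match and flags "18" as possibly being a misread "10". This does not make the OCR any more reliable; it only makes some of the doubtful cases visible.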