I'm considering convincing my company to upgrade to Acrobat Pro so I can automate the processing of my scanned documents. Before I bring it up, I want to make sure the things I want to do are possible. I don't need anyone to give me the code, I just want to know if this is possible.
The documents I'm working with are landscape, 2-5 pages, and have the filename and page numbers in the footer. I want to scan a big stack of them and have a script perform the following actions:
Use OCR to acquire the filename and page numbers for each page. I would like to restrict the OCR to only look at the footer to save time and RAM.
Using the filenames, I want it to detect when one document ends and the next one begins so they can be split into separate files. Alternatively, I could sort my documents by number of pages and scan them in separate stacks, but then I would need an input box to tell the script how many pages each document has.
Before saving the split files, check that the number of pages in the file matches the page total in the footer. (I work in a factory and the documents can get sticky, so my scanner frequently pulls two pages at once.)
Instead of saving the files where the page total doesn't match, compile a list of the errors so I know which documents need to be rescanned.
Finally, save all correct documents with their filenames from the footer to a folder on my desktop.
This could save me hours a week, so I'm hopeful that it's all possible. Thanks
I think you may need to look for specialist high volume OCR software. Acrobat's OCR can't be programmatically directed in this way. Given the host of the forum, this isn't the place for recommendations...
You would first need to run OCR on the files (using Acrobat or something else). If the results are good then the rest can be done either using Acrobat (although it's not a very good batch processing tool) or a stand-alone application. Either way, it will require the development of custom-made code to do it.
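As a rough illustration of what that custom code has to do once the OCR results are available, here is a minimal Python sketch of the split-and-check logic only. It is not Acrobat code. It assumes the footer of every scanned page has already been read into a filename, a page number, and a page total; the `Footer` class and `split_and_check` function are hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Footer:
    filename: str  # document name read from the footer (assumed already OCR'd)
    page: int      # page number printed in the footer
    total: int     # page total printed in the footer


def split_and_check(footers):
    """Group consecutive scanned pages by filename, then validate each group.

    Returns (good, errors): good is a list of (filename, scan_indexes),
    errors is a list of messages for documents that need rescanning.
    """
    groups = []  # list of (filename, [indexes into the scanned stack])
    for idx, footer in enumerate(footers):
        if groups and groups[-1][0] == footer.filename:
            groups[-1][1].append(idx)  # same document continues
        else:
            groups.append((footer.filename, [idx]))  # a new document starts

    good, errors = [], []
    for name, indexes in groups:
        expected = footers[indexes[0]].total
        pages = [footers[i].page for i in indexes]
        # A complete document has pages 1..total, in order, with none missing.
        if pages == list(range(1, expected + 1)):
            good.append((name, indexes))
        else:
            errors.append(f"{name}: expected pages 1-{expected}, got {pages}")
    return good, errors


if __name__ == "__main__":
    scanned = [
        Footer("A.pdf", 1, 2), Footer("A.pdf", 2, 2),
        Footer("B.pdf", 1, 3), Footer("B.pdf", 3, 3),  # page 2 was skipped
    ]
    good, errors = split_and_check(scanned)
    print(good)
    print(errors)
```

The documents in `good` would then be extracted and saved under their footer filenames, and `errors` is the rescan list. A double feed shows up as a gap in the page numbers, which is why the check compares the full sequence and not just the page count.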
Can I call the OCR tool in my code or would I have to do that beforehand?
You have to do that beforehand.
OCR is an option for an Action, but it applies to the whole page. Acquiring text from the OCR results is a different story and requires a custom script. If the items you're looking for have a standard location, or some kind of unique structure, then this could be fairly simple, but probably time-consuming.
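To give an idea of what "standard location" buys you, here is a small Python sketch, again outside of Acrobat. It assumes the OCR step yields each word with its position as a `(text, x, y)` tuple, with `y` measured in points from the bottom edge of the page, and that the footer is a single line that reads like `WI-1042.pdf Page 1 of 3`. Both the data shape and the footer wording are assumptions for illustration, so the band height and the pattern would need adjusting to the real documents.

```python
import re

# Assumed footer wording: "<filename> Page <n> of <total>" (hypothetical format).
FOOTER_RE = re.compile(
    r"^(?P<name>.+?)\s+Page\s+(?P<page>\d+)\s+of\s+(?P<total>\d+)$",
    re.IGNORECASE,
)


def footer_text(words, max_y=50.0):
    """Join the words that sit in the footer band, left to right.

    words: iterable of (text, x, y) tuples; y is the distance from the
    bottom edge of the page. Assumes the footer is a single line.
    """
    band = [w for w in words if w[2] <= max_y]
    band.sort(key=lambda w: w[1])  # order by horizontal position
    return " ".join(w[0] for w in band)


def parse_footer(text):
    """Return (filename, page, total), or None if the text does not match."""
    match = FOOTER_RE.match(text.strip())
    if match is None:
        return None
    return match.group("name"), int(match.group("page")), int(match.group("total"))


if __name__ == "__main__":
    page_words = [
        ("Page", 300, 20), ("WI-1042.pdf", 40, 20), ("1", 340, 20),
        ("of", 360, 20), ("3", 380, 20), ("Torque", 40, 400),
    ]
    print(parse_footer(footer_text(page_words)))
```

A page whose footer fails to parse (a smudged scan, for example) returns `None`, and it would make sense to add it to the rescan list instead of guessing.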
Since it will save hours every week, it is well worth the expense to automate. Send me an email and I'd be happy to help you out. Contact me through www.windjack.com
Thanks, I'll see how far I can get with online tutorials. I'll contact you if I run into problems I can't figure out on my own.