OCR from the same areas on several pages

Question

I am hoping for a solution to the following problem that uses only the Action Wizard. I am fairly new to Acrobat, so I am sure there are many options I have not yet considered.

I am working with a large series of scanned typewritten forms. These forms contain information that should be read semi-automatically. The same information is in the same area on every page, and every page has 6 "areas of interest" that contain it. The rest of the page differs from page to page, so OCRing entire pages would produce varying amounts of noise depending on the page. That is why I want to OCR only those areas of interest and get the output as plain text. (The data is ultimately meant to end up in Excel, so I will try to get the output there as directly as possible via VBA, although reading from exported files in VBA is possible, too.)

I was able to create an action that lets the user crop every page down to one of the aforementioned areas and then runs OCR and outputs the text automatically. However, this forces the user to wait while a single area is processed and then repeat the whole procedure for each remaining area of the form.

Ideally, the user would select all fields on one page, this pattern would be applied to every page, and the OCR data would then be exported for each field separately.

Alternatively, getting the coordinates of a user's selection would also work, as I could use them in VBA to automate the cropping process.

For these two strategies, I haven't found appropriate commands in Acrobat yet.

Does anyone have an idea about what I can do?

Bernd Alheit · Answer

You can create 5 copies of every page, so that each page exists once per area of interest. Then crop each copy to a different area's coordinates. After this, OCR the whole document. Alternatively, redact the unwanted areas.
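
To make the duplicate-and-crop idea concrete, here is a minimal Python sketch of the bookkeeping only: which crop box each duplicated page would receive. It assumes the six areas are known as rectangles in PDF points, given as (left, bottom, right, top). The names `plan_crops` and `CropJob` and all coordinate values are hypothetical placeholders, not measured from any real form, and the sketch does not call Acrobat. Applying the crops and running OCR would still have to happen in Acrobat or another PDF tool.

```python
from dataclasses import dataclass
from typing import List, Tuple

# (left, bottom, right, top) in PDF points; assumed convention for this sketch
Rect = Tuple[float, float, float, float]


@dataclass(frozen=True)
class CropJob:
    source_page: int  # 0-based page index in the original scan
    output_page: int  # 0-based page index after every page has been duplicated
    area_index: int   # which area of interest this copy is cropped to
    crop_box: Rect


def plan_crops(num_pages: int, areas: List[Rect]) -> List[CropJob]:
    """Plan one cropped copy per (page, area) pair, keeping copies of a page adjacent."""
    jobs = []
    for page in range(num_pages):
        for k, box in enumerate(areas):
            jobs.append(CropJob(page, page * len(areas) + k, k, box))
    return jobs


# Placeholder rectangles for the six areas of interest (hypothetical values).
AREAS: List[Rect] = [
    (50.0, 700.0, 300.0, 730.0),
    (320.0, 700.0, 560.0, 730.0),
    (50.0, 600.0, 300.0, 630.0),
    (320.0, 600.0, 560.0, 630.0),
    (50.0, 500.0, 300.0, 530.0),
    (320.0, 500.0, 560.0, 530.0),
]

if __name__ == "__main__":
    for job in plan_crops(2, AREAS):
        print(job)
```

Because the copies of each page stay next to each other, the text from output page `n` belongs to source page `n // 6` and area `n % 6`, which should make it straightforward to sort the OCR output into Excel columns afterwards.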
