I am creating an application that will give the user a small, specific set of information from a large PDF file (300 to 400 pages). OCRing the whole file is too time-consuming and returns too much irrelevant information. I can zero in on the data I need based on two things: the name of the bookmark and the general layout of the pages in that bookmarked exhibit. The pages are not PDF forms; they are boilerplate forms that present data in a uniform way. I would like to write code that harvests that text data. Essentially, I want to use the position of the data on the form to tell me what the data is. For example, I know that "01/01/1998" is a birthdate based on where it sits on the page. Is this possible?
Sorry, I marked this as answered by mistake. It is not answered.
You would still need OCR. What you describe is referred to as zoned OCR (also called zonal OCR). Acrobat doesn’t do that.
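To illustrate the zoning idea itself, here is a minimal Python sketch. It assumes you already have each recognized word together with its page coordinates from whatever OCR tool you end up using; it does not perform OCR or read a PDF. The `Word` and `Zone` types, the `extract_fields` helper, the field names, and all coordinates are hypothetical and would have to be measured from your own form layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    """One recognized word and its top-left position on the page (in points)."""
    text: str
    x: float
    y: float


@dataclass(frozen=True)
class Zone:
    """A named rectangle on the form; anything inside it is treated as that field."""
    name: str
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, word: Word) -> bool:
        return self.x0 <= word.x <= self.x1 and self.y0 <= word.y <= self.y1


def extract_fields(words, zones):
    """Map each zone name to the text of the words that fall inside it."""
    fields = {}
    for zone in zones:
        hits = [w for w in words if zone.contains(w)]
        # Restore reading order: top to bottom, then left to right.
        hits.sort(key=lambda w: (w.y, w.x))
        fields[zone.name] = " ".join(w.text for w in hits)
    return fields


# Hypothetical zones for one exhibit layout (made-up coordinates).
ZONES = [
    Zone("name", 100, 100, 300, 120),
    Zone("birthdate", 400, 100, 520, 120),
]

# Hypothetical word boxes, standing in for the output of an OCR step.
words = [
    Word("Exhibit", 50, 40),
    Word("Doe", 150, 105),
    Word("Jane", 110, 105),
    Word("01/01/1998", 410, 105),
]

fields = extract_fields(words, ZONES)
print(fields)
```

Because the bookmark name tells you which exhibit layout you are on, you could keep one zone list per layout and run the OCR step only on the bookmarked pages, which keeps both the processing time and the irrelevant output down.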