My organization uses third-party software (OpenText - Exstream) to create PDF files. A single PDF file contains multiple, multi-page documents for different recipients. The first page of each document contains a white line of data (~30 characters) in the exact same location across all documents. I need to be able to extract just this data from the first page of each document and output it to an external file (preferably a text file). Any thoughts? Thanks in advance.
Were you planning on doing this extraction with Acrobat, or do you have some other PDF tool?
That is what I'm trying to determine. What are my options? Does Acrobat do what I'm describing? My research suggests the Redact functionality might be relevant. However, I don't know about the extracting to an external file requirement using Redact.
In Acrobat, a script or plug-in can scan page content. So yes, Acrobat can do this.
I've written scripts for doing exactly this type of thing many times.
In JavaScript the relevant functions are "doc.getPageNthWord()" and "doc.getPageNthWordQuads()".
Here's the reference entry:
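As a rough sketch of how those two functions could fit together (not the exact script described above): walk the words on a page, use the quads to test whether each word sits inside a fixed rectangle, and keep the ones that do. The function name `textInRect` and the `[left, bottom, right, top]` rectangle format are assumptions made for illustration; `getPageNumWords()` is the Doc method that reports how many words a page has.

```javascript
// Sketch for the Acrobat JavaScript console or an Action; `doc` is a Doc
// object (for example `this`). `rect` is a hypothetical
// [left, bottom, right, top] box, in the same coordinates that
// getPageNthWordQuads() reports.
function textInRect(doc, pageNum, rect) {
    var pieces = [];
    var wordCount = doc.getPageNumWords(pageNum);
    for (var i = 0; i < wordCount; i++) {
        var quads = doc.getPageNthWordQuads(pageNum, i);
        if (!quads || quads.length === 0) {
            continue;
        }
        // Each quad holds four corner points as eight numbers: x1, y1, ..., x4, y4.
        var q = quads[0];
        var left = Math.min(q[0], q[2], q[4], q[6]);
        var right = Math.max(q[0], q[2], q[4], q[6]);
        var lower = Math.min(q[1], q[3], q[5], q[7]);
        var upper = Math.max(q[1], q[3], q[5], q[7]);
        var inside = left >= rect[0] && right <= rect[2] &&
                     lower >= rect[1] && upper <= rect[3];
        if (inside) {
            // bStrip = false keeps punctuation and trailing white space,
            // so the pieces can be joined back together as they appear.
            pieces.push(doc.getPageNthWord(pageNum, i, false));
        }
    }
    // Trim with a regex so the sketch does not rely on String.prototype.trim.
    return pieces.join("").replace(/^\s+|\s+$/g, "");
}
```

The rectangle would have to be measured once from a sample page, for instance by printing the quads of a known word to the console.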
How many pages can your files have?
Any one file can have hundreds of thousands of pages. But the data only exists on the first page of each document.
And how would the tool know where each "document" starts within the file?
I don't know, as it depends on the robustness of the tool. I can tell you the data consistently resides in the same location on the page, and if it doesn't exist on a page, then nothing will be in that location.
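The rule described here (a non-empty region marks the first page of a document, an empty one means a continuation page) could be sketched as a simple loop. This is only an illustration under assumptions: `collectFirstPageData` and the `extractFn` callback are hypothetical names, and the callback stands in for whatever routine reads the fixed location on a page.

```javascript
// Sketch only. `extractFn(pageNum)` is a hypothetical callback that returns
// the text found in the fixed region of that page (0-based page number),
// or "" when nothing is there.
function collectFirstPageData(pageCount, extractFn) {
    var lines = [];
    for (var p = 0; p < pageCount; p++) {
        var text = extractFn(p);
        if (text !== "") {
            // A non-empty region is treated as the first page of a new document.
            // Store the 1-based page number and the data, tab separated.
            lines.push((p + 1) + "\t" + text);
        }
    }
    return lines.join("\r\n");
}
```

In Acrobat the page count would come from `doc.numPages`. One possible way to get the resulting string into a text file is to attach it with `doc.createDataObject()` and save it with `doc.exportDataObject()`, though that route is an assumption here and should be checked against the API reference and the security settings in use.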
In that case I would not recommend doing it in Acrobat. It's just too much for a script to be able to handle.
I would do it using a stand-alone tool, which is much more robust and can process much larger files, much faster. If you're interested in hiring someone to develop such a tool for you, feel free to contact me privately via [try6767 at gmail.com] to discuss it further.
Thank you. I will add it to my list of possibilities which, as of the moment, is a list of one. LOL