Extract PDF Page by Form Field Name and Value

Question

This project started with JavaScript for me, but I've since explored other languages and am open to any viable solution.

Disclaimer: 7 months ago I had 0 programming knowledge, but since then have made great strides and learned a lot. Once I complete this project I intend to go back and start from scratch and learn the foundation. I'm also working with a Windows computer.

What I'm looking to do is extract pages from a master PDF file. The pages I want to extract each have 5 questions with "Yes" or "No" checkboxes. I want to extract only those pages that have "Yes" checked.

I've pored over FDF, XML and CSV versions of the master PDF to find trends and view its internal structure. I have the form field names of all the actual checkboxes (both the "Yes" and "No" fields) on the pages I need. What I think needs to be done is:

Parse PDF

Read Field Names

If a "Yes" field's value is "Yes" (the value is "Off" when "No" is checked instead)

Then Extract the whole page
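
As a rough illustration only, those four steps could be sketched in Python with the third-party pypdf library. Everything named here is an assumption: the field names in YES_FIELDS are placeholders for the real checkbox names, a checked box is assumed to report the state "/Yes" (pypdf shows PDF names with a leading slash), a page is kept if any one of its "Yes" boxes is checked, and field names are assumed to be flat rather than nested "parent.child" names.

```python
from typing import Iterable, Iterator, List, Set, Tuple

# Hypothetical field names; replace with the real "Yes" checkbox names
# taken from the FDF/XML export of the master PDF.
YES_FIELDS = {"Q1_Yes", "Q2_Yes", "Q3_Yes", "Q4_Yes", "Q5_Yes"}


def pages_with_yes(widgets: Iterable[Tuple[int, str, str]],
                   yes_fields: Set[str],
                   checked_state: str = "/Yes") -> List[int]:
    """Return sorted page indexes holding at least one checked "Yes" box.

    Each widget is a (page_index, field_name, state) tuple.
    """
    hits = {page for page, name, state in widgets
            if name in yes_fields and state == checked_state}
    return sorted(hits)


def _lookup(obj, key):
    """Read key from a widget, falling back to its parent field(s)."""
    if key in obj:
        return obj[key]
    if "/Parent" in obj:
        return _lookup(obj["/Parent"], key)
    return None


def read_widgets(reader) -> Iterator[Tuple[int, str, str]]:
    """Yield (page_index, field_name, state) for each form widget."""
    for index, page in enumerate(reader.pages):
        if "/Annots" not in page:
            continue
        for ref in page["/Annots"]:
            annot = ref.get_object()
            if annot.get("/Subtype") != "/Widget":
                continue
            name = _lookup(annot, "/T")
            # /AS is the widget's appearance state; /V is the field value.
            state = annot["/AS"] if "/AS" in annot else _lookup(annot, "/V")
            yield index, str(name), str(state)


def extract_yes_pages(src: str, dst: str,
                      yes_fields: Set[str] = YES_FIELDS) -> List[int]:
    """Copy every page with a checked "Yes" box from src into dst."""
    from pypdf import PdfReader, PdfWriter  # third-party: pip install pypdf

    reader = PdfReader(src)
    keep = pages_with_yes(read_widgets(reader), yes_fields)
    writer = PdfWriter()
    for index in keep:
        writer.add_page(reader.pages[index])
    with open(dst, "wb") as out:
        writer.write(out)
    return keep
```

Calling extract_yes_pages("master.pdf", "yes_pages.pdf") (both file names are placeholders) would write the matching pages to a new file and return their zero-based page indexes. The copied pages may not keep their interactive form behavior, so this is only a starting point.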

I have way more detail and specifics that I can get into if necessary. But if anyone can help show me how to get the whole page based on Field Names with Values of "Yes" it would be much appreciated!

P.S. I'm not married to any specific programming language. JavaScript seemed like a good starting point for me when I began this project due to its compatibility with Acrobat. I'm open to exploring other languages if necessary.

Thanks again in advance!

sinious · Answer

The first thing I'd want to know is what you intend to do with the results, because whichever language you pick has to be able to do that too. JavaScript has no filesystem access without a Node-style runtime, so to give you the best advice we really need to know whether you're writing the results somewhere and in what format, or whether you haven't figured that out yet. So what are you doing with the results of your parse, and do those results need to be publicly visible or is this just information for you privately?
