I have Acrobat Pro DC
My current organisation uses a very old-fashioned HR system for recruitment. The system compiles one massive report of all the job applications for a recent post: the PDF is 1700+ pages long and contains distinct sections (of variable length) for over 200 applicants.
I want to split this into one PDF per applicant, with each file named after the applicant.
For each new application, a consistently formatted divider page exists as follows:
Applicant : Smith, John
Vacancy ID : 15535
The text 'Vacancy ID' only exists on these divider pages, so it can be used to identify where to split the document.
The applicant's name, which occurs on a previous line, starts at character 10 and is variable length. In fact it can be acquired with getPageNthWord(page,3) and getPageNthWord(page,4).
How easy would it be to create some JavaScript to run in an action which would do the following:
Can this be done, or has it been done already? Thanks
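For anyone reading along, the divider test described above (the words 'Vacancy' and 'ID' appearing together only on divider pages) could be sketched roughly as below. This is only a sketch: pageHasPhrase is a hypothetical helper name, not an Acrobat API, and sampleDoc is an invented stand-in so the logic can be tried outside Acrobat. It relies only on the Doc methods getPageNumWords and getPageNthWord, and it searches the whole page instead of assuming fixed word positions.

```javascript
// Sketch only: pageHasPhrase is a hypothetical helper, not part of Acrobat.
// doc is anything exposing getPageNumWords(nPage) and getPageNthWord(nPage, nWord);
// inside the Acrobat console you would pass `this`.
function pageHasPhrase(doc, page, phrase) {
    var total = doc.getPageNumWords(page);
    // Slide over the page's words and compare each window with the phrase
    for (var w = 0; w + phrase.length <= total; w++) {
        var matched = true;
        for (var k = 0; k < phrase.length; k++) {
            if (doc.getPageNthWord(page, w + k) !== phrase[k]) {
                matched = false;
                break;
            }
        }
        if (matched) {
            return true;
        }
    }
    return false;
}

// Invented stand-in document (assumption: real pages will contain other words too)
var sampleDoc = {
    pages: [
        ["Applicant", "Smith", "John", "Vacancy", "ID", "15535"],
        ["Cover", "letter", "text"]
    ],
    getPageNumWords: function (p) { return this.pages[p].length; },
    getPageNthWord: function (p, w) { return this.pages[p][w]; }
};
```

In Acrobat the call would look like pageHasPhrase(this, p, ["Vacancy", "ID"]) for each page p.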
If the pages are consistent and the text readable (ie, not part of a scanned image), then yes, it can most likely be done.
I've developed many similar tools for my clients in the past, so if you wish to send me some sample pages (to try6767 at gmail.com) I'll be happy to let you know if I think it's doable or not, and if so, for how much.
Thanks. Unfortunately I don't have a budget for this work, so I figured it out myself. Here is the solution in case anyone else needs to do something similar. Obviously you will need to tweak the code for your scenario. I ran this in the JavaScript debugger console using the instructions (e.g. select the code and press Ctrl+Enter) from this site: https://acrobatusers.com/tutorials/javascript_console
In short, this script does the following:
I'm sure there are lots of better ways of doing it, but this works for me, it took about an hour, and I didn't have to pay anyone (sorry try67). Also, someone else might be able to use this for free in future. Let me know if you have any problems and I'll try to help. I've never used JavaScript before, but it doesn't seem to be too hard. Debugging in Acrobat, however, is AWFUL! Good luck.
var firstName = "";
var surName = "";
var finalpage = 0;
var count = 0;
// For each page in the document, check whether words 8 and 9 are "Vacancy ID" (a divider page)
for (var p = 0; p < this.numPages; p++) {
    if (this.getPageNthWord(p, 8) == "Vacancy") {
        if (this.getPageNthWord(p, 9) == "ID") {
            count++;
            firstName = this.getPageNthWord(p, 3);
            surName = this.getPageNthWord(p, 2);
            finalpage = p;
            // Find the page position of the next break point
            for (var p2 = p + 1; p2 < this.numPages; p2++) {
                if (this.getPageNthWord(p2, 8) == "Vacancy") {
                    if (this.getPageNthWord(p2, 9) == "ID") {
                        // Extract from this divider up to the page before the next one
                        this.extractPages({
                            nStart: p,
                            nEnd: p2 - 1,
                            cPath: count + " " + firstName + " " + surName + ".pdf"
                        });
                        console.println("Extracted " + firstName + " " + surName + " pp " + p + " to " + (p2 - 1));
                        break;
                    }
                }
            }
        }
    }
}
// Save the final section, which has no following divider page
this.extractPages({
    nStart: finalpage,
    nEnd: this.numPages - 1,
    cPath: count + " " + firstName + " " + surName + ".pdf"
});
console.println("Extracted " + firstName + " " + surName + " pp " + finalpage + " to " + (this.numPages - 1));
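The script above rescans forward from every divider page to find the next one. A possible alternative, offered only as a sketch, is to collect the divider page indexes in one pass and then work out the page ranges from that list. sectionRanges below is a hypothetical helper name, not an Acrobat API, and the page numbers in the example are made up.

```javascript
// Sketch only: sectionRanges is a hypothetical helper, not part of Acrobat.
// dividerPages: ascending 0-based page indexes of the divider pages.
// Returns one { start, end } pair (both inclusive) per applicant section.
function sectionRanges(dividerPages, numPages) {
    var ranges = [];
    for (var i = 0; i < dividerPages.length; i++) {
        var isLast = (i === dividerPages.length - 1);
        ranges.push({
            start: dividerPages[i],
            // A section ends on the page before the next divider,
            // or on the last page of the document for the final section
            end: isLast ? numPages - 1 : dividerPages[i + 1] - 1
        });
    }
    return ranges;
}

// Made-up example: dividers on pages 0, 4 and 9 of a 12-page report
var ranges = sectionRanges([0, 4, 9], 12);
// In Acrobat, each range r could then be handed to
// this.extractPages({ nStart: r.start, nEnd: r.end, cPath: ... })
```

With this approach the last section needs no special handling after the loop. Any pages before the first divider are ignored, which matches the original script.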
Perhaps you can help me. I have a problem similar to the one you had.
I have a file that has several pages. At random intervals there are pages that contain the word "PageBreak".
I'm looking for a script that will:
1. For each page in document, look for the word "PageBreak"
2. Extract all pages before and including the page with the first instance of "PageBreak" into a new document.
3. Continue through the document until we find the next instance of "PageBreak" and repeat step 2
I have attached a file. In that file, pages 1 and 2 would be extracted to a new file, pages 3-5 would be extracted to another new file, pages 6-7 would be a third new file, and page 8 would be left over and should go into a new file by itself.
I understand you are not an expert, and neither am I, so I could really use some guidance on getting this accomplished.
Any help would be greatly appreciated.
Thank you in advance.
This is possible, but not a simple project if you don't have any experience at all in writing Acrobat JavaScript code. The offer I made above still stands... I'm happy to develop this tool for you, for a small fee. My contact details are the same.
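For reference, the page arithmetic behind the "PageBreak" request above (each marker page closes a section, and any leftover pages form a final section) might look roughly like this. It is a sketch, not a finished tool: rangesEndingAtBreaks is a hypothetical helper name, not an Acrobat API, and it assumes the list of marker pages has already been collected, for example by checking every word on each page with getPageNumWords and getPageNthWord.

```javascript
// Sketch only: rangesEndingAtBreaks is a hypothetical helper, not part of Acrobat.
// breakPages: ascending 0-based indexes of the pages containing "PageBreak".
// Returns { start, end } pairs (both inclusive); each marker page ends a section.
function rangesEndingAtBreaks(breakPages, numPages) {
    var ranges = [];
    var start = 0;
    for (var i = 0; i < breakPages.length; i++) {
        ranges.push({ start: start, end: breakPages[i] });
        start = breakPages[i] + 1;
    }
    // Pages after the last marker become one final section
    if (start < numPages) {
        ranges.push({ start: start, end: numPages - 1 });
    }
    return ranges;
}

// The 8-page example from the request: markers on pages 2, 5 and 7
// (1-based), which are indexes 1, 4 and 6 in Acrobat's 0-based numbering
var parts = rangesEndingAtBreaks([1, 4, 6], 8);
// In Acrobat, each part could then be passed to
// this.extractPages({ nStart: part.start, nEnd: part.end, cPath: ... })
```

For the example file this gives four sections covering pages 1-2, 3-5, 6-7 and 8 in 1-based terms.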