Hello. I'm a beginner in JavaScript, and I have Adobe Acrobat X Pro.
I want to search for a specific string within a PDF and then save the sequence of numbers that comes after that string to use as a file name.
For example, the text would look like "EA 224400". I would search for the string "EA" and then save 224400 in a separate variable to use later as the file name. This kind of text appears multiple times in my PDF, on different pages. So I want to search for "EA", save the number sequence after it, split the document at that point (from the current page up to the page before the next instance of "EA"; typically 5 pages, though not always), and then save an individual PDF for each number sequence, using the number sequence as the file name.
I found a different discussion with almost exactly the concept I want (Split large pdf on repeated text pattern, and save new pdf with custom filename). The only difference is that I want to search for "EA" in the document, rather than the user already knowing the location of "EA".
This is the code I tried to use, based on that other discussion, but it has a syntax error:
var numSeq="";
var finalpage=0;
for (var p = 0; p < this.numPages; p++) {
for(var n = 0; n<this.getPageNumWords(p); n++{
if(this.getPageNthWord(p,n)=="EA"){
numSeq=this.getPageNthWord(p,n+1)
finalpage=p;
break;
}
}
for(var p2=p+1; p2<this.numPages; p2++){
for(var n2=0; n2<this.getPageNumWords(p2); n2++){
if(this.getPageNthWord(p2,n2)=="EA"){
this.extractPages({
nStart: p,
nEnd: p2-1,
cPath: numSeq+".pdf"});
break;
}
}
console.println("Extracted " + numSeq + " pp " + p + " to " + p2)
break;
}
}
this.extractPages({
nStart: finalpage,
nEnd: this.numPages - 1,
cPath: numSeq + ".pdf"
});
console.println("Extracted" + numSeq + " pp " + finalpage + " to " + (this.numPages - 1))
var numSeq="";
var finalpage=0;
for (var p = 0; p < this.numPages; p++) {
for(var n = 0; n<this.getPageNumWords(p); n++){
if(this.getPageNthWord(p,n)=="EA"){
numSeq=this.getPageNthWord(p,n+1)
finalpage=p;
break;
}
}
for(var p2=p+1; p2<this.numPages; p2++){
for(var n2=0; n2<this.getPageNumWords(p2); n2++){
if(this.getPageNthWord(p2,n2)=="EA"){
this.extractPages({
nStart: p,
...
Have you checked what string "getPageNthWord" actually returns?
For your example, I would expect it to be "EA 224400", so you are not getting a value of "EA", and that means no match. You could check whether the first 2 characters returned are "EA" and then do your save. Or you could use the RegExp object to test whether the returned value has the format "EA" followed by a number.
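A minimal sketch of the RegExp check described above, assuming the string being tested looks like "EA 224400" (with or without the space). The helper name parseEaNumber is hypothetical and not part of the Acrobat API:

```javascript
// Hypothetical helper: returns the digits that follow "EA", or null when the
// text is not of the form "EA" + optional whitespace + digits.
function parseEaNumber(text) {
  var match = /^EA\s*(\d+)$/.exec(String(text).trim());
  return match ? match[1] : null;
}
```

In Acrobat, the argument would presumably be whatever getPageNthWord returns for the word being examined; printing that value with console.println first shows which form it actually takes.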
var numSeq="";
var finalpage=0;
for (var p = 0; p < this.numPages; p++) {
for(var n = 0; n<this.getPageNumWords(p); n++){
if(this.getPageNthWord(p,n)=="EA"){
numSeq=this.getPageNthWord(p,n+1)
finalpage=p;
break;
}
}
for(var p2=p+1; p2<this.numPages; p2++){
for(var n2=0; n2<this.getPageNumWords(p2); n2++){
if(this.getPageNthWord(p2,n2)=="EA"){
this.extractPages({
nStart: p,
nEnd: p2-1,
cPath: numSeq+".pdf"});
break;
}
}
console.println("Extracted " + numSeq + " pp " + p + " to " + p2)
break;
}
}
this.extractPages({
nStart: finalpage,
nEnd: this.numPages - 1,
cPath: numSeq + ".pdf"
});
console.println("Extracted" + numSeq + " pp " + finalpage + " to " + (this.numPages - 1))
This answer is right, except that for some reason I have to put p-1 for the first nStart.
Thanks, though.
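One way to avoid hand-tuned offsets such as p-1 is to record the page on which each "EA" appears first, and only then derive the page ranges. The sketch below is an illustration under assumptions, not the method from the answer above: findSections is a hypothetical helper, and pages is assumed to be an array in which pages[p] holds the words of page p (in Acrobat it could be filled using getPageNumWords and getPageNthWord).

```javascript
// pages: array where pages[p] is the list of words on page p (0-based).
// Returns one {name, start, end} entry per page on which the marker is found.
function findSections(pages, marker) {
  var sections = [];
  for (var p = 0; p < pages.length; p++) {
    var idx = pages[p].indexOf(marker);
    // Skip pages without the marker, or where nothing follows it.
    if (idx === -1 || idx + 1 >= pages[p].length) continue;
    // A new marker closes the previous section on the page before this one.
    if (sections.length > 0) sections[sections.length - 1].end = p - 1;
    // Until another marker is found, the section runs to the last page.
    sections.push({ name: pages[p][idx + 1], start: p, end: pages.length - 1 });
  }
  return sections;
}
```

Each entry s could then be handed to this.extractPages with nStart: s.start, nEnd: s.end and cPath: s.name + ".pdf", as in the code above.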
To extract the numbers, you could loop through all the fields of the doc, use the split() method of the String object to separate the words into an array, use the indexOf method to find the occurrence of "EA", and take the element at the next index.
Is there any chance you might encounter EA twice in the same field?
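The split() and indexOf idea can be sketched as follows, assuming the field value is available as a plain string. The helper name wordAfter is hypothetical:

```javascript
// Hypothetical helper: returns the word that follows `marker` in `text`,
// or null when the marker is absent or is the last word.
function wordAfter(text, marker) {
  var words = String(text).split(/\s+/);
  var i = words.indexOf(marker);
  return (i !== -1 && i + 1 < words.length) ? words[i + 1] : null;
}
```

Note that indexOf only reports the first occurrence, so a field containing "EA" twice would need a loop over the array instead.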
Is it possible to change the code a little so that it works for something like this:
EA: ID:
12456 889955
where it would detect "EA" and somehow read the number underneath it?
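If the words come out in reading order with punctuation stripped (for example "EA", "ID", "12456", "889955"), the value under a label sits a fixed number of words after it. The sketch below rests on that assumption about the extraction order, which should be checked against the actual PDF; valueUnderLabel is a hypothetical helper.

```javascript
// words: page words in reading order, punctuation already stripped.
// labels: the header labels in the order they appear, e.g. ["EA", "ID"].
// Returns the value under `wanted`, or null if the header row is not found.
function valueUnderLabel(words, labels, wanted) {
  var col = labels.indexOf(wanted);
  if (col === -1) return null;
  for (var i = 0; i + 2 * labels.length <= words.length; i++) {
    var isHeader = true;
    for (var j = 0; j < labels.length; j++) {
      if (words[i + j] !== labels[j]) { isHeader = false; break; }
    }
    // Assumption: the value row starts right after the header row.
    if (isHeader) return words[i + labels.length + col];
  }
  return null;
}
```

With the example above, asking for "EA" would give the first number of the second row and asking for "ID" the second one.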
If the values are in different fields, and you used a naming convention for those fields, it is possible.
Could you elaborate, please? Sorry, I'm new to JavaScript.