extracting pages (splitting file) based on partial word match

Jan 16, 2023

Hello,

I have no programming knowledge and I am going through this website to write a script. I have Acrobat DC.

I have a PDF file with over 400 pages. The first page of each case is a cover page, followed by that case's supporting documents. The number of pages varies from case to case, but the first page of a case is always the cover page, and the cover page is the only place that mentions the reference number. I would like to split the file just before each further occurrence of a reference number (a fixed combination of characters), that is, extract each case as a separate file. The reference numbers look like N30111XXXXX, where XXXXX is 5 digits that uniquely identify each case. I would like the program to ask the user for the first 6 alphanumeric characters, go through the file, extract all the cases, and save each one under its reference number, e.g. N3011103547. I tried to modify the code below, which I got from "Split PDF based on content, and save into different pdfs with custom file name".

The only issue is that it only matches whole words and doesn’t match partial words. So if the user inputs N30111, nothing is matched, but when the user inputs N3011103547, it matches only that one page.

Any help would be greatly appreciated.

var val = app.response("Enter a value");
var numSeq = "";
var finalpage = -1; // first page of the case currently being collected

for (var p = 0; p < this.numPages; p++) {
    for (var n = 0; n < this.getPageNumWords(p); n++) {
        // Whole-word comparison: this is why a partial entry such as
        // N30111 matches nothing.
        if (this.getPageNthWord(p, n) == val) {
            // A new match closes the previous case, if there is one.
            if (finalpage >= 0) {
                this.extractPages({
                    nStart: finalpage,
                    nEnd: p - 1,
                    cPath: val + (finalpage + 3000) + ".pdf"
                });
                console.println("Extracted " + numSeq + " pp " + finalpage + " to " + (p - 1));
            }
            numSeq = this.getPageNthWord(p, n + 1);
            finalpage = p;
            break;
        }
    }
}

// The last matched case runs to the end of the file.
if (finalpage >= 0) {
    this.extractPages({
        nStart: finalpage,
        nEnd: this.numPages - 1,
        cPath: val + (finalpage + 3000) + ".pdf"
    });
    console.println("Extracted " + numSeq + " pp " + finalpage + " to " + (this.numPages - 1));
}

Jan 17, 2023

I was able to find the answer for this one. Posting it here for anyone else looking:
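
What follows is not the poster's script. It is a minimal sketch of one way the partial match described in the question could work, assuming every page that contains a word starting with the entered prefix is a cover page. The helper names `findReference` and `planCases` are hypothetical; `app.response`, `getPageNumWords`, `getPageNthWord`, and `extractPages` are the same Acrobat calls used in the question's code. The matching is case-sensitive, and pages before the first cover page are skipped.

```javascript
// Hypothetical helper: first word on a page that begins with `prefix`, or null.
function findReference(words, prefix) {
    for (var i = 0; i < words.length; i++) {
        // indexOf(...) === 0 is a "starts with" test that also works in
        // older JavaScript engines.
        if (words[i].indexOf(prefix) === 0) {
            return words[i];
        }
    }
    return null;
}

// Hypothetical helper: turn per-page word lists into case ranges.
// A new case starts on every page that contains a matching word.
function planCases(pages, prefix) {
    var cases = [];
    for (var p = 0; p < pages.length; p++) {
        var ref = findReference(pages[p], prefix);
        if (ref !== null) {
            if (cases.length > 0) {
                cases[cases.length - 1].end = p - 1; // close the previous case
            }
            cases.push({ ref: ref, start: p, end: pages.length - 1 });
        }
    }
    return cases;
}

// Acrobat-only part: skipped when the `app` object is not available.
if (typeof app !== "undefined") {
    var prefix = app.response("Enter the first characters of the reference number");
    if (prefix) {
        var pages = [];
        for (var p = 0; p < this.numPages; p++) {
            var words = [];
            var count = this.getPageNumWords(p);
            for (var n = 0; n < count; n++) {
                words.push(this.getPageNthWord(p, n));
            }
            pages.push(words);
        }
        var cases = planCases(pages, prefix);
        for (var c = 0; c < cases.length; c++) {
            // Each file is named after the full reference found on its cover page.
            this.extractPages({
                nStart: cases[c].start,
                nEnd: cases[c].end,
                cPath: cases[c].ref + ".pdf"
            });
            console.println("Extracted " + cases[c].ref + " pp " + cases[c].start + " to " + cases[c].end);
        }
    }
}
```

Keeping the range planning separate from the Acrobat calls makes it possible to check the logic on plain arrays of words first. If two cover pages carried the same reference, the second file would presumably overwrite the first, so that case would need a different naming scheme.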

Feb 04, 2023

Hello,

I see you posted the correct script that you found, but could you please also share how to execute it?

I have a PDF document with 200-plus pages and need to save every page as a separate file, with each file named using text from the file.

Is this achievable with your code?

Thanks in advance.
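
On running a script: in Acrobat, scripts like these are typically run from the JavaScript console (Ctrl+J on Windows, Cmd+J on macOS) by pasting the code, selecting all of it, and pressing Ctrl+Enter. For the one-file-per-page request, here is a minimal sketch. It assumes the naming text is the first word on each page that begins with a prefix the user enters, which is a guess about what "a text in the file" means. The helper `pageFileName` and the fallback name `page_N.pdf` are hypothetical.

```javascript
// Hypothetical helper: choose a file name for one page.
// Uses the first word starting with `prefix`; falls back to the page number.
function pageFileName(words, prefix, pageIndex) {
    for (var i = 0; i < words.length; i++) {
        if (words[i].indexOf(prefix) === 0) {
            // The page number is appended so repeated words do not collide.
            return words[i] + "_p" + (pageIndex + 1) + ".pdf";
        }
    }
    return "page_" + (pageIndex + 1) + ".pdf";
}

// Acrobat-only part: skipped when the `app` object is not available.
if (typeof app !== "undefined") {
    var namePrefix = app.response("Enter the start of the text used to name each page");
    if (namePrefix) {
        for (var p = 0; p < this.numPages; p++) {
            var words = [];
            var count = this.getPageNumWords(p);
            for (var n = 0; n < count; n++) {
                words.push(this.getPageNthWord(p, n));
            }
            // nStart and nEnd are equal, so each output file holds one page.
            this.extractPages({
                nStart: p,
                nEnd: p,
                cPath: pageFileName(words, namePrefix, p)
            });
        }
    }
}
```

Note that `extractPages` with a `cPath` is restricted to console and batch (Action Wizard) contexts, so the script is meant to be run from one of those rather than from, say, a button on the page.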

Adobe Community
