
extracting pages (splitting file) based on partial word match

New Here, Jan 16, 2023


Hello,

I have no programming knowledge and have been going through this website to put together a script. I have Acrobat DC.

I have a PDF file with over 400 pages. The first page of each case is a cover page, followed by the supporting documents. The number of pages varies from case to case, but the first page of a case is always the cover page, and the cover page is the only place that mentions the reference number. I would like to split the file just before each subsequent occurrence of a reference number (a fixed combination of letters and digits), or in other words extract the cases as separate files. The reference numbers look like N30111XXXXX, where XXXXX is five digits that are unique and identify each case. I would like the script to ask the user for the first 6 alphanumeric characters, go through the file, extract all the cases, and save each one under its reference number, e.g. N3011103547. I tried to modify the code below, which I got from the thread "Split PDF based on content, and save into different pdfs with custom file name".

The only issue is that it matches whole words only and does not look for partial words. So if the user inputs N30111, nothing is matched, but when the user inputs N3011103547, it matches only that page.

Any help would be greatly appreciated.

var numSeq = "";
var finalpage = 0;
var val = app.response("Enter a value");
for (var p = 0; p < this.numPages; p++) {
    for (var n = 0; n < this.getPageNumWords(p); n++) {
        if (this.getPageNthWord(p, n) == val) {
            numSeq = this.getPageNthWord(p, n + 1);
            finalpage = p;
            break;
        }
    }

    for (var p2 = p + 1; p2 < this.numPages; p2++) {
        for (var n2 = 0; n2 < this.getPageNumWords(p2); n2++) {
            if (this.getPageNthWord(p2, n2) == val) {
                this.extractPages({
                    nStart: finalpage,
                    nEnd: p2 - 1,
                    cPath: val + (p + 3000) + ".pdf"
                });
                break;
            }
        }
        console.println("Extracted " + numSeq + " pp " + p + " to " + p2);
        break;
    }
}
this.extractPages({
    nStart: finalpage,
    nEnd: this.numPages - 1,
    cPath: val + (finalpage + 3000) + ".pdf"
});
console.println("Extracted" + numSeq + " pp " + finalpage + " to " + (this.numPages - 1));

TOPICS
Edit and convert PDFs, How to, JavaScript


1 Correct answer

New Here, Jan 17, 2023


I was able to find the answer to this one. Posting it here for anyone else looking:

 

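var numSeq = "";     // last five digits of the current case's reference number
var finalpage = 0;   // cover page of the current case
var val = app.response("Enter a value");   // first six characters, e.g. N30111
for (var p = 0; p < this.numPages; p++) {
    // Is page p a cover page? Compare only the first six characters of each word.
    for (var n = 0; n < this.getPageNumWords(p); n++) {
        if (this.getPageNthWord(p, n).substring(0, 6) == val) {
            numSeq = this.getPageNthWord(p, n).substring(6, 11);
            finalpage = p;
            break;
        }
    }

    // Check the following page: if it is a cover page, the current case ends on page p.
    for (var p2 = p + 1; p2 < this.numPages; p2++) {
        for (var n2 = 0; n2 < this.getPageNumWords(p2); n2++) {
            if (this.getPageNthWord(p2, n2).substring(0, 6) == val) {
                this.extractPages({
                    nStart: finalpage,
                    nEnd: p2 - 1,
                    cPath: val + numSeq + ".pdf"
                });
                console.println("Extracted " + numSeq + " pp " + finalpage + " to " + (p2 - 1));
                break;
            }
        }
        break;   // only the page directly after p needs checking
    }
}
// The last case runs to the end of the document.
this.extractPages({
    nStart: finalpage,
    nEnd: this.numPages - 1,
    cPath: val + numSeq + ".pdf"
});
console.println("Extracted " + numSeq + " pp " + finalpage + " to " + (this.numPages - 1));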
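
For anyone adapting the script above: it only works when the entered value is exactly six characters long, because of the hard-coded substring(0,6) and substring(6,11). Below is a minimal sketch of the same idea that accepts a prefix of any length. The helper names collectPageWords, findCases and extractCases are hypothetical (they are not part of the Acrobat API), and the sketch assumes, as in the original question, that a reference number appears only on cover pages.

```javascript
// Read every word of every page into a plain array of arrays.
// `doc` is the Acrobat Doc object (`this` in the JavaScript console).
function collectPageWords(doc) {
  var pages = [];
  for (var p = 0; p < doc.numPages; p++) {
    var words = [];
    var count = doc.getPageNumWords(p);
    for (var n = 0; n < count; n++) {
      words.push(doc.getPageNthWord(p, n));
    }
    pages.push(words);
  }
  return pages;
}

// Group pages into cases. A page is treated as a cover page when any of
// its words starts with `prefix`; a case runs until the page before the
// next cover page. Pages before the first cover page are ignored.
function findCases(pages, prefix) {
  var cases = [];
  for (var p = 0; p < pages.length; p++) {
    for (var n = 0; n < pages[p].length; n++) {
      var word = pages[p][n];
      if (word.indexOf(prefix) === 0) {
        if (cases.length > 0) {
          cases[cases.length - 1].end = p - 1; // close the previous case
        }
        cases.push({ ref: word, start: p, end: pages.length - 1 });
        break; // one reference number per cover page is enough
      }
    }
  }
  return cases;
}

// Save each case as its own file, named after the full reference number.
function extractCases(doc, cases) {
  for (var i = 0; i < cases.length; i++) {
    doc.extractPages({
      nStart: cases[i].start,
      nEnd: cases[i].end,
      cPath: cases[i].ref + ".pdf"
    });
  }
}

// Assumed usage in the Acrobat JavaScript console:
// var prefix = app.response("Enter the first characters of the reference number");
// if (prefix) extractCases(this, findCases(collectPageWords(this), prefix));
```

Keeping the page grouping in findCases, separate from the Acrobat calls, makes it possible to check the page ranges on a small sample before any files are written.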

New Here, Feb 04, 2023


Hello,

I see you posted the correct script that you found, but could you please also share how to execute it?

I have a PDF document with 200-plus pages and need to print every page as a separate file and save each one with a text from the file.

Is this achievable with your code?

Thanks in advance.
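
On running such a script: as far as I know, the usual route in Acrobat Pro is the JavaScript console (Ctrl+J on Windows, Cmd+J on macOS). Paste the code, select all of it, and press Ctrl+Enter. Note that extractPages with a cPath generally needs a privileged context such as the console or an Action (batch). For one file per page, here is a minimal sketch. The post does not say which text should name each file, so the naming rule used here (first word on the page plus the page number) is only a placeholder assumption, and splitEveryPage and firstWordAndPageNumber are hypothetical helper names.

```javascript
// Save every page of `doc` as its own PDF. `nameForPage(doc, p)` returns
// the file name (without extension) for the zero-based page index p.
function splitEveryPage(doc, nameForPage) {
  var saved = [];
  for (var p = 0; p < doc.numPages; p++) {
    var name = nameForPage(doc, p) + ".pdf";
    doc.extractPages({ nStart: p, nEnd: p, cPath: name });
    saved.push(name);
  }
  return saved;
}

// Placeholder naming rule (an assumption, replace with your own):
// first word on the page, or "page" if the page has no words, followed by
// the one-based page number so that names never collide.
function firstWordAndPageNumber(doc, p) {
  var word = doc.getPageNumWords(p) > 0 ? doc.getPageNthWord(p, 0) : "page";
  return word + "_" + (p + 1);
}

// Assumed usage in the Acrobat JavaScript console:
// splitEveryPage(this, firstWordAndPageNumber);
```

If the file name should come from a specific field on the page, only firstWordAndPageNumber needs to change, for example by searching the page's words for a known prefix as in the script earlier in this thread.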
