Extracting Pages Based on Matching Strings in Acrobat Pro DC Java Script
Copy link to clipboard
Copied
I need to extract pages from a PDF document with matching strings i.e. Acrobat create a new file of all pages where it finds strings that I have in CSV or xlsx file
I only need pages having the following two strings...
macros
salesperson
I found following code while googling around but it searches only one string and creates a new file of pages matching that string. While I need to search multiple strings and need only one file. Any ideas please...
// Iterates over all pages and find a given string and extracts all
// pages on which that string is found to a new file.
var pageArray = [];
var stringToSearchFor = "Test";
for (var p = 0; p < this.numPages; p++) {
// iterate over all words
for (var n = 0; n < this.getPageNumWords(p); n++) {
if (this.getPageNthWord(p, n) == stringToSearchFor) {
pageArray.push(p);
break;
}
}
}
if (pageArray.length > 0) {
// extract all pages that contain the string into a new document
var d = app.newDoc(); // this will add a blank page - we need to remove that once we are done
for (var n = 0; n < pageArray.length; n++) {
d.insertPages( {
nPage: d.numPages-1,
cPath: this.path,
nStart: pageArray[n],
nEnd: pageArray[n],
} );
}
// remove the first page
d.deletePages(0);
}
I assume that some code will be added to load CSV/XLSX file and a FOR/WHILE loop to search all strings in that PDF file and storing their page numbers and then creating a new file with all these page numbers.
Copy link to clipboard
Copied
Change this line
var stringToSearchFor = "Test";
To:
var stringsToSearchFor = ["macros", "salesperson"];
And this line:
if (this.getPageNthWord(p, n) == stringToSearchFor) {
To:
if (stringsToSearchFor.indexOf(this.getPageNthWord(p, n)!=-1) {

