Extracting Pages Based on Matching Strings in Acrobat Pro DC JavaScript
I need to extract pages from a PDF document based on matching strings, i.e. Acrobat should create a new file containing every page on which it finds any of the strings I have in a CSV or XLSX file.
This is a sample PDF file from which I only need the pages containing the following two strings:
- macros
- salesperson
I found the following code while googling around, but it searches for only one string and creates a new file from the pages matching that string. I need to search for multiple strings and end up with a single output file. Any ideas, please?
// Iterates over all pages, finds a given string, and extracts all
// pages on which that string is found to a new file.
var pageArray = [];
var stringToSearchFor = "Test";

for (var p = 0; p < this.numPages; p++) {
    // iterate over all words on the page
    for (var n = 0; n < this.getPageNumWords(p); n++) {
        if (this.getPageNthWord(p, n) == stringToSearchFor) {
            pageArray.push(p);
            break;
        }
    }
}

if (pageArray.length > 0) {
    // extract all pages that contain the string into a new document
    var d = app.newDoc(); // this will add a blank page - we need to remove that once we are done
    for (var n = 0; n < pageArray.length; n++) {
        d.insertPages({
            nPage: d.numPages - 1,
            cPath: this.path,
            nStart: pageArray[n],
            nEnd: pageArray[n]
        });
    }
    // remove the first (blank) page
    d.deletePages(0);
}
I assume some code needs to be added to load the CSV/XLSX file, plus a for/while loop that searches the PDF for every string and stores the matching page numbers, and then a new file is created from all of those page numbers.
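The approach described above could be sketched roughly as follows. This is only a minimal sketch under several assumptions, not a tested Acrobat solution: the helper names (parseSearchTerms, findMatchingPages, extractPages) and the path /C/temp/search-terms.csv are made up for illustration; the CSV is assumed to be a simple list of single words separated by commas or line breaks, with no quoted fields; and matching is assumed to be case-insensitive. As far as I know, Acrobat JavaScript cannot read an XLSX file directly, so the list would have to be saved as CSV first, and util.readFileIntoStream only works in a privileged context such as the JavaScript console or a batch action.

```javascript
// Turn CSV text into a list of lower-cased, de-duplicated search terms.
// Assumption: terms are separated by commas or line breaks, no quoted fields.
function parseSearchTerms(csvText) {
    var raw = csvText.split(/[\r\n,]+/);
    var terms = [];
    for (var i = 0; i < raw.length; i++) {
        var term = raw[i].replace(/^\s+|\s+$/g, "").toLowerCase();
        if (term !== "" && terms.indexOf(term) === -1) {
            terms.push(term);
        }
    }
    return terms;
}

// Return the zero-based numbers of all pages that contain at least one term.
// Each page is listed once, in document order.
function findMatchingPages(doc, terms) {
    var pages = [];
    for (var p = 0; p < doc.numPages; p++) {
        var numWords = doc.getPageNumWords(p);
        for (var n = 0; n < numWords; n++) {
            var word = String(doc.getPageNthWord(p, n)).toLowerCase();
            if (terms.indexOf(word) !== -1) {
                pages.push(p);
                break; // one hit is enough for this page
            }
        }
    }
    return pages;
}

// Copy the given pages into a single new document,
// using the same insertPages approach as the original script.
function extractPages(doc, pages) {
    var d = app.newDoc(); // starts with one blank page
    for (var i = 0; i < pages.length; i++) {
        d.insertPages({
            nPage: d.numPages - 1,
            cPath: doc.path,
            nStart: pages[i],
            nEnd: pages[i]
        });
    }
    d.deletePages(0); // remove the initial blank page
    return d;
}

// Runs only inside Acrobat, where app and util exist.
if (typeof app !== "undefined" && typeof util !== "undefined") {
    var csvPath = "/C/temp/search-terms.csv"; // hypothetical path, adjust as needed
    var csvText = util.stringFromStream(util.readFileIntoStream(csvPath), "utf-8");
    var matches = findMatchingPages(this, parseSearchTerms(csvText));
    if (matches.length > 0) {
        extractPages(this, matches);
    }
}
```

Because the comparison is done word by word, a search term containing a space (e.g. "sales person") would never match with this sketch; that case would need the words of each page joined into one string first.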
