Participating Frequently
November 13, 2017
Answered

Extract Pages from comma delimited list and

I am working with really large PDFs (large in number of pages). I know of two ways to extract non-sequential pages: right-click from either the page view or the Organize Pages view.

The problem is that I want to extract many different non-sequential pages (could be 200 pages out of 1100). I keep a list of all pages indexed in Excel (don't ask... that's just what we are doing right now). Anyway, as part of that process I use Excel to identify which pages will be extracted (and sent to third parties) from the larger *.pdf file. Because of that, I can easily use Excel VBA to create a comma-delimited list of the pages to extract from the *.pdf file.

My problem is that I am a complete novice at JavaScript. So right now, I have to open up Adobe and hold the Ctrl key while I scroll through and try to select dozens (if not hundreds) of non-sequential pages for extraction. One slip and I have to start over again.

I just want to be able to feed (could be copy-paste) a list of comma-delimited page numbers to Adobe and have it extract those pages to one *.pdf. My thought is that a JavaScript script in Adobe would have a variable assigned to the pages to be extracted, say N=1,2,4,54,23,45,198,543. Then, using Excel, I could just copy-paste my VBA output (the comma-delimited string of pages to extract) right in for "N" in the Adobe JavaScript and then run it.

Can anyone help me?

Correct answer Thom Parker

My Excel VBA outputs a comma-delimited list in one cell. To test the script, I just copy-pasted that list into the script between the quotation marks (var strPgs = "2,3,4,5,6,10"; // 1-based page numbers!)

So, my thought would be that an Adobe dialog box would pop up, and I would just copy-paste the comma-delimited list from Excel into the dialog box and press Enter. The list would be inserted between the quotation marks and the script would then execute.

Does that not work?


The simple solution is to use the "app.response()" function. It returns a string.

If you are using Acrobat DC, then use a command with this script.

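var strPgs = app.response("Enter list of page numbers"); // 1-based, comma-delimited
if (strPgs) { // app.response() returns null if the dialog is cancelled
    var aPages = strPgs.split(",");
    var oNewDoc = app.newDoc(); // the new document starts with one blank page
    var oDoc = this;
    // Append each page after the current last page; nPg-1 converts to a 0-based index
    aPages.forEach(function(nPg){oNewDoc.insertPages(oNewDoc.numPages-1,oDoc.path,nPg-1);});
    oNewDoc.deletePages(); // remove the blank first page
}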
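
A list pasted from Excel may carry stray spaces or a trailing comma, and the script above passes whatever it gets straight to insertPages. Here is a minimal sketch of a cleanup step, assuming the input is meant as 1-based page numbers. The helper name parsePageList and its numPages argument are hypothetical, not part of the Acrobat API; the function only uses plain JavaScript.

```javascript
// Hypothetical helper (not an Acrobat API): converts a 1-based,
// comma-delimited page list into an array of 0-based page indices.
function parsePageList(strPgs, numPages) {
    var result = [];
    if (strPgs === null || strPgs === undefined) {
        return result; // dialog cancelled or nothing entered
    }
    var parts = String(strPgs).split(",");
    for (var i = 0; i < parts.length; i++) {
        var token = parts[i].replace(/^\s+|\s+$/g, ""); // trim whitespace
        if (token === "") {
            continue; // skip empty entries such as a trailing comma
        }
        if (!/^\d+$/.test(token)) {
            throw new Error("Not a page number: " + token);
        }
        var n = parseInt(token, 10);
        if (n < 1 || n > numPages) {
            throw new Error("Page out of range: " + n);
        }
        result.push(n - 1); // 1-based page number -> 0-based index
    }
    return result;
}
```

In the script above, this could stand in for the split call, for example parsePageList(app.response("Enter list of page numbers"), this.numPages), with the "-1" then dropped from the insertPages call because the indices are already 0-based.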

2 replies

Thom Parker
Community Expert
November 13, 2017

This question comes up regularly. I've always thought it odd that Adobe didn't provide better page extraction options. Later tools, such as "redaction", take a page list/range as input, but they never updated the older tools that also need pages as input.

So the only option is to write an automation script to do it.

If you just go with the list (not page ranges), and assume the page numbers are 0-based, then this script will work:

var strPgs = "2,3,4,5,6,10"; // read as 0-based page indices
var aPages = strPgs.split(",");
var oNewDoc = app.newDoc(); // starts with one blank page, which this script leaves in place
var oDoc = this;
aPages.forEach(function(nPg){oNewDoc.insertPages(oNewDoc.numPages-1,oDoc.path,nPg);});

Run it from the Console Window.

Also a good idea for a tool at pdfscripting.com

Thom Parker - Software Developer at PDFScripting
Use the Acrobat JavaScript Reference early and often
Participating Frequently
November 14, 2017

That is what I'm looking for, Thom!  Only problem I had is that the pages extracted (when I run your code exactly as above to test) are shifted by one.

So instead of extracting the 6 pages consisting of pages 2,3,4,5,6 and 10 in your example above, what was actually extracted was 7 pages consisting of one blank page + pages 3,4,5,6,7 and 11.

Any thoughts?

Karl Heinz  Kremer
Community Expert
November 14, 2017

Acrobat's JavaScript starts counting pages at 0, so the first page in a document is page 0, the second page is page 1, and so on. This is standard behavior for anybody with a software engineering background, and takes some getting used to for somebody who is not a software engineer.
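
To make the off-by-one concrete, here is a small sketch in plain JavaScript using the list from this thread. The variable names are made up for illustration; nothing here calls Acrobat.

```javascript
// The list from the thread, as typed by the user (meant as 1-based page numbers).
var typed = [2, 3, 4, 5, 6, 10];

// Passed unchanged, each value is read as a 0-based index,
// so the page the user sees in the viewer is index + 1.
var seenWhenUnadjusted = typed.map(function (n) { return n + 1; });

// Subtracting 1 first yields the indices that match the typed page numbers.
var zeroBasedIndices = typed.map(function (n) { return n - 1; });
```

Here seenWhenUnadjusted holds 3,4,5,6,7,11, which matches the shifted pages reported above, and zeroBasedIndices holds 1,2,3,4,5,9, the values to pass to insertPages to get pages 2,3,4,5,6 and 10.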

try67
Community Expert
November 13, 2017

You can do it like this:

- Create a new file (app.newDoc)

- Import the pages from the old file into the new one (insertPages method of the Document object)

- Delete the first, empty page of the new file (deletePages method)

- Save the new file under a new name (saveAs method)
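
The four steps above can be sketched roughly as follows. This is only an illustration, not try67's tool: the function name extractPagesTo and the outPath argument are hypothetical, the page list is assumed to be 0-based already, and Acrobat's app object and the source document are passed in as parameters. Note that app.newDoc and saveAs are restricted to privileged contexts such as the console or a batch sequence.

```javascript
// Sketch of the four steps. `acroApp` is Acrobat's `app` object and
// `srcDoc` the open source document (`this` in the console).
// `extractPagesTo` and `outPath` are hypothetical names.
function extractPagesTo(acroApp, srcDoc, zeroBasedPages, outPath) {
    // 1. Create a new file; it starts with one blank page.
    var newDoc = acroApp.newDoc();

    // 2. Append each requested page after the current last page.
    for (var i = 0; i < zeroBasedPages.length; i++) {
        newDoc.insertPages({
            nPage: newDoc.numPages - 1,
            cPath: srcDoc.path,
            nStart: zeroBasedPages[i],
            nEnd: zeroBasedPages[i]
        });
    }

    // 3. Delete the first, empty page, but only if something was
    //    imported, since a document must keep at least one page.
    if (newDoc.numPages > 1) {
        newDoc.deletePages({ nStart: 0, nEnd: 0 });
    }

    // 4. Save the new file under a new name.
    newDoc.saveAs({ cPath: outPath });
    return newDoc;
}
```

From the console this might be called as extractPagesTo(app, this, [1, 2, 3, 4, 5, 9], "/c/temp/extract.pdf"), where the path is just a placeholder in Acrobat's device-independent path format.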

If you're interested, I've already developed a tool that allows you to do it easily by entering the page numbers into a dialog window.

You can even use ranges, so instead of writing "2,3,4,5,6,10", you can enter "2-6,10" and it will pick up the correct pages automatically.

Of course, the pages don't have to be in sequential order, so it can also be "10, 2-6", if you wish.

You can find it here: Custom-made Adobe Scripts: Acrobat -- Extract Non-Sequential Pages
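
For readers who want to try the range idea themselves, here is a minimal sketch of how a string like "2-6,10" could be expanded into individual page numbers. It is not the code of the tool linked above; the helper name expandPageRanges is hypothetical, and the result is 1-based, so each value still needs 1 subtracted before being passed to insertPages.

```javascript
// Hypothetical helper: expands a spec such as "10, 2-6" into a flat
// list of 1-based page numbers, keeping the order in which they were typed.
function expandPageRanges(spec) {
    var pages = [];
    var parts = String(spec).split(",");
    for (var i = 0; i < parts.length; i++) {
        var token = parts[i].replace(/\s+/g, ""); // drop all whitespace
        if (token === "") {
            continue; // skip empty entries such as a trailing comma
        }
        var m = /^(\d+)(?:-(\d+))?$/.exec(token);
        if (!m) {
            throw new Error("Cannot parse: " + parts[i]);
        }
        var first = parseInt(m[1], 10);
        var last = m[2] ? parseInt(m[2], 10) : first; // single page if no "-N"
        if (first < 1 || last < first) {
            throw new Error("Bad range: " + token);
        }
        for (var p = first; p <= last; p++) {
            pages.push(p);
        }
    }
    return pages;
}
```

With this sketch, expandPageRanges("2-6,10") gives 2,3,4,5,6,10 and expandPageRanges("10, 2-6") gives 10,2,3,4,5,6. Descending ranges such as "6-2" are rejected here, which is a design choice of the sketch rather than a statement about how the linked tool behaves.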