JS to Extract Multiple Pages and Name as Page Label

Report · Mar 10, 2016

Good morning! We have a group of employees who often take a PDF that contains multiple pages (in the range of 200-500 pages) and extract them into separate PDFs to load into a system. Often, the large PDFs they begin with hold page labels for each and every page, but when extracting, they lose these page labels and the individual files are named differently. What we're aiming for is a JavaScript that they can run to extract all pages out of a multi-page PDF and, in the process, name each individual PDF file after its page label.

I believe the two functions that we're dealing with are Doc.getPageLabel() and Doc.extractPages(), but we're unsure how to tie these into a JavaScript that will do what we need. Unfortunately, none of us has any experience with JS. I appreciate any and everyone's help! Thank you!

Report · Mar 10, 2016

From what context do you want to use this script? From a menu item? An Action? The JS Console?

Basically you have it right. The only thing missing is a loop that iterates over all the pages, extracting each one using its label.

The basic code would be something like this:

for (var p=0; p<this.numPages; p++) {
    this.extractPages(p, p, this.path.replace(this.documentFileName, this.getPageLabel(p) + ".pdf"));
}

Report · Mar 10, 2016

Whatever would best suit this objective. I was assuming that it would need to be an 'Add-On Tool', but if there is a better way of approaching it, we are definitely open to advice. Thank you!

Report · Mar 10, 2016

That's possible, but it would require a more complex script. If you have Acrobat Pro you can just create a new Action, put that code into it and then run it on your files directly. That's probably the easiest way of using it.

Report · Jul 18, 2018

This totally worked for me, exact same situation. Thanks a ton!

Report · Dec 14, 2023

How do you run it, and where do you run it?

Report · Dec 14, 2023

From the JS Console: Press Ctrl+J, paste the code into the console, select it with the mouse and press Ctrl+Enter to run it.

Report · Mar 15, 2024

what is wrong with me?

Report · Mar 15, 2024

You must select (with the mouse or keyboard) the full code before executing it.

Report · Mar 15, 2024

Thank you! It worked! Is it possible to make it without the number?

Report · Mar 15, 2024

This is a part of the page label, most likely. Try replacing this part of the code:

this.getPageLabel(p)

With:

this.getPageLabel(p).replace(/^\[\d+\]\s/, "")
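To see what that replacement does, here is a small sketch that applies the same regular expression to a few made-up labels. The helper name stripNumberPrefix and the sample labels are hypothetical; the sketch assumes the unwanted number appears as a bracketed prefix such as "[12] " at the very start of the label, which is what the expression above matches.

```javascript
// Hypothetical helper: strips a leading "[number] " prefix from a page label.
// Labels without such a prefix are returned unchanged.
function stripNumberPrefix(label) {
    return label.replace(/^\[\d+\]\s/, "");
}

// Made-up sample labels, for illustration only.
var samples = ["[1] Cover", "[12] A-101 Floor Plan", "Index"];
var cleaned = [];
for (var i = 0; i < samples.length; i++) {
    cleaned.push(stripNumberPrefix(samples[i]));
}
```

If the number in your labels is formatted differently (for example "12 - Cover"), the expression would need to be adjusted to match that format.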

Report · Mar 18, 2024

Thank you! It worked.

Report · May 20, 2024

Wow, that is what I was looking for.

Can you write it this way, please:
filename.pdf to become filename___pagelabel.pdf

Thank you so much.

Report · May 20, 2024

Sure. Use this line inside the loop, in place of the original extractPages line:

this.extractPages(p, p, this.path.replace(".pdf", "___" + this.getPageLabel(p) + ".pdf"));
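A quick sketch of how that replacement builds the new path, using a made-up path in place of this.path. The helper name labeledPath and the sample values are hypothetical. Note that replace() with a plain string changes only the first ".pdf" it finds, so this variant anchors the match to the end of the path; that is an assumption about what is wanted, not part of the answer above.

```javascript
// Hypothetical helper: turns ".../filename.pdf" into ".../filename___<label>.pdf".
// The regular expression matches ".pdf" only at the end of the path,
// so a folder name that happens to contain ".pdf" is left alone.
function labeledPath(path, label) {
    return path.replace(/\.pdf$/i, function () {
        return "___" + label + ".pdf";
    });
}

// Made-up values standing in for this.path and this.getPageLabel(p).
var example = labeledPath("/C/Docs/filename.pdf", "A-101");
```

Inside the loop, the call would then read this.extractPages(p, p, labeledPath(this.path, this.getPageLabel(p))).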

Report · May 20, 2024

Thank you so much!!

Report · Oct 02, 2024

I need some clarification on the steps to extract the pages and keep the page labels. Here is what I am doing. Is this correct?

1) Open the PDF

2) Select "Organize Pages"

3) Highlight the pages I want to extract

4) Use Ctrl+J to open up the JavaScript console

5) Input the script (see below)

for (var p=0; p<this.numPages; p++)
{
    this.extractPages(p, p, this.path.replace(this.documentFileName, this.getPageLabel(p) + ".pdf"));
}

6) Highlight the script and use Ctrl+Enter to run it.

I get the following error:

RaiseError: The file may be read-only, or another user may have it open. Please save the document with a different name or in a different folder.
Doc.extractPages:3:Console undefined:Exec
===> The file may be read-only, or another user may have it open. Please save the document with a different name or in a different folder.

undefined

Below is a screenshot of the whole page. I have tried on multiple PDFs and get the same issue. What am I doing wrong?

Report · Oct 02, 2024

Yes, that's it. It should work... Make sure the file is not located in a special folder, though, like the root drive folder (C:\), or something like C:\Windows, or a network folder. Also make sure you have full read/write permissions to that folder.

Report · Oct 02, 2024

If it still doesn't work, go to Menu - Preferences - Security (Enhanced) and make sure that everything there is disabled.

Report · Oct 02, 2024

The security was enabled. Disabling it worked. Thank you very much!!

Report · Mar 31, 2025

I save my files in a folder on the desktop; the path is C:\Users\*user*\Desktop\Plan Conversions. No matter what I do, I continue to get the error in the screenshot below. Strangely, though, some pages from the PDF will be extracted with their appropriate names, but not the ones I've chosen to extract; it seems to be random. It would make my work life a lot easier if I could get this to work correctly, so any insight would be helpful. Thank you for any additional information you can provide.

Report · Mar 31, 2025

You need to make sure that:

- The new file-name is valid, i.e. it doesn't contain any characters that can't be used in a file-name, such as:

/ \ * :

For some reason, a script also can't save a file-name with a comma in it.

You must remove all of these characters from the string before you use it as the new file-name.

- The new file-name is not the same as the one of the original file, if you're trying to save the extracted pages in the same folder.
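One way to apply the first check is to clean the label before building the file-name. This is a minimal sketch: the helper name safeFileName, the choice of "_" as the replacement character, and the fallback name "page" are all assumptions. The character set covers the ones listed above, the comma, and the remaining characters that Windows rejects in file names (? " < > |).

```javascript
// Hypothetical helper: makes a page label safe to use as a file name.
function safeFileName(label) {
    // Replace characters that are invalid in Windows file names, plus the comma.
    var cleaned = label.replace(/[\/\\*:?"<>|,]/g, "_");
    // Trim leading and trailing whitespace.
    cleaned = cleaned.replace(/^\s+|\s+$/g, "");
    // Fall back to a placeholder if nothing usable is left (assumption).
    return cleaned === "" ? "page" : cleaned;
}
```

In the loop, this.getPageLabel(p) would be wrapped as safeFileName(this.getPageLabel(p)). Two different labels can collapse to the same cleaned name, in which case a later page would try to reuse an earlier file's name, so it may be worth appending the page number as well.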

Report · Mar 31, 2025

I saved this particular file as henderson.pdf - no strange characters. The pages being extracted are architectural plans all saved in one file with the individual pages being named something like A-101 Floor Plan. Could it be the dash in the page name causing the problem somehow? I wouldn't think so because this script worked for me once in a similar scenario, but I have not been able to use it again and I have not made any changes to my work flow since.

Report · Mar 31, 2025

The issue is not with the original file name, but the one you're trying to use to save the pages, that is, the page label. If you could share the file I would be able to provide further help.

Report · Mar 31, 2025

Please see below, and thank you for your assistance.

https://acrobat.adobe.com/id/urn:aaid:sc:US:b65f8c60-b5cb-4b7b-ba11-f7d96f5362b9

Report · Mar 31, 2025

Pages 76-78 contain a forward-slash in the page name, which is causing it to fail. There might be other issues, but that's the first one I've encountered so I stopped there. I recommend you add this line before the extractPages command, so you could see where it got stuck in the loop:

console.println(p);

Remember the page numbers are 0-based, so if it shows "75" it means page 76...
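Combining that logging advice with the loop from earlier in the thread, a sketch might look like the following. The function name extractByLabel, the doc and log parameters, and the try/catch are assumptions added for illustration; only numPages, getPageLabel, extractPages, path and documentFileName come from the thread. It also assumes that a failed extractPages call raises an error that can be caught, as the RaiseError output earlier in the thread suggests.

```javascript
// Sketch: extract every page, log each index first, and collect failures
// instead of stopping at the first bad label.
// `doc` is expected to be an Acrobat Doc object; `log` is any function
// that accepts a string.
function extractByLabel(doc, log) {
    var failed = [];
    for (var p = 0; p < doc.numPages; p++) {
        var label = doc.getPageLabel(p);
        // Indices are 0-based, so also print the 1-based page number.
        log("index " + p + " (page " + (p + 1) + "): " + label);
        try {
            doc.extractPages(p, p, doc.path.replace(doc.documentFileName, label + ".pdf"));
        } catch (e) {
            failed.push(p);
            log("  failed: " + e);
        }
    }
    return failed; // 0-based indices of pages that could not be saved
}

// In the Acrobat JS Console, with the document open, the call would be:
// extractByLabel(this, function (msg) { console.println(msg); });
```

The returned list shows every page whose label needs attention in one run, rather than one page per attempt.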
