JS to Extract Multiple Pages and Name as Page Label

Report · Mar 10, 2016

Good Morning! We have a group of employees who often take a PDF that contains multiple pages (in the range of 200-500 pages) and extract them into separate PDFs to load into a system. Often, the large PDFs they begin with have page labels for each and every page, but when extracting, they lose these page labels and the individual files are named differently. What we're aiming for is a JavaScript that they can run to extract all pages out of a multi-page PDF and, in the process, name each individual PDF file after its page label.

I believe the two functions that we're dealing with are Doc.getPageLabel() and Doc.extractPages(), but we're unsure how to tie these into a JavaScript that will do what we need it to do. Unfortunately, none of us have any experience with JS. I appreciate any and everyone's help! Thank you!

Report · Apr 16, 2025

Different file, same problem. It doesn't even try to convert the pages I have selected; it just tries to convert the entire file and fails after a random number of pages have been converted. I open the PDF, go into the "Organize Pages" section, select the pages I want, press Ctrl+J, and enter the script:

for (var p=0; p<this.numPages; p++)
{
this.extractPages(p, p, this.path.replace(this.documentFileName, this.getPageLabel(p) + ".pdf"));
}

And it randomly spits out pages that I didn't select and errors out. I am on a local PC, with the file saved in a folder on my desktop. Security stuff is disabled in Adobe and I have full RW permission to the folder.

RaiseError: The file may be read-only, or another user may have it open. Please save the document with a different name or in a different folder.
Doc.extractPages:3:Console undefined:Exec
===> The file may be read-only, or another user may have it open. Please save the document with a different name or in a different folder.

undefined

Report · Apr 16, 2025

See my advice from earlier.

Report · Apr 16, 2025

I believe I have tried all of your suggestions. I'm coming to the realization that blueprints probably have something inherent in them that prevents this from working.

Report · Apr 16, 2025

Did you add the command to print out the page number, so you could see which one is causing the error?

Report · Apr 17, 2025

It gave me this:

TypeError: Invalid argument type.
Doc.getPageLabel:5:Console undefined:Exec
===> Parameter nPage.
undefined

If I understand that correctly page 6 of my file is causing the error? The name of that page is A101 - Overall Phasing Plan

No strange characters. In fact, I checked all of the names of the pages previously, no forward slashes, no odd characters. They're all in the format of "Plan Page Type - Name" aka "A101 - Overall Phasing Plan"

Report · Apr 17, 2025

Can you share the file?

Report · Apr 17, 2025

Yes, please find it here: https://acrobat.adobe.com/id/urn:aaid:sc:US:a300da13-6d14-4bd7-8976-aece2e3e50a9

Report · Apr 17, 2025

If you had done what I suggested before, the problem would have been apparent... Here's that code:

for (var p=0; p<this.numPages; p++) {
	console.println("Extracting page " + (p+1));
	var pageName = this.getPageLabel(p);
	console.println("Page name: " + pageName);
	this.extractPages(p, p, this.path.replace(this.documentFileName, pageName + ".pdf"));
}

Report · Apr 17, 2025

I'm sorry - I don't understand code or what any of it means. I'm just a sales guy trying to break up project plans I'm sent by my customers. I thought I entered it correctly and it gave me some result, but I don't know what I'm looking at.

Report · Apr 17, 2025

This allows you to track where the error took place via the output in the Console (which is where you're running the code from). Namely, here:

Extracting page 46
Page name: A311 - Door/Window Details

This page's label contains a slash, which can't be used as a part of a file's name.
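
One way to work around this, as a minimal sketch: replace the characters that are not valid in a file name before building the output path. The helper name sanitizeLabel, the function name extractByLabel, the underscore used as the replacement, and the exact character set (the characters Windows rejects in file names) are assumptions for illustration, not something taken from the posts above.

```javascript
// Hypothetical helper: replace characters that Windows does not allow
// in file names (\ / : * ? " < > |) with an underscore.
function sanitizeLabel(label) {
    return String(label).replace(/[\\\/:*?"<>|]/g, "_");
}

// Extract every page of `doc` into its own file, named after the
// sanitized page label and saved next to the original file.
// In the Acrobat JavaScript Console, call it as: extractByLabel(this);
function extractByLabel(doc) {
    for (var p = 0; p < doc.numPages; p++) {
        var pageName = sanitizeLabel(doc.getPageLabel(p));
        console.println("Extracting page " + (p + 1) + " as " + pageName + ".pdf");
        doc.extractPages(p, p, doc.path.replace(doc.documentFileName, pageName + ".pdf"));
    }
}
```

With this, the label "A311 - Door/Window Details" would be saved as "A311 - Door_Window Details.pdf". Note that two different labels could end up with the same name after the replacement, in which case the later file would try to overwrite the earlier one.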

Since you're not a programmer, you should heed the advice you're given here by those who are.

Alternatively, I'm happy to create a tool for you that will do all of this with a single click, including removing the characters that are not valid in a file name, for a small fee.

Report · Apr 17, 2025

It's tough to heed the advice when I'm not exactly sure what to even look for - but point taken. I give my customers the same talk! I would be interested in purchasing a tool that would do that for me, though. Are we allowed to discuss costs here, or does that need to be taken somewhere else?

Report · Apr 17, 2025

That's more of a private discussion. Send me a PM, please.

Report · Nov 08, 2024

Hi there! Sorry to bother you again, your code works perfectly. If possible, I have another request: would it be possible to keep the original page label within the files that have been split? For example, take a PDF file containing pages 1-10: once it's split, if I open the file for page 5, the page label is not 5 but 1. Is there a way to keep the original label, so in this case, the 5? Thank you so much.

Report · Nov 08, 2024

Not directly. You will need to open each file and re-apply the page label scheme to do it.
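
A rough sketch of what re-applying the labels could look like, assuming the original labels are plain decimal numbers (1, 2, 3, ...): extract each page without a path so that extractPages returns the new document, restart its numbering at the original value with setPageLabels, then save and close it. The function name splitKeepingNumbers is hypothetical, labels with prefixes or roman numerals would need a different style and prefix in the setPageLabels call, and this has to run from the Console or an Action because saveAs needs a privileged context.

```javascript
// Sketch: split `doc` into single-page files and restart each file's
// page numbering at the page's original number.
// ASSUMPTION: the original labels are plain decimal numbers ("1", "2", ...).
// In the Acrobat JavaScript Console, call it as: splitKeepingNumbers(this);
function splitKeepingNumbers(doc) {
    for (var p = 0; p < doc.numPages; p++) {
        var label = doc.getPageLabel(p);
        var start = parseInt(label, 10);
        if (!/^\d+$/.test(label) || start < 1) {
            // Not a plain positive number: leave this page alone
            console.println("Skipping page " + (p + 1) + ": label \"" + label + "\" is not numeric");
            continue;
        }
        // Without a cPath argument, extractPages returns the new document
        var newDoc = doc.extractPages(p, p);
        // "D" = decimal style, "" = no prefix, start = first number to use
        newDoc.setPageLabels(0, ["D", "", start]);
        newDoc.saveAs(doc.path.replace(doc.documentFileName, label + ".pdf"));
        newDoc.closeDoc(true); // already saved, so close without prompting
    }
}
```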

Report · Nov 08, 2024

Thanks!

Report · Apr 09, 2025

@try67 Please allow me to step in here and ask for assistance. It is not my thread, but my question is more or less about the same thing. I have a few hundred relatively small PDF files, each with some 20 named pages. I need to extract two pages, named Group1 and Group2.
After extracting, I would like these pages saved using the original file name plus the page label.

D:\SourceFiles
e.g. SourceFile123.pdf
SourceFileABC.pdf

D:\ExtractedPages
SourceFile123-group1.pdf
SourceFile123-group2.pdf
SourceFileABC-group1.pdf
SourceFileABC-group2.pdf

As these pages usually are at the 3rd and 4th page I first tried using

this.extractPages(1, 4, "/D/FirstPages/ from "+this.documentFileName+ this.getPageLabel);

(but nothing ends up in D:\FirstPages)

On some other site I found a script that would extract pages based on a specific word.
However, the pages are named with crypticnames.tmp.pdf (i.e. they are not saved using the source file name)

All in all, I can't get it to work.

Any suggestions?

Report · Apr 10, 2025

I don't quite follow...

- Do you want to save them in D:\ExtractedPages or D:\FirstPages? Either way, you have to make sure that folder exists before running your code. It won't be created by the script.

- Where is the Group1/Group2 info coming from?

- Are you running this code in an Action? From the Console? Something else?

- Your code contains multiple errors, and I don't understand which page label you're trying to use...

Report · Apr 10, 2025

Thanks for replying. The folder does not matter. When I started trying to get this done, I based my attempt on your reply of June 18, 2020 in
Solved: Re: batch processing acrobat extract first page wi... - Adobe Community - 11218438
(that is where the 'first pages' folder came from; the folder does exist on my drive).

Tried in Action Wizard

[Two screenshots attached: SnagIt-10042025 074709.png, SnagIt-10042025 101254.png]

Files to be saved using their source file name and page label.

Hope that's all clear.

The PDFs have about 20 pages each; two of those pages are Group1 and Group2 (usually pages 3 and 4 of the PDF),
and I want these 2 pages extracted.

Thanks.

Report · Apr 10, 2025

Page labels and bookmarks are not the same.

Report · Apr 10, 2025

Oops...! That I did not know to be honest.
Ahum.
It does show I am not an expert...
(was convinced they were the same)

Report · Apr 10, 2025

No, quite different things. But if the bookmarks always have the same name and point to the same pages, then why do you need to use them anyway? Or is that not the case? If not, this will require a more complex script, especially since they are not top-level bookmarks.

Report · Apr 10, 2025

Okay, thanks. Bad luck, so be it.
In that case I need to manually go thru close to 400 pdfs - select all pages (52) - deselect page 4 and 5 - delete - save
Thanks again.

Report · Apr 10, 2025

Why manually? This can totally be automated...

Report · Apr 10, 2025

I would not know how, sorry. Have given up searching.

In case doing it manually: open (a number of) file(s) - show thumbnails of the pages - ctrl-a

deselect page 4+5, delete, save, next file...

Report · Apr 10, 2025

You can extract the two pages with extractPages.
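
As a minimal sketch of that idea: since the Group1/Group2 names are bookmarks rather than page labels, the code below simply takes the pages by position and saves each one as "<source file name>-<suffix>.pdf". It assumes the two pages are always the 3rd and 4th pages, that the output folder D:\ExtractedPages already exists, and that the code runs from the Console or an Action. The function name extractFixedPages and its parameters are hypothetical.

```javascript
// Sketch: save selected pages of `doc` as "<source name>-<suffix>.pdf".
// ASSUMPTIONS: the wanted pages are always at the same positions, and
// the output folder already exists (the script will not create it).
function extractFixedPages(doc, pageIndexes, suffixes, outFolder) {
    // Strip the ".pdf" extension from the source file name
    var baseName = doc.documentFileName.replace(/\.pdf$/i, "");
    for (var i = 0; i < pageIndexes.length; i++) {
        var p = pageIndexes[i]; // zero-based page index
        if (p >= doc.numPages) {
            continue; // this file is too short to have that page
        }
        doc.extractPages(p, p, outFolder + baseName + "-" + suffixes[i] + ".pdf");
    }
}

// Pages 3 and 4 (zero-based 2 and 3) into D:\ExtractedPages:
// extractFixedPages(this, [2, 3], ["group1", "group2"], "/D/ExtractedPages/");
```

Used as the JavaScript step of an Action, the same call would run once per source file, so SourceFile123.pdf would produce SourceFile123-group1.pdf and SourceFile123-group2.pdf.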
