How can I get a total page count of a PDF before placing every page in the PDF?
I'd like to create a script that places a multi-page PDF in any number of different impositions (saddle stitch, 4-up signatures, 16-up signatures, etc.).
I've easily created a script that iterates through a PDF and places it into a new document as (4,1|2,3), (8,5|6,7), etc., which works for printing in duplex, folding each page in half, and gluing the resulting spines to make a simple thick book (for PDFs with more than, say, 64 pages).
However, the next step is to rewrite the script to create a saddle-stitch document: (16,1|2,15), (14,3|4,13) ... (10,7|8,9). For this I need to know how many pages there are in the PDF before I start placing the PDF pages, so that I can make the new document [int((PDFpages+3)/4)] pages long.
Is there a simple way to get the count of PDFpages without going through a loop and placing the next page from the PDF until the placed page number equals the first page number?
This way seems wasteful:
var totPDFPages = 1;
var doneCounting = false;
app.pdfPlacePreferences.pageNumber = totPDFPages;
var myPDFPage = sheetFront.place(File(srcPDF), [0,0])[0];
var pdfFirstPage = myPDFPage.pdfAttributes.pageNumber;
myPDFPage.remove(); // don't leave the first test placement behind
while (doneCounting == false) {
    totPDFPages += 1;
    app.pdfPlacePreferences.pageNumber = totPDFPages;
    myPDFPage = sheetFront.place(File(srcPDF), [0,0])[0];
    if (myPDFPage.pdfAttributes.pageNumber == pdfFirstPage) {
        // We asked for a page past the end and got the first page again
        totPDFPages -= 1;
        doneCounting = true;
    }
    myPDFPage.remove();
}
alert("PDF has " + totPDFPages + " pages!");
exit();
NB. Javascript above *hasn't* been run, but should look similar once debugged.
The only thing I've thought of to relieve the sheer duplication of placing the PDF twice (once for the count, and once for the imposition) is to create an array of impoPages[counter]=myPDFPage, and then shuffle the pages referenced by the array to the correct sheet and position.
It'd be much easier to be able to assign pageCount = File(srcPDF).pageCount !!!
Thanks for any help/tips or even a simple "What are you smoking, man!?"
Cheers,
Jezz
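Once the page count is known, the saddle-stitch order described in the question can be worked out without touching InDesign at all. The following is only a sketch: the function name `saddleStitchSheets` and the `front`/`back` layout are illustrative assumptions, and blank slots (when the page count is not a multiple of 4) are represented as `null`.

```javascript
// Hypothetical helper: returns one entry per sheet, each with a front and
// back pair, following the (16,1|2,15), (14,3|4,13) ... pattern above.
function saddleStitchSheets(pageCount) {
    // Same formula as in the question: int((PDFpages + 3) / 4)
    var sheetCount = Math.floor((pageCount + 3) / 4);
    var padded = sheetCount * 4; // page count rounded up to a multiple of 4

    // Pages beyond the real page count become blanks (null).
    function slot(n) {
        return n <= pageCount ? n : null;
    }

    var sheets = [];
    for (var i = 0; i < sheetCount; i++) {
        sheets.push({
            front: [slot(padded - 2 * i), slot(2 * i + 1)],
            back: [slot(2 * i + 2), slot(padded - 2 * i - 1)]
        });
    }
    return sheets;
}

// For a 16-page PDF the first sheet is front [16, 1], back [2, 15]
// and the last sheet is front [10, 7], back [8, 9].
var sheets = saddleStitchSheets(16);
```

Each number in the result can then be assigned to app.pdfPlacePreferences.pageNumber before placing, skipping the `null` slots.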
The easiest way: ask the user to type the number of pages in the PDF to impose 😉 I do this in my tool ;)
But if you really need to do this automagically, try placing pages in steps of, for example, 100:
place page 100; if it places (no error), try page 200, and so on.
If you get an error after trying page 300, try 250, then 275, 262, ...
Halve the range each time you get an error: 100 / 50 / 25 / 12+13 / 6 (6+7) / 3 (3+4) / 1 (2+1)
Or try opening the PDF in Acrobat.
robin
--
www.adobescripts.com
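The step-and-halve search robin describes could be sketched roughly as below. This is not robin's code: `countPagesBySearch` and the `pageExists` probe are hypothetical names, and the probe is assumed to return true when page n of the PDF can actually be placed (however that is detected in a given InDesign version).

```javascript
// Hypothetical sketch: find the last existing page with few probes.
// pageExists(n) is an assumed callback returning true if page n exists.
function countPagesBySearch(pageExists, step) {
    step = step || 100;
    if (!pageExists(1)) {
        return 0; // nothing could be placed at all
    }
    var lo = 1;    // highest page number known to exist
    var hi = step; // candidate page number to try next

    // Phase 1: jump ahead in fixed steps until a probe fails.
    while (pageExists(hi)) {
        lo = hi;
        hi += step;
    }

    // Phase 2: page lo exists and page hi does not, so halve the gap.
    while (hi - lo > 1) {
        var mid = Math.floor((lo + hi) / 2);
        if (pageExists(mid)) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    return lo;
}

// Stand-in probe that simulates a 262-page PDF.
var total = countPagesBySearch(function (n) { return n <= 262; }, 100);
```

Compared with placing every page in turn, the number of placements grows with the number of steps plus roughly log2(step), which is the point of the halving.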
pp = get_number_of_pages ( File('/d/romani/18-1/rs-18-1.pdf'))
function get_number_of_pages (f)
{
    if (f.exists)
    {
        f.open ('r');
        var next_line = f.readln ();
        // Stop at the first line containing '/N ', or at the end of the file
        while ( next_line.indexOf ('/N ') < 0 && !f.eof )
            next_line = f.readln ();
        f.close ()
        var m = next_line.match (/\/N (\d+)\/T/)
        // -1 means no '/N <count>/T' entry was found
        return m ? Number(m[1]) : -1
    }
    else
    {
        alert (f.name + ' does not exist.')
        exit()
    }
}
Peter
> Wicked idea, Peter. I like it :)
>
I second that. Very nice!
Harbs
--Get result from Spotlight
set myResult to do shell script "mdls -name kMDItemNumberOfPages '" & POSIX path of myFile & "'"
--Split result on space equals space and grab the second word,
-- which should be the page count
set oldDelims to AppleScript's text item delimiters
set AppleScript's text item delimiters to " = "
set myPagesCount to second text item of myResult
set AppleScript's text item delimiters to oldDelims
As near as I can tell, Spotlight is responsible for all that info in the "More Info" section of the Get Info panel in the Finder.
Thank you, Eric, for the solution in AS.
Although the post is already more than a year old, I just happened to be looking for this recently.
Tks!
I'm very comfortable with GREP in BBEdit, but I really need to expand that into JavaScript.
That snippet of javascript to count the number of pages in a PDF is the sort of lateral thinking I love!
Once again, Peter, you're a marvel.
--Jezz
I found that your method doesn't always work as it is...
The header from "InDesign CS2 Scripting Reference.pdf" looks like this:
%PDF-1.5
%
1 0 obj<</Contents 2 0 R/Type/Page/Parent 285639 0 R/Rotate 0/MediaBox[0.0 0.0 612.0 792.0]/CropBox[0.0 0.0 612.0 792.0]/BleedBox[0.0 0.0 612.0 792.0]/TrimBox[0.0 0.0 612.0 792.0]/ArtBox[0.0 0.0 612.0 792.0]/Resources<</Font<</C0_0 5899 0 R/C0_1 286131 0 R>>/ProcSet[/PDF/Text]/ExtGState<</GS0 286123 0 R>>>>/StructParents 4>>
endobj
2 0 obj<</Length 3252/Filter/FlateDecode>>stream
However, about three quarters through the PDF there appears a line:
285635 0 obj<</Count 1928/Kids[285636 0 R 285792 0 R 285948 0 R]/Type/Pages>>
It's not the only /Count object, but it *is* the only occurrence of a line containing "/Pages>>", so it's not all doom and gloom. :)
totPDFPages = getPDFPageCount(File.openDialog("Choose a PDF File"));
function getPDFPageCount(f) {
    f.open ('r');
    var gotCount = false;
    var next_line, p;
    while (! gotCount) {
        try {next_line = f.readln();}
        catch (myError) {
            alert("We've got an error '"+myError+"'\nAborting the script");
            exit();
        }
        if (next_line.indexOf ('/N ') > 0) { // We've got the easy sort of PDF
            p = next_line.match (/\/N (\d+)\/T/)[1];
            alert("Found a '/N' style PDF, with "+p+" pages");
            gotCount = true;
        }
        else if (next_line.indexOf ('/Pages>>') > 0 ) { // We probably had to read nearly to the end of the file for the match...
            p = next_line.match (/\/Count (\d+)\/K/)[1];
            alert("Found a '/Count ... /Pages>>' style PDF, with "+p+" pages");
            gotCount = true;
        }
    }
    f.close ();
    return Number(p);
}
--Jezz
My try{} catch{} doesn't abort the script if it doesn't find either of the matches, and appears to just keep on trying to read. (endless loop 😞 )
I need to fix for reading past the end of the file.
--Jezz
function getPDFPageCount(f) {
    f.open ('r');
    var gotCount = false;
    var next_line, p;
    while (! gotCount) {
        next_line = f.readln();
        if (next_line.indexOf ('/N ') > 0) { // We've got the easy sort of PDF
            p = next_line.match (/\/N (\d+)\/T/)[1];
            alert("Found a '/N' style PDF, with "+p+" pages");
            gotCount = true;
        }
        else if (next_line.indexOf ('/Pages>>') > 0 ) { // We probably had to read nearly to the end of the file for the match...
            p = next_line.match (/\/Count (\d+)\/K/)[1];
            alert("Found a '/Count ... /Pages>>' style PDF, with "+p+" pages");
            gotCount = true;
        }
        else if ( f.eof ) { // Checked last, so a match on the final line still counts
            alert("Aborting the script\nWe've got to the end of the file without finding a page count");
            f.close();
            exit();
        }
    }
    f.close ();
    return Number(p);
}
Thanks,
Peter
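For reference, the two patterns discussed so far can be combined in a way that is a little less dependent on exact spacing and line breaks. This is only a sketch under stated assumptions: `pageCountFromPdfText` is a hypothetical name, `text` is assumed to be the raw contents of the PDF already read into a string, and the function relies on two features of the PDF format: a linearized file carries its page count as /N in the /Linearized dictionary, and otherwise the root of the page tree is the /Type /Pages object with the largest /Count. PDFs that store these objects inside compressed object streams will not match at all, in which case the sketch returns -1.

```javascript
// Hypothetical sketch: extract a page count from raw PDF text.
function pageCountFromPdfText(text) {
    // Case 1: linearized PDF, e.g. <</Linearized 1/L 5000 ... /N 22/T 4800 ...>>
    var lin = /\/Linearized\b[^>]*?\/N\s+(\d+)/.exec(text);
    if (lin) {
        return Number(lin[1]);
    }

    // Case 2: look at every object that is a /Type /Pages node and keep
    // the largest /Count, which belongs to the root of the page tree.
    var best = -1;
    var chunks = text.split("endobj");
    for (var i = 0; i < chunks.length; i++) {
        // \b keeps "/Type/Page" (a single page) from matching "/Pages"
        if (!/\/Type\s*\/Pages\b/.test(chunks[i])) {
            continue;
        }
        var m = /\/Count\s+(\d+)/.exec(chunks[i]);
        if (m && Number(m[1]) > best) {
            best = Number(m[1]);
        }
    }
    return best; // -1 if nothing matched
}

// Example with a line like the one quoted from the Scripting Reference PDF:
var sample = "285635 0 obj<</Count 1928/Kids[285636 0 R 285792 0 R 285948 0 R]/Type/Pages>>\nendobj\n";
var samplePages = pageCountFromPdfText(sample);
```

Taking the largest /Count rather than the first one avoids picking up an intermediate node of the page tree, which was the risk noted above with multiple /Count entries.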
jxswm
///////////////////////////
function getPDFPageCount(f){
    if(f.alias){f = f.resolve();}
    if(f == null){return -1;}
    if(f.hidden){f.hidden = false;}
    if(f.readonly){f.readonly = false;}
    f = new File(f.fsName);
    f.encoding = "Binary";
    if(!f.open("r","TEXT","R*ch")){return -1;}
    f.seek(0, 0); var str = f.read(); f.close();
    if(!str){return -1;}
    //f = new File(Folder.temp+"/123.TXT");
    //writeFile(f, str.toSource()); f.execute();
    var ix, _ix, lim, ps;
    ix = str.indexOf("/N ");
    if(ix == -1){
        var src = str.toSource();
        _ix = src.indexOf("<< /Type /Pages /Kids [");
        if(_ix == -1){
            ps = src.match(/<<\/Count (\d+)\/Type\/Pages\/Kids\[/);
            if(ps == null){
                ps = src.match(/obj <<\\n\/Type \/Pages\\n\/Count (\d+)\\n\/Kids \[/);
                if(ps == null){
                    ps = src.match(/obj\\n<<\\n\/Type \/Pages\\n\/Kids \[.+\]\\n\/Count (\d+)\\n\//);
                    if(ps == null){return -1;}
                    lim = parseInt(ps[1]);
                    if(isNaN(lim)){return -1;}
                    return lim;
                }
                lim = parseInt(ps[1]);
                if(isNaN(lim)){return -1;}
                return lim;
            }
            lim = parseInt(ps[1]);
            if(isNaN(lim)){return -1;}
            return lim;
        }
        ix = src.indexOf("] /Count ", _ix);
        if(ix == -1){return -1;}
        _ix = src.indexOf(">>", ix);
        if(_ix == -1){return -1;}
        lim = parseInt(src.substring(ix+9, _ix));
        if(isNaN(lim)){return -1;}
        return lim;
    }
    _ix = str.indexOf("/T", ix);
    if(_ix == -1){
        ps = str.match(/<<\/Count (\d+)\/Type\/Pages\/Kids\[/);
        if(ps == null){return -1;}
        lim = parseInt(ps[1]);
        if(isNaN(lim)){return -1;}
        return lim;
    }
    lim = parseInt(str.substring(ix+3, _ix));
    if(isNaN(lim)){return -1;}
    return lim;
}
Awesome and fast. Thanks for share!!!
On Mac, the Spotlight method will not work if the files are on a remote server or drive.
See https://macscripter.net/viewtopic.php?id=32381 for some methods.
On Windows, if you have Acrobat installed (quite common nowadays with CC), you can use this:
// For Windows only
function pagesInPDF(path) {
    if ($.os[0] !== "W") {return 'Jerk';} // For Windows only!
    var vbs;
    vbs = [
        'Set doc = CreateObject("AcroExch.PDDoc")',
        'path = "' + new File(path).fsName + '"',
        'doc.open path',
        'returnValue = doc.GetNumPages',
        'doc.Close()'
    ].join('\n');
    // Thanks to Harbs for returnValue https://forums.adobe.com/message/8152525#8152525
    return app.doScript(vbs, ScriptLanguage.VISUAL_BASIC);
    // -1 means doc couldn't be opened
}
// Change to correct path
// pagesInPDF("C:\\Users\\Trevor\\Creative Cloud Files\\PDF\\dest.pdf")
Hi Trevor,
below is a solution that is based purely on InDesign, but that obviously can be improved a lot with a divide-and-conquer method for picking page numbers. An early version of the approach below can be found here:
Re: Number of pages in imported PDF file?
This time I improved the code a bit, using a temp document's placeGun to count PDF pages rather than a temp graphic frame.
The method exploits the fact that the value of pdfAttributes.pageNumber of a placed PDF will be set to 1 if you ask for a page number that does not exist, for example page 21 of a PDF with only 20 pages. InDesign will not throw an error on such an attempt, but will simply place page 1.
Here is the code. It works on a selected graphic frame that holds a placed PDF. You may test with a PDF file directly as well, since the function takes a PDF file as its argument. But be aware that there is no error handling for the given argument here.
/*
getNumberOfPagesInPDF_FUNCTION-v2.jsx
Script by Uwe Laubender
Using the placeGun to count the number of pages in a PDF.
This is a VERY SLOW version and can be improved very much with a "Divide & Conquer" method.
A first approach can be found in this thread:
2. Re: Number of pages in imported PDF file?
Uwe Laubender Oct 16, 2016 10:22 AM (in response to dev9togo)
https://forums.adobe.com/message/9070532#9070532
*/
// Use case with a selected frame where a PDF is placed into.
// Select the frame and run the script snippet.
var graphicFrame = app.selection[0];
var placedPDFGraphic = graphicFrame.graphics[0];
var pdfFile = File( placedPDFGraphic.itemLink.filePath );
var numberOfPagesOfPDF = getNumberOfPagesInPDF( pdfFile );
alert( "Number of Pages of Placed PDF: "+numberOfPagesOfPDF );
function getNumberOfPagesInPDF( pdfFile )
{
    var maxNumber = 1 ;
    var tempDoc = app.documents.add( false );
    app.pdfPlacePreferences.pageNumber = maxNumber + 1 ;
    tempDoc.placeGuns.abortPlaceGun();
    tempDoc.placeGuns.loadPlaceGun( pdfFile , false );
    var currentPageNumber = tempDoc.placeGuns.pageItems[0].graphics[0].getElements()[0].pdfAttributes.pageNumber ;
    tempDoc.placeGuns.abortPlaceGun();
    if( currentPageNumber == 1 )
    {
        tempDoc.close( SaveOptions.NO ) ;
        return maxNumber ;
    };
    while( currentPageNumber > 1 )
    {
        maxNumber++ ;
        app.pdfPlacePreferences.pageNumber = maxNumber;
        tempDoc.placeGuns.abortPlaceGun();
        tempDoc.placeGuns.loadPlaceGun( pdfFile , false );
        currentPageNumber = tempDoc.placeGuns.pageItems[-1].graphics[0].getElements()[0].pdfAttributes.pageNumber;
        if( currentPageNumber > maxNumber )
        { maxNumber = currentPageNumber };
        tempDoc.placeGuns.abortPlaceGun();
    };
    tempDoc.close( SaveOptions.NO );
    return maxNumber-1 ;
};
Regards,
Uwe
Hi Uwe,
I think it's really better to avoid InDesign's DOM stuff if possible (in some cases it's not).
I tested your method on a 22-page PDF and it took 10329 ms; my method took 145 ms. That's 70+ times quicker, even on a smallish document.
Even if you divide and conquer, it's still going to be very slow in comparison. For a 3-page doc it still took more than 2 seconds.
One can easily need to paste 200+ page PDFs, and that's going to take a pretty unbearable amount of time using the InDesign DOM. With the Acrobat method it should make hardly any difference to the timing.
Regards
Trevor
Hi Trevor,
of course you are right, if you assume that Acrobat Pro DC is available.
My thought on this is to provide this as a "last resort" if anything goes wrong with Acrobat or Acrobat is not installed.
The question is how the number of pages in a PDF could be calculated quickly when using InDesign Server.
Regards,
Uwe
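The "last resort" idea could be sketched as a simple chain that tries the fast methods first and only falls back to the slow one when they fail. This is an illustration only: `countWithFallbacks` is a hypothetical name, and each entry in `strategies` is assumed to be a function that takes the PDF and returns a page count, returns -1, or throws when its method is unavailable.

```javascript
// Hypothetical sketch: try each counting strategy in order and return
// the first plausible result (a positive whole number).
function countWithFallbacks(pdfFile, strategies) {
    for (var i = 0; i < strategies.length; i++) {
        try {
            var n = Number(strategies[i](pdfFile));
            if (n > 0 && Math.floor(n) === n) {
                return n;
            }
        } catch (e) {
            // This strategy is unavailable or failed; try the next one.
        }
    }
    return -1; // no strategy produced a usable count
}

// Stand-in strategies for demonstration (assumptions, not real APIs):
function viaAcrobat(f) { throw new Error("Acrobat not installed"); }
function viaFileScan(f) { return -1; }  // e.g. no readable page tree found
function viaPlaceGun(f) { return 22; }  // slow, but purely InDesign

var pages = countWithFallbacks("dest.pdf", [viaAcrobat, viaFileScan, viaPlaceGun]);
```

Ordering the list from fastest to slowest keeps the common case quick while still giving an answer when Acrobat is missing or the file scan finds nothing.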
If you can afford InDesign Server you can afford Acrobat. Or maybe the server can't do doScript. Do you know? I don't.
If that wasn't an option I would still prefer some file-reading method, a variation on one of the above.
But I get the point.
Hi,
If you are on the server then Acrobat is not licensed for server use.
Regards
Malcolm
Are you saying that it goes against the Acrobat EULA? The Server EULA or something else?
Hi,
Yes, it is against the Acrobat EULA.
Regards
Malcolm