Quick summary: I've got a script as part of a process involving a large Data Merge based document that creates hyperlinks using doc.hyperlinkURLDestinations.add(). The script simply turns text snippets into clickable hyperlinks.
The problem is, the final PDFs that emerge from this process are 300% the size of a PDF without the script being run, and are very slow to process. Most of this extra weight comes from a crazy 740,000% increase (!!!) in the amount of data used for "Structure info" (and that's after running the version with hyperlinks through the Acrobat PDF optimiser).
The amount of structure data in the PDFs with hyperlinks ends up being more than twice the entire file size of the PDFs without added hyperlinks. There's also a substantial increase in the size of the "Cross Reference Table".
This big increase in weight turns a simple, fast process that takes a few minutes into a torturously slow one that takes hours, creating bloated files at a painfully slow rate in a process that is prone to crashing and failing.
I'm hoping for another way to create hyperlinks that doesn't add all this weight.
--------------------------------
Detail: Here's my current process. It works fine except step 2 causes final PDF file size and the time required to run other processes to increase massively. I know there are different ways of implementing hyperlinks in a PDF - I'm hoping there's a different approach I can use at step 2 that avoids this massive bloat.
The goal is to create 600 two-page PDFs from one InDesign template using Data Merge, with each PDF having a filename that reflects its record and live hyperlinks that vary from record to record. Here's my current process (for benchmarking, the machine is a Mac Pro running Lion and CS6, with 6 GB of RAM, and all files are on a local HDD, with no data transfer over any network):
Each 2-page PDF has 14 or 15 hyperlinks. I don't understand how 14-15 hyperlinks can result in an extra 2 MB of data and something like 3500% more processing time to create each PDF.
Can anyone suggest any alterations to the script at step 2 that might avoid all these overheads? Here's the full script for convenience:
// Reset any previous GREP find/change settings
app.findGrepPreferences = app.changeGrepPreferences = null;
var doc = app.activeDocument;
// Match everything from http:// or https:// to the end of the paragraph
app.findGrepPreferences.findWhat = '(http://.*$|https://.*$)';
var objs = doc.findGrep();
for (var i = 0; i < objs.length; i++) {
    var currTarget = objs[i]; // the matched text range for this URL
    var lnkDest = doc.hyperlinkURLDestinations.add(currTarget.texts[0].contents);
    var lnkSrc = doc.hyperlinkTextSources.add(currTarget);
    var lnk = doc.hyperlinks.add(lnkSrc, lnkDest);
}
alert('Processed ' + objs.length + ' hyperlinks');
Edit - here's a side-by-side comparison of the Acrobat PDF optimiser 'Audit Space Usage' tool, showing where the hyperlinks add bulk. I've marked the two big increases in red...
...so there's a massive, insane increase in the size of the "Structure info", from a tiny 0.000238 MB to 1.76 MB - two and a half times the size of the entire original file, just on "Structure info"! That's a 740,000% increase...
There's also a big fat increase in the cross reference table, from 0.0046 MB to 0.427 MB - the cross reference table in the PDF with hyperlinks is more than half the size of the original file.
The only difference between the two PDFs is that one has 14 clickable hyperlinks attached to existing snippets of text (also, the 'with hyperlinks' PDF has been through aggressive PDF optimisation, hence its images are much smaller).
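For anyone wanting to check the percentages, here is a quick sketch of the arithmetic using the figures quoted from the Audit Space Usage comparison. The helper name percentIncrease is made up for illustration. It puts the "Structure info" growth at roughly 739,000% (in line with the 740,000% quoted) and the cross reference table growth at roughly 9,200%.

```javascript
// Percentage increase from `before` to `after` (both in the same unit, here MB).
function percentIncrease(before, after) {
    return (after - before) / before * 100;
}

// Figures read off the Audit Space Usage comparison quoted in the post.
var structureInfoGrowth = percentIncrease(0.000238, 1.76);
var crossRefGrowth = percentIncrease(0.0046, 0.427);

// console.log exists in browsers and Node; in ExtendScript use $.writeln instead.
if (typeof console !== 'undefined') {
    console.log('Structure info: +' + Math.round(structureInfoGrowth) + '%');
    console.log('Cross reference table: +' + Math.round(crossRefGrowth) + '%');
}
```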
What about working with smaller sections:
Create the 600 two-page InDesign documents with Data Merge (Data Merge is scriptable too), then add the hyperlinks to each document and export each one to PDF. All of that could be done in a single run of one script.
Hans-Gerd Claßen
Interesting idea, thanks. Can you give any example code or links on how to script the Data Merge tool? I've tried looking in the CS6 InDesign JavaScript scripting guide and there are no occurrences of "data merge" at all (nothing matching "data m*" even).
simple example:
#target indesign
// Ask for the output folder (assumes the dialog is not cancelled)
var destFolderPath = Folder.selectDialog('DestFolder').absoluteURI + '/';
var currDoc = app.activeDocument; // already prepared document (CSV connected) is open and frontmost
currDoc.dataMergeOptions.createNewDocument = true;
// recordRange is a string such as '1-600'; the part after the hyphen is the last record number
var maxRange = parseInt(currDoc.dataMergeProperties.dataMergePreferences.recordRange.split('-')[1], 10);
// one file for each record
for (var i = 0; i < maxRange; i++) {
    with (currDoc.dataMergeProperties.dataMergePreferences) {
        recordSelection = RecordSelection.ONE_RECORD;
        recordNumber = i + 1;
    }
    currDoc.dataMergeProperties.mergeRecords(); // the merged document becomes the active document
    app.activeDocument.save(File(destFolderPath + (i + 1) + '.indd'));
    app.activeDocument.close();
}
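Building on that loop, the hyperlink step and the PDF export could be folded into the same per-record pass instead of saving each .indd file. The following is only a rough sketch that has not been verified against CS6. The function names addUrlHyperlinks and mergeRecordToPdf are made up for illustration, the hyperlink logic is taken from the script in the original question, and it assumes the merged document becomes app.activeDocument, as the example above does.

```javascript
// Sketch only. Relies on InDesign's scripting globals
// (app, RecordSelection, ExportFormat, SaveOptions) being available at call time.

// Turn every URL found in `doc` into a live hyperlink; returns the number of links made.
function addUrlHyperlinks(doc) {
    app.findGrepPreferences = app.changeGrepPreferences = null;
    app.findGrepPreferences.findWhat = '(http://.*$|https://.*$)';
    var found = doc.findGrep();
    for (var i = 0; i < found.length; i++) {
        var dest = doc.hyperlinkURLDestinations.add(found[i].texts[0].contents);
        var src = doc.hyperlinkTextSources.add(found[i]);
        doc.hyperlinks.add(src, dest);
    }
    return found.length;
}

// Merge a single record from `sourceDoc`, hyperlink it, export it to `pdfFile`, then discard it.
function mergeRecordToPdf(sourceDoc, recordNumber, pdfFile) {
    var prefs = sourceDoc.dataMergeProperties.dataMergePreferences;
    prefs.recordSelection = RecordSelection.ONE_RECORD;
    prefs.recordNumber = recordNumber;
    sourceDoc.dataMergeProperties.mergeRecords();
    var merged = app.activeDocument; // assumption: the merged copy is now frontmost
    var linkCount = addUrlHyperlinks(merged);
    merged.exportFile(ExportFormat.PDF_TYPE, pdfFile, false); // false = no export dialog
    merged.close(SaveOptions.NO); // the merged .indd is not needed once the PDF exists
    return linkCount;
}
```

Inside the loop above, the save and close lines would then be replaced by something like mergeRecordToPdf(currDoc, i + 1, File(destFolderPath + (i + 1) + '.pdf')).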
So it turned out the source of the problem was document tags.
Make sure the PDF isn't created as a tagged PDF by InDesign, and it behaves. I don't know what it is about the hyperlinks and tagging that causes it to inflate massively, but it does.
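If the PDFs are exported by script rather than through the export dialog, tagging can be switched off in the export preferences before calling exportFile. This is a minimal sketch under two assumptions: that app.pdfExportPreferences.includeStructure is the scripting counterpart of the "Create Tagged PDF" checkbox, and that no PDF export preset is passed to exportFile (a preset would bring its own settings). The function name exportUntaggedPdf is made up for illustration.

```javascript
// Sketch only. Relies on InDesign's scripting globals (app, ExportFormat) at call time.
// Exports `doc` to `pdfFile` with structure/tagging off and hyperlinks on,
// then restores the tagging preference to whatever it was before.
function exportUntaggedPdf(doc, pdfFile) {
    var prefs = app.pdfExportPreferences;
    var previousStructure = prefs.includeStructure; // remember the current setting
    prefs.includeStructure = false; // assumption: equivalent to unticking "Create Tagged PDF"
    prefs.includeHyperlinks = true; // keep the links clickable in the PDF
    try {
        doc.exportFile(ExportFormat.PDF_TYPE, pdfFile, false); // false = no export dialog
    } finally {
        prefs.includeStructure = previousStructure; // restore even if the export fails
    }
}
```

In InDesign this would be called as, for example, exportUntaggedPdf(app.activeDocument, File('~/Desktop/test.pdf')), where the path is only a placeholder.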