New Participant

Answered

Automatically export PDFs broken into H1, then H2, then H3 sections

May 26, 2023

Hi everyone, our company frequently uses a layout with H1, H2 and H3 sections. Each document gets exported (1) as a whole document, (2) broken up by H1 sections, and (3) broken up by H2 sections, where the H1s have to be deleted (or hidden?) and each H2 has to start a new page automatically. Sometimes it is also exported by H3, where the H1s and H2s both need to be hidden and each H3 starts a new page automatically.

We have hundreds of these documents that all use the same layout. Is there a way to automate this so I could click a button and get PDF exports of:

(1) Whole document

(2) Each H1 section

(3) Each H2 section as a new document break, with the layout automatically reflowing so H2 is at the start of the page and the H1s hidden

(4) Each H3 section as a new document break, with the layout automatically reflowing so H3 is at the start of the page and the H1s and H2s hidden

Bonus if there were a way to make the PDF names automatically incorporate the heading level.

In case that is not clear enough, a simple example would be a document with the following sections:

H1.1

- H2.1

-- H3.1

-- H3.2

- H2.2

-- H3.3

-- H3.4

H1.2

- H2.3

-- H3.5

-- H3.6

- H2.4

-- H3.7

-- H3.8

I am looking to press a button and get the following categories of PDFs:

PDFs broken by H1:

H1.1

- H2.1

-- H3.1

-- H3.2

- H2.2

-- H3.3

-- H3.4

PDFs broken by H2:

- H2.1

-- H3.1

-- H3.2

- H2.2

-- H3.3

-- H3.4

PDFs broken by H3:

-- H3.1

-- H3.2

etc.....

Thank you so much 🙂
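The grouping described above can be sketched as a small pure function. This only illustrates the desired result and is not an InDesign script: the `outline` array and the `splitOutline` name are hypothetical, and the code is plain ES3-style JavaScript so it should also run under ExtendScript.

```javascript
// Hypothetical flat outline of the example document, in reading order.
var outline = [
    { name: 'H1.1', level: 1 },
    { name: 'H2.1', level: 2 },
    { name: 'H3.1', level: 3 },
    { name: 'H3.2', level: 3 },
    { name: 'H2.2', level: 2 },
    { name: 'H3.3', level: 3 },
    { name: 'H3.4', level: 3 },
    { name: 'H1.2', level: 1 },
    { name: 'H2.3', level: 2 },
    { name: 'H3.5', level: 3 },
    { name: 'H3.6', level: 3 },
    { name: 'H2.4', level: 2 },
    { name: 'H3.7', level: 3 },
    { name: 'H3.8', level: 3 }
];

/**
 * Group heading names into sections that each start at a heading of `level`.
 * Headings above that level are left out ("hidden") and close the open section.
 */
function splitOutline(headings, level) {
    var sections = [],
        current = null;
    for (var i = 0; i < headings.length; i++) {
        var h = headings[i];
        if (h.level === level) {
            // a heading at the break level starts a new section (one PDF)
            current = [h.name];
            sections.push(current);
        } else if (h.level < level) {
            // a higher-level heading is hidden and ends the current section
            current = null;
        } else if (current !== null) {
            // a lower-level heading belongs to the section that is open
            current.push(h.name);
        }
    }
    return sections;
}

var byH1 = splitOutline(outline, 1); // 2 sections
var byH2 = splitOutline(outline, 2); // 4 sections, the first is ['H2.1', 'H3.1', 'H3.2']
var byH3 = splitOutline(outline, 3); // 8 sections of one heading each
```

Each inner array corresponds to one exported PDF at that heading level.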


Correct answer by m1b

Hi @default8so4bus0pmvq, I've had a go at writing a script to do what you describe, and it works with your supplied sample file. It is a little complex in places, but hopefully the comments in the code will help you follow it.

There is one important thing you need to change in your document, however: you need to put your text story into the Primary Text Frame. You can set up a Primary Text Frame by going to your master page, selecting the text frame that holds the main story, and then clicking the icon that appears near the top left of the text frame (hover your cursor over the icon to read what it will do). A Primary Text Frame is very useful here because, when the script sets the paragraph style's "startParagraph" setting to Next Page, InDesign will automatically add pages to accommodate the reflowed text.

Here is the script:

/**
 * Break Document By Paragraph Styles
 * Important requirements:
 *   1. Paragraph Styles should be in a single Story.
 *   2. Story is in the document's Primary Text Frame.
 * @author m1b
 * @discussion https://community.adobe.com/t5/indesign-discussions/automatically-export-pdfs-broken-into-h1-then-h2-then-h3-sections/m-p/13822018
 */
function main() {

    var settings = {
        breakStyleNames: ['Heading 1', 'Heading 2', 'Heading 3'],
        pdfPresetName: '[Press Quality]'
    };

    if (app.documents.length == 0) {
        alert('Please open the master document and try again.');
        return;
    }

    var doc = app.activeDocument,
        path = doc.fullName.fsName,
        paragraphStyles = [];

    // get the paragraph styles
    for (var i = 0; i < settings.breakStyleNames.length; i++) {
        var paragraphStyle = getByName(doc.allParagraphStyles, settings.breakStyleNames[i]);
        if (
            paragraphStyle != undefined
            && paragraphStyle.isValid
        )
            paragraphStyles.push(paragraphStyle);
    }

    // we want document to automatically
    // add extra pages when the text overflows
    doc.textPreferences.addPages = AddPageOptions.END_OF_STORY;

    // for each of the paragraph styles:
    for (var i = 0; i < paragraphStyles.length; i++) {

        var paragraphStyle = paragraphStyles[i];

        // save as new document with style name
        var exportPath = path.replace(/(\.[^\.]+)$/, '_' + paragraphStyle.name + '_##_$1');

        // set current paragraph style to start on next page
        paragraphStyle.startParagraph = StartParagraph.NEXT_PAGE;
        doc.recompose();

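        // find the paragraphs
        var found = findParagraphStyle(doc, paragraphStyle);

        // skip styles that are not used anywhere in the document
        // (otherwise found[0] below would be undefined)
        if (found.length == 0)
            continue;

        // remove blank pages from end of document
        var frames = found[0].parentStory.textContainers;
        for (var j = frames.length - 1; j >= 0; j--)
            if (frames[j].contents == '')
                frames[j].parentPage.remove();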

        // save as new document
        doc = doc.save(File(exportPath.replace('_##', '')));

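        // get the pages of the found text (overset text has no frame, so skip it)
        var pages = [];
        for (var f = 0; f < found.length; f++) {
            var textFrames = found[f].parentTextFrames;
            if (textFrames.length > 0 && textFrames[0].parentPage != null && textFrames[0].parentPage.isValid)
                pages.push(textFrames[0].parentPage);
        }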

        if (pages.length == 0)
            continue;

        // for each found page
        for (var p1, p2, p = 1; p <= pages.length; p++) {

            // get the first and last page offset of this range
            p1 = pages[p - 1].documentOffset;
            p2 = (p < pages.length)
                ? pages[p].documentOffset - 1
                : doc.pages[-1].documentOffset;

            // convert to a string of page names
            var rangeString = doc.pages.itemByRange(p1, p2).name;

            // export the document
            exportDocumentAsPDF(doc, exportPath.replace('##', ('0' + p).slice(-2)), settings.pdfPresetName, rangeString);

        }

        // remove all text using this style
        for (var f = found.length - 1; f >= 0; f--)
            found[f].remove();

    }

    doc.close(SaveOptions.NO);

}

app.doScript(main, ScriptLanguage.JAVASCRIPT, undefined, UndoModes.ENTIRE_SCRIPT, 'Break Document By Paragraph Styles');


/**
 * Find text set in a paragraph style.
 * @param {Document|Story|TextFrame} findWhere - the object to search.
 * @param {ParagraphStyle} paragraphStyle - the paragraphStyle to search for.
 * @returns {Array<Text>}
 */
function findParagraphStyle(findWhere, paragraphStyle) {

    if (typeof findWhere.findGrep !== 'function')
        return [];

    app.findGrepPreferences = NothingEnum.NOTHING;
    app.changeGrepPreferences = NothingEnum.NOTHING;
    app.findGrepPreferences.appliedParagraphStyle = paragraphStyle;

    return findWhere.findGrep();

}

/**
 * Returns element of things with matching name.
 * @param {Array<any>} things - the things to search through.
 * @param {String} name - the name to match.
 * @returns {any}
 */
function getByName(things, name) {

    for (var i = 0; i < things.length; i++)
        if (things[i].name == name)
            return things[i];

};


/**
 * Export a document as PDF.
 * @param {Document} doc - an Indesign Document.
 * @param {String} exportPath - the path to the exported file.
 * @param {String} pdfExportPreset - the name of the pdf preset to use.
 * @param {String} pageRange - the range of pages to export.
 */
function exportDocumentAsPDF(doc, exportPath, pdfExportPreset, pageRange) {

    app.pdfExportPreferences.pageRange = String(pageRange);

    exportPath = File(String(exportPath).replace(/\.[^\.]+$/, '.pdf'));

    // create the folder if necessary
    if (!exportPath.parent.exists)
        exportPath.parent.create();

    doc.exportFile(
        ExportFormat.pdfType,
        exportPath,
        false,
        pdfExportPreset
    );

};
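The page-range step in the script is the part most likely to need adjusting, so here is that logic isolated as a pure function. It is a sketch, not part of the original script: `sectionRanges` is a hypothetical name, and the offsets are assumed to be the zero-based `documentOffset` values of the pages on which each heading starts.

```javascript
/**
 * Work out the first and last page offset of each section.
 * @param {Array<Number>} startOffsets - ascending page offsets where sections start.
 * @param {Number} lastOffset - page offset of the document's last page.
 * @returns {Array<Object>} one {first, last} pair per section.
 */
function sectionRanges(startOffsets, lastOffset) {
    var ranges = [];
    for (var i = 0; i < startOffsets.length; i++) {
        var next = i + 1;
        ranges.push({
            first: startOffsets[i],
            // a section ends on the page before the next section starts,
            // and the final section runs to the end of the document
            last: (next < startOffsets.length) ? startOffsets[next] - 1 : lastOffset
        });
    }
    return ranges;
}

// Hypothetical 8-page document (offsets 0 to 7) with sections starting at offsets 0, 3 and 5.
var ranges = sectionRanges([0, 3, 5], 7);
// ranges[0] is {first: 0, last: 2}, ranges[1] is {first: 3, last: 4}, ranges[2] is {first: 5, last: 7}
```

In the script, each such pair is then turned into page names with `doc.pages.itemByRange(p1, p2).name` before being passed to the PDF export.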

Here's what I suggest you do.

1. Put a copy of your "master" document into a new folder (I called it "testing").

2. Open that document.

3. Run the script.

This is what I get after running the script:
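The screenshot from the original post is not reproduced here. As a rough guide to what to expect, the sketch below derives the PDF names from the same naming pattern the script uses. The function name `exportNamesFor` and the path `/testing/sample.indd` are assumptions made for illustration only.

```javascript
/**
 * List the PDF paths produced for one paragraph style,
 * following the naming pattern used in the script.
 * @param {String} docPath - full path of the source document.
 * @param {String} styleName - name of the paragraph style used as the break.
 * @param {Number} sectionCount - how many sections were found for that style.
 * @returns {Array<String>}
 */
function exportNamesFor(docPath, styleName, sectionCount) {
    // insert "_<style name>_##_" before the file extension
    var pattern = docPath.replace(/(\.[^\.]+)$/, '_' + styleName + '_##_$1');
    var names = [];
    for (var p = 1; p <= sectionCount; p++) {
        // two-digit counter, as in the script (only meaningful up to 99 sections)
        var numbered = pattern.replace('##', ('0' + p).slice(-2));
        // swap the document extension for .pdf
        names.push(numbered.replace(/\.[^\.]+$/, '.pdf'));
    }
    return names;
}

var names = exportNamesFor('/testing/sample.indd', 'Heading 2', 3);
// names[0] is '/testing/sample_Heading 2_01_.pdf'
// names[2] is '/testing/sample_Heading 2_03_.pdf'
```

So the heading level ends up in every file name through the paragraph style name, which covers the naming bonus from the question.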

I hope that will get you moving forward with your project. Let me know how it goes.

- Mark

Edit: attached my slightly-modified version of your sample document, so you can see the primary text frame change.

sample_doc.zip

default8so4bus0pmvq (Author)

New Participant

Mark, this is amazing!! You made my day. You have no idea how many hours this will save me. Probably hundreds. Thank you so very much!!!!

Best Regards,

Dave

m1b

Community Expert

You're welcome Dave. Hope it goes well. 🙂

- Mark


Eugene Tyson

Community Expert

Some great advice already here - you can set them as H1 by style names etc., as already pointed out.

Here's a way to export to individual files as you want:

https://colecandoo.com/2013/07/13/breaking-up-is-hard-to-do-indesign-files-into-individual-pdfs-that-is/

default8so4bus0pmvq (Author)

New Participant

Thanks Eugene. Could someone step through the basic steps that would be required for a script like this?

Christopher Shelton

Participating Frequently

I know the script Eugene provided will target the data based on style names. It may be worth looking into separating the data onto their own respective layers and then changing the visibility accordingly.

m1b

Community Expert

Hi @default8so4bus0pmvq, any chance you could post an example document? You can replace the real text with dummy text if you like. Posting a sample InDesign document makes it easier for people to help in cases like this.

- Mark

P.S. It would also be good if the sample document showed how the headings relate to pages. For example, does H1 always start on a new page? Can two H1s (or H2s, etc.) appear on the same page? If so, the script would have to add page breaks to the text.

default8so4bus0pmvq (Author)

New Participant

Hi m1b. Thanks for responding.

Here is a shortened version of a document. It has two H1s as well as H2s and H3s.

H1 automatically breaks on a new page. H2 and H3 do not. And yes, what I have currently been doing is:

1. Whole Document: Export entire document as PDF

2. H1 Sections: Create table of contents with PDF Bookmarks, only including H1 styles. Export document including bookmarks. Open in Acrobat. Split document using bookmark names for file names.

3. H2 Sections: Delete all H1s. Change setting so each H2 starts on a new page. Repeat above step.

4. H3 Sections: Delete all H2s. Repeat above steps for H3.

It feels like this logic should be able to be automated with a script??
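The manual routine above is essentially a loop over the heading styles, which is why it lends itself to scripting. As a minimal sketch of that loop's plan (not an InDesign script; `planPasses` and the style names are assumptions for illustration):

```javascript
/**
 * Describe one export pass per heading style, in order.
 * Each pass breaks the document on one style, after the
 * styles of all earlier passes have been deleted.
 * @param {Array<String>} styleNames - heading styles, highest level first.
 * @returns {Array<Object>}
 */
function planPasses(styleNames) {
    var passes = [];
    for (var i = 0; i < styleNames.length; i++) {
        passes.push({
            breakOn: styleNames[i],                // style that starts each new page
            alreadyRemoved: styleNames.slice(0, i) // styles deleted in earlier passes
        });
    }
    return passes;
}

var passes = planPasses(['Heading 1', 'Heading 2', 'Heading 3']);
// passes[0].alreadyRemoved is empty
// passes[2].breakOn is 'Heading 3' and passes[2].alreadyRemoved is ['Heading 1', 'Heading 2']
```

The whole-document export (step 1) sits outside this loop, since it needs no breaks or deletions.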

Thanks so much...