Inspiring
July 4, 2025
Question

Auto-tag and Bookmarks


What does the Auto-tag API do in terms of generating bookmarks? It seems that some PDFs produce bookmarks based on the structure, and some don't. In Acrobat, you can generate bookmarks from the structure, but you are required to select the structure elements to use. The API doesn't provide a way to specify these elements.

 

Thanks.


    Joel Geraci
    Community Expert
    July 14, 2025

    Auto-Tag doesn't modify the bookmarks at all. 

    You'll have to post-process the PDF using a PDF library tool like PDFBox.

    While PDFBox doesn't have a direct "create bookmarks from tags" function that automatically maps specific tag types (like headings) to bookmarks, you can achieve this programmatically in three steps:
    1. Extracting the Tagged Content: Use the PDFMarkedContentExtractor class in PDFBox. Iterate through the pages, process each page with the extractor, and retrieve the marked content list, which contains the tags and their associated text content.
    2. Identifying the Relevant Tags and Text: Examine the extracted PDMarkedContent objects and pick out the tags that should map to your bookmark hierarchy (e.g., <H1> for top-level bookmarks, <H2> for sub-bookmarks, etc.).
    3. Creating the Bookmark Outline: Use the PDDocumentOutline and PDOutlineItem classes in PDFBox to build the bookmark structure (also called the document outline) from the identified tags and their text. Set the title of each bookmark (typically taken from the tag's content) and the page it should link to.
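The nesting part of steps 2 and 3 can be sketched without PDFBox on the classpath. The block below is a minimal illustration, not a tested Auto-Tag workflow: `HeadingOutlineSketch`, `Heading`, and `Node` are hypothetical names made up for this example. It assumes you have already collected the headings in reading order (for instance from `PDMarkedContent.getTag()` plus the text of each item) and that the tags are literally `H1` to `H6`, which may not hold for every tagged PDF. In real code each `Node` would be a `PDOutlineItem`, with `setTitle(...)` for the title and `addLast(...)` for attaching children.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical helper: nests a flat, reading-order list of headings by level. */
class HeadingOutlineSketch {

    /** One tagged item found in the document (hypothetical holder type). */
    static final class Heading {
        final String tag;     // e.g. "H1", "H2", "P"
        final String title;   // text extracted from the tagged content
        final int pageIndex;  // zero-based page the item appears on

        Heading(String tag, String title, int pageIndex) {
            this.tag = tag;
            this.title = title;
            this.pageIndex = pageIndex;
        }
    }

    /** One bookmark in the tree; stands in for PDFBox's PDOutlineItem. */
    static final class Node {
        final Heading heading;
        final List<Node> children = new ArrayList<>();

        Node(Heading heading) {
            this.heading = heading;
        }
    }

    /** Returns 1..6 for "H1".."H6" and 0 for any other tag. */
    static int level(String tag) {
        if (tag != null && tag.length() == 2 && tag.charAt(0) == 'H') {
            int n = tag.charAt(1) - '0';
            if (n >= 1 && n <= 6) {
                return n;
            }
        }
        return 0;
    }

    /** Builds the bookmark tree; non-heading tags are skipped. */
    static List<Node> build(List<Heading> items) {
        List<Node> roots = new ArrayList<>();
        Deque<Node> open = new ArrayDeque<>(); // chain of currently open ancestors
        for (Heading h : items) {
            int lvl = level(h.tag);
            if (lvl == 0) {
                continue; // not a heading, so no bookmark
            }
            // Close every open heading at the same or a deeper level.
            while (!open.isEmpty() && level(open.peek().heading.tag) >= lvl) {
                open.pop();
            }
            Node node = new Node(h);
            if (open.isEmpty()) {
                roots.add(node);                 // PDFBox: outline.addLast(item)
            } else {
                open.peek().children.add(node);  // PDFBox: parentItem.addLast(item)
            }
            open.push(node);
        }
        return roots;
    }

    /** Prints the tree with two spaces of indentation per level. */
    static void print(List<Node> nodes, int depth) {
        for (Node n : nodes) {
            System.out.println("  ".repeat(depth) + n.heading.title
                    + " (page " + (n.heading.pageIndex + 1) + ")");
            print(n.children, depth + 1);
        }
    }

    public static void main(String[] args) {
        List<Node> roots = build(List.of(
                new Heading("H1", "Introduction", 0),
                new Heading("H2", "Scope", 0),
                new Heading("P", "Body text", 0),
                new Heading("H2", "Terms", 1),
                new Heading("H1", "Results", 2)));
        print(roots, 0);
    }
}
```

A heading that skips a level (an `H3` directly under an `H1`) simply becomes a child of the nearest shallower heading, which keeps the outline valid even when the tagging is uneven. Whether to include levels below `H2` at all is a choice you make when filtering the list before calling `build`.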
    Inspiring
    August 8, 2025

    Thanks Joel

    Inspiring
    July 9, 2025

    Ping!