
Auto-tag and Bookmarks

Explorer, Jul 04, 2025

What does the Auto-tag API do in terms of generating bookmarks? It seems that some PDFs get bookmarks based on the structure and some don't. In Acrobat, you can generate bookmarks from the structure, but you are required to select the structure elements to be used. The API doesn't provide a way to specify these structure elements.

Thanks.

Explorer, Jul 09, 2025

Ping!

Community Expert, Jul 14, 2025

Auto-Tag doesn't modify the bookmarks at all. 

You'll have to post-process the PDF using a PDF library tool like PDFBox.

While PDFBox doesn't have a direct "create bookmarks from tags" function that automatically maps specific tag types (like headings) to bookmarks, you can achieve this programmatically in three steps:
  1. Extracting the Tagged Content: You can use the PDFMarkedContentExtractor class in PDFBox to extract the tagged content from a PDF document. You'll iterate through the pages, process each page with the extractor, and then retrieve the marked content list, which will contain information about the tags and their associated text content.
  2. Identifying the Relevant Tags and Text: After extraction, you'll need to examine the extracted PDMarkedContent objects and identify the tags that should correspond to your desired bookmark hierarchy (e.g., <H1> for top-level bookmarks, <H2> for sub-bookmarks, etc.).
  3. Creating the Bookmark Outline: You can then use the PDDocumentOutline and PDOutlineItem classes in PDFBox to build the bookmark structure (also called the document outline) based on the identified tags and their corresponding text. You'll specify the title of each bookmark (typically extracted from the tag's content) and the page it should link to. 
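
To make the nesting part of step 3 a bit more concrete, here is a minimal sketch in plain Java. It assumes steps 1 and 2 have already produced a flat list of headings in document order. The names Heading, OutlineNode, and OutlineSketch are hypothetical and are not PDFBox classes; OutlineNode only stands in for a PDOutlineItem so the nesting logic can be read (and run) on its own. It also assumes the tag names are literally "H1" through "H6", which may not hold if the PDF uses custom or role-mapped tags.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical holder for one heading found in the tagged content. */
final class Heading {
    final int level;      // 1 for H1, 2 for H2, ...
    final String title;   // text extracted from the tagged content
    final int pageIndex;  // zero-based index of the page the heading is on

    Heading(int level, String title, int pageIndex) {
        this.level = level;
        this.title = title;
        this.pageIndex = pageIndex;
    }
}

/** Hypothetical outline node; a stand-in for a PDFBox PDOutlineItem. */
final class OutlineNode {
    final int level;
    final String title;
    final int pageIndex;
    final List<OutlineNode> children = new ArrayList<>();

    OutlineNode(int level, String title, int pageIndex) {
        this.level = level;
        this.title = title;
        this.pageIndex = pageIndex;
    }
}

final class OutlineSketch {
    /** Maps a tag name such as "H2" to its level; returns -1 for any other tag. */
    static int headingLevel(String tag) {
        if (tag == null || tag.length() != 2 || tag.charAt(0) != 'H') {
            return -1;
        }
        char digit = tag.charAt(1);
        return (digit >= '1' && digit <= '6') ? digit - '0' : -1;
    }

    /** Nests a flat, document-ordered list of headings into a bookmark tree. */
    static List<OutlineNode> buildOutline(List<Heading> headings) {
        List<OutlineNode> roots = new ArrayList<>();
        // Stack of headings that can still receive children, deepest on top.
        Deque<OutlineNode> open = new ArrayDeque<>();
        for (Heading h : headings) {
            OutlineNode node = new OutlineNode(h.level, h.title, h.pageIndex);
            // Close every open heading at the same or a deeper level.
            while (!open.isEmpty() && open.peek().level >= h.level) {
                open.pop();
            }
            if (open.isEmpty()) {
                roots.add(node);               // top-level bookmark
            } else {
                open.peek().children.add(node); // child of nearest shallower heading
            }
            open.push(node);
        }
        return roots;
    }
}
```

With PDFBox, each node would presumably become a PDOutlineItem (setTitle for the text, setDestination with the PDPage at pageIndex), attached with addLast to its parent item or to the PDDocumentOutline, which is then set on the document catalog with setDocumentOutline. Note that a heading that skips a level (an H3 directly after an H1, say) simply ends up under the nearest shallower heading in this sketch.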
Explorer, Aug 08, 2025

Thanks Joel
