Participant
August 17, 2021
Question

How are the cross references stored in the pdf data structure?


I am trying to fetch the URI-like link that is available for hyperlinks pointing to external URLs. For cross references, however, no such URL is available. From what I have read across various forums, my understanding is that a cross reference stores unique IDs and maps two indirect objects within the same PDF. But where are these unique IDs stored? How is a cross reference stored and resolved in the PDF data structure? I am using Python libraries such as PyPDF2 to manipulate the PDF and extract its data structures. Any insight would help me build on PDFs.


1 reply

Legend
August 17, 2021

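Cross-reference tables in a PDF are part of its basic structure: they record where each indirect object is located in the file and are used to find everything inside the PDF. They are not in any way connected to hyperlinks, links from one page to another inside a PDF, or document cross references. If you are reading the PDF Reference, hyperlinks and internal links are Link annotations (listed in a page's Annots array), and their targets are given by Actions (for example a URI or GoTo action) or Dests (destinations).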
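
As a rough illustration of where to look, here is a minimal Python sketch that classifies one annotation dictionary as an external or internal link. The function names describe_link and iter_links are made up for this example. describe_link works on any dict-like object keyed by PDF names, so it can be tried on plain dicts. iter_links assumes the PyPDF2 3.x naming (PdfReader, reader.pages, get_object); older PyPDF2 releases use different names (PdfFileReader, getObject), so adjust for your version.

```python
def describe_link(annot):
    """Classify one annotation dictionary.

    Returns None for non-link annotations, otherwise a (kind, target) tuple.
    Hypothetical helper for illustration, not part of any library.
    """
    if "/Subtype" not in annot or annot["/Subtype"] != "/Link":
        return None
    if "/A" in annot:
        # The link carries an action dictionary; /S names the action type.
        action = annot["/A"]
        kind = action["/S"] if "/S" in action else None
        if kind == "/URI" and "/URI" in action:
            return ("external", action["/URI"])
        if kind == "/GoTo" and "/D" in action:
            return ("internal", action["/D"])
        return ("other", kind)
    if "/Dest" in annot:
        # The link carries a destination directly, with no action.
        return ("internal", annot["/Dest"])
    return ("other", None)


def iter_links(path):
    """Yield (page_index, kind, target) for each link annotation in a PDF.

    Assumes PyPDF2 3.x is installed; imported here so describe_link
    stays usable without it.
    """
    from PyPDF2 import PdfReader

    reader = PdfReader(path)
    for index, page in enumerate(reader.pages):
        if "/Annots" not in page:
            continue
        for ref in page["/Annots"]:
            # Entries are usually indirect references; resolve them first.
            annot = ref.get_object()
            info = describe_link(annot)
            if info is not None:
                yield (index, info[0], info[1])
```

For an internal link, the target is either an array that starts with a reference to the destination page, or a name or string that has to be looked up among the document's named destinations. That is why no URL shows up for a document cross reference.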