Hi all,
I am currently working with PDF files for security research. I manually edited some PDF samples and found that I can change certain fields and the file still renders normally. I modified the number before %%EOF, which I believe is the byte offset of the XREF table. I also modified the last two entries within the XREF table.
The modified version may look like this (the numbers in red are the modified ones):

When I open the modified PDF file, it renders normally. I ran this test on Ubuntu-14.04/PDF-1.4/Adobe-9 and WIN7-64/PDF-1.4/Adobe-8.1.
On Ubuntu, I see just one warning beforehand: "The file is damaged but is being repaired.", and then the PDF renders normally. On WIN7, the modified PDF simply renders normally with no warning. I am curious how this happens. Is it due to a repair mechanism built into Adobe Reader? If so, will PDF files with a corrupted XREF table, or with no XREF table at all, also render normally?
Thanks
Yes, Reader/Acrobat are able to detect and repair some types of errors. The type you're describing can be simple to fix, but it would be a mistake to intentionally distribute documents that you know are damaged. Even if Reader/Acrobat were able to correct bad XREF tables in all cases, there are many PDF viewers/consumers out there that probably can't.
Hi George,
Could you give me a brief explanation of how a PDF reader repairs a file with a corrupted XREF table? Does it extract the logical structure starting from the root object?
Thanks
I don't have the details of how Acrobat/Reader work, but it's not difficult to rebuild an XREF table if you're able to read the file and determine where all of the objects are. The PDF specification (ISO 32000) has more information: PDF Reference and Adobe Extensions to the PDF Specification | Adobe Developer Connection
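To illustrate the general idea of locating objects and rebuilding the table, here is a minimal Python sketch. It is not how Acrobat/Reader actually do it (those details are not public in this thread). The function names `scan_object_offsets` and `build_xref` are made up for this example. It assumes a classic, uncompressed PDF 1.4 style file, and it ignores cases a real repair routine must handle, such as object streams, cross-reference streams, and the text "N G obj" appearing inside stream data.

```python
import re

# Matches the "N G obj" header that starts an indirect object,
# e.g. b"12 0 obj". The lookbehind avoids starting mid-number.
OBJ_HEADER = re.compile(rb"(?<!\d)(\d+)[ \t\r\n]+(\d+)[ \t\r\n]+obj\b")


def scan_object_offsets(data):
    """Map object number -> (byte offset, generation) by scanning the
    raw bytes, without trusting any existing xref table."""
    offsets = {}
    for m in OBJ_HEADER.finditer(data):
        # A later definition overwrites an earlier one, which mirrors
        # how incremental updates supersede older objects.
        offsets[int(m.group(1))] = (m.start(), int(m.group(2)))
    return offsets


def build_xref(offsets):
    """Render a classic xref section with a single subsection that
    starts at object 0. Each entry is 20 bytes including its EOL."""
    size = max(offsets, default=0) + 1
    lines = [b"xref", b"0 %d" % size, b"0000000000 65535 f "]
    for num in range(1, size):
        if num in offsets:
            off, gen = offsets[num]
            lines.append(b"%010d %05d n " % (off, gen))
        else:
            # Simplification (assumption): missing objects are marked
            # free without building a proper free-list chain.
            lines.append(b"0000000000 65535 f ")
    return b"\n".join(lines) + b"\n"


if __name__ == "__main__":
    sample = (b"%PDF-1.4\n"
              b"1 0 obj\n<< /Type /Catalog >>\nendobj\n"
              b"2 0 obj\n<< >>\nendobj\n")
    print(build_xref(scan_object_offsets(sample)).decode("ascii"))
```

A repair routine along these lines would then append the rebuilt table, a trailer dictionary, and a `startxref` line holding the new table's byte offset. Finding the `/Root` entry for that trailer means looking for the object whose dictionary has `/Type /Catalog`, which this sketch leaves out.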