This concerns PDFs produced by the Perl library PDF::Builder, not the Acrobat SDK, but it appears to be a problem with Acrobat Reader. Apologies, and feel free to move this thread to somewhere you feel is more appropriate.
When using the free Acrobat Reader (Windows 10, automatically updated), I see something odd. Acrobat Reader is the only PDF reader I have access to that appears to support "page labels", which are formatted page numbers (as opposed to physical page numbers) shown on the slider thumb and in the page number area. They work great if I have other formats defined, such as /R Roman, /r roman, /A Alpha, and /a alpha, as well as with a /P prefix.
Decimal numbers work OK, but only if other formats are also defined for the document. If I have only /PageLabels << /Nums [ 0 << /S /D >> ] >>, requesting pages to be labeled 1, 2, 3,..., it appears to be ignored: I get the default labels of "Page m of n" on the slider thumb. I do see decimal numbers if I have at least one other format defined: /PageLabels << /Nums [ 0 << /S /r >> 4 << /S /D /St 1 >> ] >> shows pages labeled i, ii, iii, iv, 1, 2, 3,... (this also works with decimal as the format for the first pages).
With just decimals requested, with or without /St (the starting page number to show), it looks like page labels are ignored. Am I using page labels incorrectly here? See section 12.4.2 of the PDF 1.7 specification (ISO 32000-1:2008).
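For reference, here is a rough Python sketch of how I read the labeling rules in section 12.4.2, so the expected labels can be compared against what a viewer shows. The function names and the (start index, dictionary) input format are made up for illustration; this is not code from PDF::Builder or from Adobe. It assumes the ranges are sorted and that the first one starts at page index 0.

```python
def to_roman(n: int) -> str:
    """Uppercase Roman numeral for n >= 1."""
    pairs = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in pairs:
        while n >= value:
            out.append(symbol)
            n -= value
    return "".join(out)


def to_letters(n: int) -> str:
    """A..Z for 1..26, AA..ZZ for 27..52, and so on."""
    letter = chr(ord("A") + (n - 1) % 26)
    return letter * ((n - 1) // 26 + 1)


def page_labels(nums, page_count):
    """Compute the label of every page.

    nums: list of (first_page_index, label_dict) pairs, sorted by index,
          where label_dict may hold the keys "S", "P" and "St".
    """
    if not nums or nums[0][0] != 0:
        raise ValueError("the first labeling range must start at page index 0")
    labels = []
    for page in range(page_count):
        # The applicable range is the last one starting at or before this page.
        start, entry = [r for r in nums if r[0] <= page][-1]
        value = entry.get("St", 1) + (page - start)  # /St defaults to 1
        style = entry.get("S")
        if style == "D":
            numeric = str(value)
        elif style == "R":
            numeric = to_roman(value)
        elif style == "r":
            numeric = to_roman(value).lower()
        elif style == "A":
            numeric = to_letters(value)
        elif style == "a":
            numeric = to_letters(value).lower()
        else:
            numeric = ""  # no /S: the label is the prefix alone
        labels.append(entry.get("P", "") + numeric)
    return labels


# The two-range example from the post: roman for 4 pages, then decimal.
print(page_labels([(0, {"S": "r"}), (4, {"S": "D", "St": 1})], 7))
```

By this reading, a lone `0 << /S /D >>` range should give 1, 2, 3,... with no /St present.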
Oka-a-a-a-ay. If I have an /St entry anywhere, it shows the expected decimal number (1, 2, 3,...). So, << /Nums [ 0 << /S /D >> ] >> shows "Page 1 of 4" etc., while << /Nums [ 0 << /S /D /St 1 >> ] >> shows "1" etc. /St is supposed to be optional, so I would call this a minor bug in Reader, though it's easy enough to work around (add a start value).
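On the producer side, the workaround amounts to always writing /St explicitly. A minimal Python sketch of that idea follows; the helper names `label_dict` and `page_labels_entry` are hypothetical and do not correspond to any PDF::Builder call. The start value defaults to 1, which is the same value the specification assumes when /St is absent.

```python
def label_dict(style=None, prefix=None, start=1):
    """Build one page label dictionary, always including /St."""
    items = []
    if style is not None:
        items.append(f"/S /{style}")
    if prefix is not None:
        # Escape the characters that are special inside a PDF literal string.
        escaped = (prefix.replace("\\", "\\\\")
                         .replace("(", "\\(")
                         .replace(")", "\\)"))
        items.append(f"/P ({escaped})")
    items.append(f"/St {start}")  # written even when it is the default of 1
    return "<< " + " ".join(items) + " >>"


def page_labels_entry(ranges):
    """ranges: list of (first_page_index, options) pairs, sorted by index."""
    nums = " ".join(f"{index} {label_dict(**opts)}" for index, opts in ranges)
    return f"/PageLabels << /Nums [ {nums} ] >>"


print(page_labels_entry([(0, {"style": "D"})]))
```

Since the explicit value equals the default, a conforming viewer should display the same labels either way.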
Are there any free downloadable or online diagnostic tools for PDFs? Doesn't have to be from Adobe, although that would be preferable. Or, are there diagnostics within Reader? There are plenty of tools to "repair" a PDF, but what I need is something to tell me what's wrong with a PDF, so I can fix the software that produced the PDF in the first place. It's useless to have a repair tool simply hand me a "repaired" PDF, because the fix is usually a complete rewrite of the PDF, and I can't tell what it fixed! Plus, if I can diagnose problems myself, I won't have to bother the Community with questions.
In this particular issue, I don't get any error dialog, but in other issues I do.
You can use the PDFDebugger tool that comes with the free (and open-source) Java library PDFBox to examine the contents of the file. It's not a tool that analyzes or fixes files, but it's very handy when you want to locate an issue by comparing a bad file to a good one.
Thank you for the suggestion (PDFBox's PDFDebugger). However, I already have a number of tools to dump/display/examine a PDF. What I'm really in need of is something to analyze a PDF and tell me where there are non-conforming items or other potential issues. For example, warning me about having 'save' and 'restore' commands in a text context -- this is usually OK, but with compression on and certain object sizes, AAR barfs on it. Even with that, there's no guarantee that one or another PDF Reader doesn't have its own set of bugs! (such as here, where /St is apparently required, even though it's supposed to be optional).
There are about a million different things that can go wrong in a PDF file... Creating such a tool would be a monumental task.
While it's certainly not a trivial task, I think it would be possible to at least hit the most common problems. It could check that appropriate objects exist for all object references, that the overall structure seems to be OK (including no duplicate object numbers), that values are within allowable ranges, that no required parameters are missing from an object, that optional parameters are appropriate and not conflicting, that streams contain only the appropriate operators (e.g., no q or Q in a text context), and that no item is for a higher PDF version than what is declared. That ought to take care of a lot of common errors. More edge and corner cases could be added over time. For example, a warning could be given for dead code in a stream (such as a font declaration that is overridden by another font declaration before any text is output).
Compilers do a lot of even more complex checking and validation than this, and have been around for a very long time, so I don't think it's an insurmountable task. To do a good job at this would take a fair amount of work, but would be a valuable asset. It might even help to point out bugs in various Readers (a PDF doesn't display properly, yet passes validation).
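To give an idea of what the simplest of these checks might look like, here is a rough Python sketch of the first two (dangling object references and duplicate object numbers). It is only a regex scan over the raw bytes, not a real parser, and `check_references` is a name invented for this example. It assumes an uncompressed file with no object streams; it will report false duplicates for files with incremental updates (where redefining an object is legitimate) and may pick up false matches inside binary stream data.

```python
import re
from collections import Counter

# "N G obj" starts an object definition; "N G R" is an indirect reference.
OBJ_DEF = re.compile(rb"(?<!\d)(\d+)\s+(\d+)\s+obj\b")
OBJ_REF = re.compile(rb"(?<!\d)(\d+)\s+(\d+)\s+R\b")


def check_references(pdf_bytes: bytes) -> dict:
    """Report duplicate definitions and references to undefined objects."""
    defined = Counter((int(n), int(g)) for n, g in OBJ_DEF.findall(pdf_bytes))
    referenced = {(int(n), int(g)) for n, g in OBJ_REF.findall(pdf_bytes)}
    return {
        "duplicates": sorted(key for key, count in defined.items() if count > 1),
        "dangling": sorted(referenced - set(defined)),
    }


sample = (b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
          b"2 0 obj\n<< /Type /Pages /Kids [ 3 0 R ] /Count 1 >>\nendobj\n"
          b"2 0 obj\n<< >>\nendobj\n")
print(check_references(sample))
```

In the sample, object 2 is defined twice and object 3 is referenced but never defined, so both show up in the report. The other checks (required keys, value ranges, operators allowed in a text context, version conflicts) would need an actual object and content stream parser.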
By the way, I think this page labeling was working in older releases of Adobe Acrobat Reader, but is now failing (as described above) in more recent updates. I no longer have the older Readers, so I can't check for sure.
I hope that Adobe hasn't decided to gradually withdraw features from the free Reader and put them into a paid version. That would be underhanded, and acting like Microsoft. Hopefully it's just an honest bug in newer releases!