Need to parse a PDF to get all objects from its metadata

March 1, 2021
4 replies
840 views

I need to parse the metadata of a given PDF file to count the different types of objects it contains and to extract those objects, for example objects of type "/JavaScript" or "/ObjStm".

Windows

Test Screen Name

Brainiac

The Cos layer is what you get. It can enumerate all actual objects. If this isn't enough for you, Adobe don't have anything else, but there are many PDF libraries out there.

VS_NoviceAuthor

Participating Frequently

Thanks!
I wanted to check whether there is anything other than the Cos layer that can help (I may not be aware of it),
or whether the Acrobat SDK has some added functionality for this compared to the PDFL SDK.

I tried using open-source libraries. They are good, but they give internal logic/number errors for a few malicious PDFs, so I thought this would be the most reliable one to go with.

Test Screen Name

Brainiac

The Cos API gives access to all objects, but not to ObjStm.

VS_NoviceAuthor

Participating Frequently

Yeah, that's one of the cases. I also need to maintain a counter for all kinds of objects, including every "/JS" and "/AA", so I need some sort of parser or enumerator.
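
A minimal sketch of such a counter, not using the Adobe SDK at all: it scans the raw file bytes for name tokens such as "/JS" or "/AA" and counts each one. The function names `endsName` and `countNames` are hypothetical, and the approach is a plain keyword scan rather than a real parser, so it will miss names stored inside compressed object streams and names written with `#xx` hex escapes.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// True if c terminates a PDF name token (PDF whitespace or a delimiter).
static bool endsName(char c) {
    switch (c) {
        case '\0': case '\t': case '\n': case '\f': case '\r': case ' ':
        case '(': case ')': case '<': case '>': case '[': case ']':
        case '{': case '}': case '/': case '%':
            return true;
        default:
            return false;
    }
}

// Counts whole-token occurrences of each name (e.g. "/JS") in the raw bytes.
// Assumption: `data` holds the entire PDF file read in binary mode.
std::map<std::string, std::size_t> countNames(const std::string& data,
                                              const std::vector<std::string>& names) {
    std::map<std::string, std::size_t> counts;
    for (const std::string& name : names) {
        std::size_t n = 0;
        for (std::size_t pos = data.find(name); pos != std::string::npos;
             pos = data.find(name, pos + 1)) {
            const std::size_t end = pos + name.size();
            // Reject longer names that merely start with `name` (e.g. /AAPL for /AA).
            if (end == data.size() || endsName(data[end])) ++n;
        }
        counts[name] = n;
    }
    return counts;
}
```

Calling `countNames(bytes, {"/JS", "/JavaScript", "/AA", "/ObjStm"})` returns one counter per requested name. A non-zero "/ObjStm" count is a hint that further objects are hidden inside compressed streams and were not scanned.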

try67

Adobe Expert

These types of objects are not part of a file's metadata, but of the actual data...

VS_NoviceAuthor

Participating Frequently

Yeah, true, they should be called the structural data of the PDF.
What I am trying to do is extract all the structural objects, categorize them by type, and maintain a counter of the objects in each category.

But I couldn't find the right set of APIs, and I am not even sure whether the SDK offers any such functionality.
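
To make the goal concrete, here is a rough, SDK-independent sketch of categorizing by type: it walks the `N G obj ... endobj` blocks in the raw bytes and tallies each block under the first `/Type` name found in it. All names (`isPdfSpace`, `isPdfDelimiter`, `firstTypeName`, `countByType`) are hypothetical. It rests on simplifying assumptions: the first `/Type` in a block may belong to a nested dictionary, binary stream data could contain the bytes `endobj`, and objects packed inside object streams are not visited.

```cpp
#include <cstddef>
#include <map>
#include <string>

static bool isPdfSpace(char c) {
    return c == '\0' || c == '\t' || c == '\n' || c == '\f' || c == '\r' || c == ' ';
}

static bool isPdfDelimiter(char c) {
    return std::string("()<>[]{}/%").find(c) != std::string::npos;
}

// Returns the first "/Type /Name" value in an object body, or "(none)".
static std::string firstTypeName(const std::string& body) {
    const std::string key = "/Type";
    for (std::size_t pos = body.find(key); pos != std::string::npos;
         pos = body.find(key, pos + 1)) {
        std::size_t i = pos + key.size();
        // Skip longer names such as /Type1 that merely start with /Type.
        if (i < body.size() && !isPdfSpace(body[i]) && body[i] != '/') continue;
        while (i < body.size() && isPdfSpace(body[i])) ++i;
        if (i >= body.size() || body[i] != '/') continue;  // value is not a name
        std::size_t end = i + 1;
        while (end < body.size() && !isPdfSpace(body[end]) && !isPdfDelimiter(body[end])) ++end;
        return body.substr(i, end - i);
    }
    return "(none)";
}

// Tallies indirect objects by their /Type name.
// Assumption: `data` holds the entire PDF file read in binary mode.
std::map<std::string, std::size_t> countByType(const std::string& data) {
    std::map<std::string, std::size_t> counts;
    std::size_t pos = 0;
    while ((pos = data.find(" obj", pos)) != std::string::npos) {
        const std::size_t bodyStart = pos + 4;
        // Ignore words such as " object" that only begin with " obj".
        if (bodyStart < data.size() && !isPdfSpace(data[bodyStart]) &&
            !isPdfDelimiter(data[bodyStart])) {
            pos = bodyStart;
            continue;
        }
        const std::size_t bodyEnd = data.find("endobj", bodyStart);
        if (bodyEnd == std::string::npos) break;  // truncated file: stop counting
        ++counts[firstTypeName(data.substr(bodyStart, bodyEnd - bodyStart))];
        pos = bodyEnd + 6;  // continue after "endobj"
    }
    return counts;
}
```

The result maps each category, such as "/Catalog" or "/ObjStm", to the number of objects found, with untyped objects collected under "(none)".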

VS_NoviceAuthor

Participating Frequently

I am trying to do this with the PDF Library SDK for C++.

Any leads would be really helpful.

Thanks in advance!
