powershell -- using the Acrobat COM object to access metadata

Report · Nov 29, 2016

I have related posts, both titled "powershell -- iterate over a collection of PDFs looking for those with specific info.subject", on this forum and on the TechNet PowerShell forum.

A user on the TechNet forum suggested using the Acrobat COM object and posting here. Is there any documentation about using Acrobat.dll as a COM object (hopefully with examples)?

Again, my goal is to be able to iterate over a collection of PDFs looking for those with specific values stored in the metadata.

I work at a DoD site, so it's very difficult to get any 3rd-party software approved, but if I can use Acrobat.dll as a COM object, hopefully I won't need any 3rd-party software.

As you can see in the screenshot below, someone has figured out how to access PDF metadata from the OS level, presumably without any additional software.

Thanks,

Christian Bahnsen

Report · Nov 29, 2016

The Acrobat SDK has full documentation on the developer interfaces to Acrobat. Some of them might involve acrobat.dll, but they all rely on Acrobat itself to do the work. Note: these interfaces are not available in the free Reader, and they are not meant for use on a server or in a service.
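For what it's worth, the interapplication communication (IAC) part of the SDK documents an automation object, AcroExch.PDDoc, with Open, GetInfo, and Close methods. A minimal sketch of the loop described in the question might look like the following. It assumes full Acrobat is installed and registered on the machine; the folder path and the subject value are hypothetical placeholders, and I have not tested it against any particular Acrobat version.

```powershell
# Assumption: full Acrobat (not the free Reader) is installed and registers AcroExch.PDDoc.
$folder        = 'C:\PDFs'      # hypothetical folder to search
$wantedSubject = 'Quarterly'    # hypothetical Subject value to match

$pdDoc = New-Object -ComObject AcroExch.PDDoc
try {
    Get-ChildItem -LiteralPath $folder -Filter *.pdf -File | ForEach-Object {
        # Open returns true on success, false otherwise.
        if ($pdDoc.Open($_.FullName)) {
            try {
                # GetInfo reads one key of the document information dictionary.
                $subject = $pdDoc.GetInfo('Subject')
            }
            finally {
                [void]$pdDoc.Close()
            }
            # Emit the path of every PDF whose Subject matches.
            if ($subject -eq $wantedSubject) { $_.FullName }
        }
        else {
            Write-Warning "Could not open $($_.FullName)"
        }
    }
}
finally {
    # Release the COM reference so Acrobat can shut down cleanly.
    [void][System.Runtime.InteropServices.Marshal]::ReleaseComObject($pdDoc)
}
```

The same PDDoc object is reused for every file so that Acrobat is only started once. Other info keys such as 'Title', 'Author', or 'Keywords' can be read the same way.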

Report · Nov 30, 2016

I find the SDK to be pretty thin on most topics.

I'm dismayed at the dearth of hard knowledge and practical examples of extracting metadata from Acrobat documents, regardless of the method. I can't be the first person to see the value of being able to select PDFs from a collection based on values stored in metadata. You'd think Adobe would tout this feature/capability and fully support it; I'm not getting that feeling at all. Thanks to those of you giving suggestions, but where's Adobe Tech Support on this? C'mon, Adobe, get with the program!

Report · Nov 30, 2016

The SDK has thousands of pages of documentation, but not many examples. Reading the docs is a must. You might also check out the XMP toolkit for parsing the extracted XML. I believe there may also be methods for metadata in JavaScript, and I know there are for plug-ins in C/C++. There is also the PDF spec if you don't want to use a library (the XMP spec is also needed).
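As a rough illustration of the no-library route, here is a sketch that scans a PDF's raw bytes for an XMP packet and reads dc:description, which is the XMP property that normally mirrors the Info dictionary's Subject. The function name Get-PdfXmpSubject is made up for this example. It assumes the metadata stream is stored uncompressed and unencrypted, that the first packet in the file is the document-level one (embedded images can carry their own packets), and that the value uses the usual rdf:Alt form. None of that is guaranteed for every PDF.

```powershell
function Get-PdfXmpSubject {
    param([Parameter(Mandatory)][string]$Path)

    # Resolve to a full path because .NET file APIs ignore PowerShell's current location.
    $full = (Resolve-Path -LiteralPath $Path).ProviderPath

    # Latin-1 maps every byte to exactly one character, so binary data survives the round trip.
    $latin1 = [System.Text.Encoding]::GetEncoding(28591)
    $raw = $latin1.GetString([System.IO.File]::ReadAllBytes($full))

    # An XMP packet sits between <?xpacket begin=...?> and <?xpacket end=...?>.
    $m = [regex]::Match($raw, '(?s)<\?xpacket begin=.*?\?>(.*?)<\?xpacket end=')
    if (-not $m.Success) { return $null }

    # Assumption: the packet body is UTF-8, so re-decode it before parsing.
    $xmlText = [System.Text.Encoding]::UTF8.GetString($latin1.GetBytes($m.Groups[1].Value))
    $doc = New-Object System.Xml.XmlDocument
    $doc.LoadXml($xmlText.Trim())

    $ns = New-Object System.Xml.XmlNamespaceManager($doc.NameTable)
    $ns.AddNamespace('rdf', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#')
    $ns.AddNamespace('dc', 'http://purl.org/dc/elements/1.1/')

    # Take the first language alternative of dc:description, if present.
    $node = $doc.SelectSingleNode('//dc:description/rdf:Alt/rdf:li', $ns)
    if ($node) { $node.InnerText } else { $null }
}
```

Usage would be along the lines of `Get-ChildItem *.pdf | Where-Object { (Get-PdfXmpSubject $_.FullName) -eq 'some value' }`. Files where the packet is hidden inside a compressed or encrypted stream simply return nothing, so treat an empty result as "unknown" rather than "no subject".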

If you want tech support as a programmer, a developer case is $200 I think. If you find out how to raise a case let us know.
