
Extracting Metadata from Layers within PDF

Community Beginner, Feb 07, 2024

I am trying to extract data from the layers of a PDF drawing into a usable format such as JSON or XML. The PDFs are CAD exports of drawings. In PDF XChange Editor I can inspect the individual layers, each of which corresponds to a reference in the CAD model (Building Outline, Fence, Hedge, etc.). I want to extract all of the metadata for each layer (Fill Color, Opacity, Stroke Color, Stroke Opacity, Border Width, etc.) and output it as JSON or XML. Is this possible with the Adobe PDF Extract API, or with any other API or service? Thanks.

TOPICS
General , How to , PDF Extract API , Python SDK , REST APIs

Views

528


2 Correct answers

Community Expert, Feb 07, 2024

The current version of Extract doesn't provide any information about layers at all, and we only provide the properties of vectors when they are used to represent table cells.

You'll need a PDF library to do this, and even then you'll need to know a lot about the PDF drawing instructions to get the information you need. It's a non-trivial task.
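
To give a feel for what "PDF drawing instructions" means here, the following is a minimal sketch in plain Python, not a working extractor. It assumes a hypothetical, already-decompressed content stream (SAMPLE_STREAM is invented for illustration) in which layer content is wrapped in /OC ... BDC / EMC marked-content blocks, colors are set with the rg and RG operators, and line width with w. The function name and output keys are also made up. In a real file the /MC0-style name still has to be mapped to the layer's display name through the page's /Resources /Properties dictionary, opacity lives in an ExtGState referenced by the gs operator, and strings, arrays, inline dictionaries and other color spaces would all need handling that this sketch omits.

```python
import json

# Hypothetical, already-decompressed content stream; not taken from a real file.
SAMPLE_STREAM = """
/OC /MC0 BDC
1 0 0 rg
0 0 1 RG
2.5 w
10 10 100 50 re
B
EMC
/OC /MC1 BDC
0 1 0 RG
0.5 w
0 0 m
50 50 l
S
EMC
"""

# Path-painting operators that fill and/or stroke the current path.
PAINT_OPS = {"S", "s", "f", "F", "f*", "B", "B*", "b", "b*"}


def is_operand(tok):
    """Treat names (/Foo) and numbers as operands; everything else is an operator."""
    if tok.startswith("/"):
        return True
    try:
        float(tok)
        return True
    except ValueError:
        return False


def styles_by_marked_content(stream):
    """Record the graphics state in effect at each paint operator, per /OC block."""
    operands, tag_stack, saved, found = [], [], [], []
    # None means "not set in this stream"; line width defaults to 1.0 in PDF.
    state = {"fill_rgb": None, "stroke_rgb": None, "line_width": 1.0}
    for tok in stream.split():  # naive whitespace tokenizer (assumption)
        if is_operand(tok):
            operands.append(tok)
            continue
        if tok == "BDC":
            is_oc = len(operands) == 2 and operands[0] == "/OC"
            tag_stack.append(operands[1][1:] if is_oc else None)
        elif tok == "BMC":
            tag_stack.append(None)
        elif tok == "EMC" and tag_stack:
            tag_stack.pop()
        elif tok == "q":  # save graphics state
            saved.append(dict(state))
        elif tok == "Q" and saved:  # restore graphics state
            state = saved.pop()
        elif tok == "rg":  # fill color, DeviceRGB
            state["fill_rgb"] = [float(x) for x in operands[-3:]]
        elif tok == "RG":  # stroke color, DeviceRGB
            state["stroke_rgb"] = [float(x) for x in operands[-3:]]
        elif tok == "w":  # line width
            state["line_width"] = float(operands[-1])
        elif tok in PAINT_OPS:
            tag = next((t for t in reversed(tag_stack) if t), None)
            found.append({"oc_property": tag, "paint_op": tok, **state})
        operands = []  # operands belong to the operator just handled
    return found


if __name__ == "__main__":
    print(json.dumps(styles_by_marked_content(SAMPLE_STREAM), indent=2))
```

Note that the graphics state carries over between blocks unless q/Q brackets it, which is one reason the values cannot simply be read off per layer without tracking state through the whole stream.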

Community Beginner, Feb 15, 2024

For anyone grappling with the same issue, there is a very informative post on Stack Overflow, linked below. I would also recommend looking into the Python library PyMuPDF.

python - Extract Geometry Elements from PDF by OCG (by Layer) - Stack Overflow 
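
As a rough illustration of the PyMuPDF route, here is a minimal sketch, assuming a PyMuPDF version (1.22.0 or later) in which each path dictionary returned by Page.get_drawings() carries a "layer" key naming its optional content group. The helper names group_by_layer and extract_layer_styles, the selection of keys, and the "(no layer)" label are choices made for this example, not part of the library or the linked post. The grouping step works on plain dictionaries, so it can be tried without a PDF at hand.

```python
import json
from collections import defaultdict

# Style-related keys of a path dict as returned by PyMuPDF's Page.get_drawings().
STYLE_KEYS = ("type", "fill", "color", "fill_opacity", "stroke_opacity", "width")


def group_by_layer(drawings):
    """Group path dicts by their "layer" value, keeping only JSON-friendly fields."""
    layers = defaultdict(list)
    for path in drawings:
        entry = {key: path.get(key) for key in STYLE_KEYS}
        rect = path.get("rect")  # a Rect in PyMuPDF; any 4-number sequence works here
        entry["rect"] = [round(v, 2) for v in rect] if rect is not None else None
        layers[path.get("layer") or "(no layer)"].append(entry)
    return dict(layers)


def extract_layer_styles(pdf_path):
    """Return {page label: {layer name: [path styles]}} for a PDF file."""
    import fitz  # PyMuPDF; imported here so group_by_layer works without it

    report = {}
    with fitz.open(pdf_path) as doc:
        for page in doc:
            report[f"page_{page.number + 1}"] = group_by_layer(page.get_drawings())
    return report


if __name__ == "__main__":
    # Hypothetical path dicts shaped like get_drawings() output (values invented).
    sample = [
        {"layer": "Fence", "type": "s", "fill": None, "color": (0.0, 0.0, 1.0),
         "fill_opacity": None, "stroke_opacity": 1.0, "width": 0.5,
         "rect": (0, 0, 50, 50)},
        {"layer": "", "type": "f", "fill": (1.0, 0.0, 0.0), "color": None,
         "fill_opacity": 0.4, "stroke_opacity": None, "width": None,
         "rect": (10, 10, 110, 60)},
    ]
    print(json.dumps(group_by_layer(sample), indent=2))
    # With a real file ("drawing.pdf" is a placeholder name):
    # print(json.dumps(extract_layer_styles("drawing.pdf"), indent=2))
```

Here "color" is the stroke color and "fill" the fill color, both as RGB tuples in the 0 to 1 range, so the result maps fairly directly onto the Fill Color, Stroke Color, opacity and border width fields asked about in the original question.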

Adobe Employee, Feb 07, 2024

Community Beginner, Feb 08, 2024

Thanks for your response. However, this tool does not provide the required granularity.

Community Beginner, Feb 08, 2024

Thanks for your response.
