
Can I programmatically add tags to a normal PDF and convert it to an accessible PDF?

New Here, Mar 12, 2021

Hi,

 

I am working on PDF remediation. I have normal PDFs, and I am thinking of writing a script that reads a normal PDF, identifies content such as headers, subheaders, lists, forms, tables, and images, adds tags to that content accordingly, and generates a tagged PDF that will pass the Adobe accessibility check. My idea is to reduce manual tagging effort (in Adobe Acrobat Pro DC) by at least 60 to 70%.

Are there SDKs that support adding tags programmatically to a normal PDF?

 

Thanks in advance

TOPICS
Edit and convert PDFs , How to , Standards and accessibility

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
LEGEND, Mar 12, 2021

It's not impossible. However, it requires both C++ programming skills and a very deep knowledge of PDF internals: the graphics model, the text model and the tagging model, which all interact. If you have that (or the time to study) you can use the PDSEdit layer in a custom plug-in.

Bear in mind that identifying "headers, sub headers, lists, forms, tables" is all guesswork. These things are not marked in a different way, pre-tagging. A table is a mixture of lines and text which the human eye quickly recognises as having patterns that make it a table. If you are working with highly standardised documents this is much easier.
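
To make the "guesswork" concrete, here is a minimal sketch of one common heuristic: treat the font size that covers the most characters as body text, and guess that anything clearly larger is a heading. The `Span` type, the `guess_roles` function, and the 1.15 ratio are all assumptions made for illustration; the sketch also assumes some extractor has already produced text runs with their font sizes. It says nothing about tables or lists, which need positional analysis.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Span:
    """One run of text as a hypothetical extractor might report it."""
    text: str
    font_size: float


def guess_roles(spans, ratio=1.15):
    """Guess a structure role (H1..H6 or P) for each span from font size alone."""
    if not spans:
        return []

    # Assume the font size covering the most characters is the body size.
    weight = Counter()
    for s in spans:
        weight[s.font_size] += len(s.text)
    body_size = weight.most_common(1)[0][0]

    # Sizes clearly larger than body text become heading levels, largest first.
    heading_sizes = sorted(
        {s.font_size for s in spans if s.font_size >= body_size * ratio},
        reverse=True,
    )
    level = {size: i + 1 for i, size in enumerate(heading_sizes)}

    roles = []
    for s in spans:
        if s.font_size in level:
            roles.append(f"H{min(level[s.font_size], 6)}")
        else:
            roles.append("P")
    return roles


sample = [
    Span("Annual Report", 24.0),
    Span("Introduction", 16.0),
    Span("This is body text that runs long enough to dominate.", 11.0),
    Span("Methods", 16.0),
    Span("More body text follows here in the same size.", 11.0),
]
roles = guess_roles(sample)
```

A heuristic like this works tolerably on standardised documents and breaks quickly elsewhere (pull quotes, captions, large decorative text), which is why the output should be treated as a suggestion for a human to review.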

 

By the way, Adobe's accessibility checker is not considered the industry standard for good accessibility; if you go to this trouble you should probably aim higher.

Community Beginner, Jun 01, 2023

Why C++ specifically as the programming language to interact with or build a PDF document?

I interpret this question as asking: how, technically, is the document markup model that PDF supports represented in the format? Has anyone actually published guidance on this? We live in a world where PDF is an afterthought format to more robust data-modeling approaches. How can we port those models into a PDF-friendly namespace programmatically?
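
For what it's worth, the representation itself is documented in the PDF specification (ISO 32000-1, the sections on logical structure and tagged PDF). In rough terms: page content is wrapped in marked-content sequences that carry an integer MCID, and a separate structure tree, reached from the catalog's `/StructTreeRoot` entry, holds structure elements whose `/K` entry points back at those MCIDs. The sketch below only prints text fragments in that shape. The helper names and object numbers are made up for illustration, and it leaves out pieces a real tagged file needs (the `/ParentTree`, the catalog's `/MarkInfo`, cross-reference data), so it is not a valid PDF.

```python
def marked_content(tag, mcid, operators):
    """Wrap content-stream operators in a marked-content sequence carrying an MCID."""
    return f"/{tag} <</MCID {mcid}>> BDC\n{operators}\nEMC"


def struct_elem(obj_num, tag, parent_num, page_num, mcid):
    """Render one structure element dictionary as an indirect object (text only)."""
    return (
        f"{obj_num} 0 obj\n"
        f"<< /Type /StructElem /S /{tag} /P {parent_num} 0 R "
        f"/Pg {page_num} 0 R /K {mcid} >>\n"
        "endobj"
    )


# Ordinary text-drawing operators; the font name and positions are placeholders.
heading_ops = "BT /F1 24 Tf 72 720 Td (Annual Report) Tj ET"
body_ops = "BT /F1 11 Tf 72 690 Td (This is body text.) Tj ET"

# Page content stream: the same drawing operators, now tagged with MCIDs 0 and 1.
content_stream = "\n".join([
    marked_content("H1", 0, heading_ops),
    marked_content("P", 1, body_ops),
])

# Structure tree: object 10 is the root, objects 11 and 12 are its children,
# and object 3 stands in for the page object. All numbers are arbitrary.
struct_root = "10 0 obj\n<< /Type /StructTreeRoot /K [11 0 R 12 0 R] >>\nendobj"
struct_objects = "\n".join([
    struct_elem(11, "H1", 10, 3, 0),
    struct_elem(12, "P", 10, 3, 1),
])

print(content_stream)
print(struct_root)
print(struct_objects)
```

The hard part of remediation is not emitting these dictionaries but deciding which operators belong to which element, and rewriting existing content streams without disturbing the graphics state.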

 

This is just a personal opinion, but it is shocking that even Adobe hasn't provided more transparent, open-source ways for programmers to have their content generation tools output PDF in a way that retains the structure and semantics of the content (not just its visual layout).

 

Of course this matters for reasons of access, but as developers, data engineers, and even enthusiasts, we should be nagging the heck out of these technical gatekeepers.

Community Expert, Mar 13, 2021

This function already exists in Acrobat Pro; there is no need to reinvent the wheel.

As explained above, though, automation can do a lot, but a human still has to polish the job.

New Here, Jul 06, 2022

We generate PDFs dynamically, so this is definitely something I would love to be able to do. It doesn't sound like a reasonable option, though.

New Here, Apr 06, 2023

Did you find any solutions?

 

 

Community Beginner, Jun 01, 2023

The problem with accepting Acrobat as a solution is chiefly that it makes this a pay-to-play/optimize format. It also suggests that Adobe is the only "vendor" who can build a UI/UX to enable such needed improvements to PDF files.

The initial question is a real developer's question. If we actually demystified the programmatic process for doing what has already been done (to your point about Adobe's "overlay" method), we would bring about real change in the quality of PDF as a format for end users.
