Participant
January 15, 2025
Question

How to preserve alternate text from the original PDF in the HTML document?

I am exporting a tagged PDF to HTML using Adobe Acrobat, and the export itself works as expected: the resulting HTML stores the document's images in a separate folder. The problem is the alt attribute. Every <img> tag gets the same alt text: alt="image".

<img width="718" height="403" alt="image" src="mytest_files/Image_001.png" />
<img width="718" height="403" alt="image" src="mytest_files/Image_002.png" />

Why doesn't the converted HTML retain the original alternate text from the PDF? Is there a way to preserve the alternate text in the exported document?

2 replies

AnandSri
Legend
January 30, 2025

Hello,

I hope you're doing well. We apologize for the delayed response and the trouble.

Preserving alternate text (alt text) from a PDF when converting it to an HTML document can be challenging, as not all conversion tools retain this information. Alt text is crucial for accessibility, providing descriptions of images and other non-text elements for screen readers.

Steps to Preserve Alt Text During PDF to HTML Conversion:

  1. Ensure Proper Tagging in the Original PDF:

    • Before conversion, confirm that your PDF is correctly tagged and includes alt text for all relevant elements. Adobe Acrobat allows you to add and edit alt text using the Tags panel. For detailed instructions, refer to Adobe's guide on editing document structure with the Content and Tags panels.
  2. Use Adobe Acrobat's Export Feature:

    • Adobe Acrobat provides an export function that can convert PDFs to HTML while retaining tags and alt text. To do this:
      • Open your PDF in Adobe Acrobat.
      • Navigate to File > Export To > HTML Web Page.
      • In the export settings, make sure any option to retain tags is selected, if your version of Acrobat offers one.
    • For more information on exporting PDFs to other file formats, see Adobe's article on converting or exporting PDFs.
  3. Verify Alt Text in the HTML Output:

    • After conversion, review the HTML file to confirm that the alt text has been preserved. Open the file in a text editor, or inspect it in a web browser, and check that each <img> tag carries the expected alt attribute rather than a generic placeholder.
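
One way to run this check without reading the markup by eye is a short script. The sketch below uses Python's standard html.parser module to list every <img> tag whose alt attribute is missing or is a generic placeholder such as alt="image". The function name audit_alt_text and the placeholder list are illustrative assumptions, not part of Acrobat or its export.

```python
from html.parser import HTMLParser

# Assumed list of generic values that carry no real description.
PLACEHOLDER_ALTS = {"image", "picture", "graphic", "photo"}


class ImgAltAuditor(HTMLParser):
    """Collects (src, problem) pairs for <img> tags with unusable alt text."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img ... /> are also routed here.
        if tag != "img":
            return
        attr_map = dict(attrs)
        src = attr_map.get("src", "")
        alt = attr_map.get("alt")
        if alt is None:
            self.findings.append((src, "missing alt"))
        elif alt.strip().lower() in PLACEHOLDER_ALTS:
            self.findings.append((src, f"placeholder alt: {alt!r}"))


def audit_alt_text(html_text):
    """Return a list of (src, problem) tuples for the given HTML string."""
    auditor = ImgAltAuditor()
    auditor.feed(html_text)
    auditor.close()
    return auditor.findings
```

Run against the two <img> lines from the question, this would report both images as placeholders. An empty alt="" is deliberately not flagged, since that is valid markup for purely decorative images.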

Additional Considerations:

  • Limitations of Conversion Tools:

    • Be aware that not all PDF to HTML conversion tools retain alt text. Using Adobe Acrobat's built-in export feature increases the likelihood of preserving this information.
  • Manual Adjustments:

    • If the alt text is not preserved during conversion, you may need to manually add it to the HTML file. This involves editing the HTML code to include appropriate alt attributes for each image tag.
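
If the alt text has to be restored by hand, a small script can apply the corrections from a lookup table instead of editing each tag separately. This is a minimal sketch, assuming the double-quoted attribute style shown in the exported markup in the question. The names apply_alt_text and alt_by_src are hypothetical, and the descriptions themselves must be supplied by someone who knows the images. A regular expression is used for brevity; it is not a general-purpose HTML rewriter.

```python
import html
import re

# Matches a whole <img ...> tag; assumes no ">" inside attribute values.
IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)
# The lookbehind keeps attributes like data-src / data-alt from matching.
SRC_ATTR = re.compile(r'(?<![\w-])src\s*=\s*"([^"]*)"', re.IGNORECASE)
ALT_ATTR = re.compile(r'(?<![\w-])alt\s*=\s*"[^"]*"', re.IGNORECASE)


def apply_alt_text(html_text, alt_by_src):
    """Set alt on each <img> whose src is a key in alt_by_src.

    alt_by_src maps an image path (as written in the HTML) to its description.
    Tags whose src is not in the mapping are left untouched.
    """

    def fix_tag(match):
        tag = match.group(0)
        src = SRC_ATTR.search(tag)
        if not src or src.group(1) not in alt_by_src:
            return tag
        description = html.escape(alt_by_src[src.group(1)], quote=True)
        new_alt = f'alt="{description}"'
        if ALT_ATTR.search(tag):
            # Replace the existing alt attribute in place.
            return ALT_ATTR.sub(lambda _: new_alt, tag, count=1)
        # No alt attribute yet: insert one right after "<img".
        return tag[:4] + " " + new_alt + tag[4:]

    return IMG_TAG.sub(fix_tag, html_text)
```

The mapping would be keyed by the paths in the export, for example "mytest_files/Image_001.png", and the result written back to the HTML file once it has been reviewed.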

For more information, see this article: https://adobe.ly/4hllAuw

I hope this helps.

Thanks,

Anand Sri.

jane-e
Community Expert
January 15, 2025

@dpbhatt

I've moved your post from Using the Community to the Acrobat forum.

Jane

jane-e
Community Expert
January 30, 2025

All the <img> tags have the same alt text: alt="image".

<img width="718" height="403" alt="image" src="mytest_files/Image_001.png" />
<img width="718" height="403" alt="image" src="mytest_files/Image_002.png" />
By @dpbhatt

In addition, alternative text should accurately describe the content or purpose of the image for people who cannot see it. A generic alt="image" is useless to those who need it.

Jane