New Participant
June 29, 2023
Question

Retrieve text and image alt-text for read-aloud feature


Hi guys

I am currently working on a project that requires the implementation of a read-aloud feature for PDF documents. The PDFs I'm dealing with include images with alt-text. My goal is to extract both the text and alt-text from the PDF while maintaining the correct reading order to enable the read-aloud functionality.

To accomplish this, I would appreciate your guidance on the following:
1. Extracting the text from the PDF while preserving the reading order that I've set using Adobe Acrobat.
2. Extracting the alt-text associated with the images in the PDF, also following the correct reading order.
3. Combining the extracted text and alt-text in the right order, as set with Acrobat's accessibility tools, to generate the content for a text-to-speech system.
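
To make step 3 concrete, here is a minimal Java sketch of the combining stage only. It assumes the text runs and image alt-texts have already been extracted into a single list that follows the reading order (in a tagged PDF that order comes from the structure tree, and alt-text is typically stored on the figure's structure element). The `ReadAloudScript` class, its `Item` type, and the spoken "Image:" prefix are hypothetical names chosen for illustration; they are not part of any Adobe SDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Sketch: build a read-aloud script from content items that are already
 * in reading order. All names here are hypothetical, not from any Adobe SDK.
 */
final class ReadAloudScript {

    /** One extracted item: a run of body text, or an image with alt-text. */
    static final class Item {
        final boolean isImage;
        final String content; // body text, or the alt-text for an image

        private Item(boolean isImage, String content) {
            this.isImage = isImage;
            this.content = content;
        }

        static Item text(String text) { return new Item(false, text); }
        static Item image(String altText) { return new Item(true, altText); }
    }

    /** Joins items in list order, announcing images and skipping empty entries. */
    static String build(List<Item> itemsInReadingOrder) {
        List<String> parts = new ArrayList<>();
        for (Item item : itemsInReadingOrder) {
            String content = item.content == null ? "" : item.content.trim();
            if (content.isEmpty()) {
                continue; // e.g. a decorative image with no alt-text
            }
            parts.add(item.isImage ? "Image: " + content + "." : content);
        }
        return String.join(" ", parts);
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
            Item.text("Quarterly sales rose."),
            Item.image("Bar chart of sales by quarter"),
            Item.text("Growth was strongest in Q4."));
        // Prints the three items as one string, in the order given.
        System.out.println(build(items));
    }
}
```

The string returned by `build` could then be passed to whatever text-to-speech engine the project uses. Keeping text and images in one ordered list means the reading order is decided once, at extraction time, and the combining step never has to reorder anything.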
 
Regards

1 reply

Brainiac
June 29, 2023

I may be wrong, but I believe Acrobat and Reader conform to the system's accessibility APIs. See https://opensource.adobe.com/dc-acrobat-sdk-docs/acrobatsdk/pdfs/acrobatsdk_access.pdf

New Participant
June 29, 2023

Hi Test Screen Name

Sorry, I am using pdfservice-sdk for Java. Could you please give me more detailed advice? 

Thanks 

Brainiac
June 29, 2023

No, I don't think so. That's not an Adobe product.