Question
Retrieve text and image alt-text for read-aloud feature
Hi guys
I am working on a Java project that needs a read-aloud feature for PDF documents. The PDFs include images with alt-text. My goal is to extract both the text and the alt-text from each PDF in the correct reading order so the result can be read aloud.
I would appreciate guidance on the following:
1. Extracting the text from the PDF while preserving the reading order that I set in Adobe Acrobat.
2. Extracting the alt-text associated with the images in the PDF, in that same reading order.
3. Combining the extracted text and alt-text in the order defined by the accessibility tags, to generate the input for a text-to-speech system.
I am using pdfservice-sdk.
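To make the goal concrete, here is a minimal sketch of the output I am after. It assumes the PDF's tag tree has already been read into a simple in-memory tree and that the order of the tags is the reading order I set in Acrobat. The `Node` class, its fields, and `toSpeechText` are hypothetical names I made up for illustration; none of them come from the SDK, and the part I am missing is how to build such a tree from the PDF.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a tagged PDF's structure tree.
// None of these types come from the SDK; they only illustrate the desired result.
class ReadAloudSketch {

    /** One node of the tag tree, for example Document, H1, P, or Figure. */
    static final class Node {
        final String role;     // structure type, e.g. "P" or "Figure"
        final String text;     // text content of this node, or null
        final String altText;  // alternate description, or null
        final List<Node> kids = new ArrayList<>();

        Node(String role, String text, String altText) {
            this.role = role;
            this.text = text;
            this.altText = altText;
        }

        Node add(Node child) {
            kids.add(child);
            return this;
        }
    }

    /** Depth-first walk in tag order, collecting one spoken line per text node or figure. */
    static void collect(Node node, List<String> out) {
        if ("Figure".equals(node.role)) {
            // For a figure, the alt-text stands in for the image content.
            if (node.altText != null && !node.altText.trim().isEmpty()) {
                out.add("Image: " + node.altText.trim());
            }
            return;
        }
        if (node.text != null && !node.text.trim().isEmpty()) {
            out.add(node.text.trim());
        }
        for (Node kid : node.kids) {
            collect(kid, out);
        }
    }

    /** Joins the collected lines into the string handed to the text-to-speech system. */
    static String toSpeechText(Node root) {
        List<String> parts = new ArrayList<>();
        collect(root, parts);
        return String.join("\n", parts);
    }

    public static void main(String[] args) {
        Node root = new Node("Document", null, null)
                .add(new Node("H1", "Annual report", null))
                .add(new Node("Figure", null, "Bar chart of sales by quarter"))
                .add(new Node("P", "Sales rose in every quarter.", null));
        System.out.println(toSpeechText(root));
    }
}
```

For the sample tree in `main`, the heading should be spoken first, then the image description, then the paragraph, which is the interleaving I want for real documents.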
Regards
