How to change text within a tag in a scanned document

Report · Nov 04, 2021

I have an example scanned document that I'm trying to make accessible. I can fix all the normal accessibility issues; however, Acrobat seems to create broken text within the tags, and that broken text is what a screen reader reads in testing. I'm assuming this is down to the font used, which is confusing Acrobat. I've tried right-clicking on the tag, clicking Properties and entering the correct text in the "Actual text" input, but the end result is the same: the broken text is what is read by a screen reader. I know I could edit the text like any normal text, but I don't have that font installed.

Is there no way to change the text within the tag, without visually changing the text, so that a screen reader reads the corrected version?

See attached screenshot for the broken text in the tag vs. the visual text.

Thanks in advance!

Report · Nov 04, 2021

Hi,

Are you able to share the document as that would help us investigate the problem?

Report · Nov 05, 2021

Sure, please see an example of the issue in this test document attached.

Report · Nov 09, 2021

Hi,

I was able to make the changes that you were trying to make (see the attached document to check I have done it correctly), but while doing so I noticed that the "/ActualText" entry was not always added to the tag. I found that opening the tag using the Edit tag button to check whether the entry had been added, and repeating the change if it had not, seemed to fix the problem.
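If you would rather check for the entry outside Acrobat, here is a minimal Python sketch of the same check. It is only an illustration under some assumptions: the function name `find_actual_text` is made up for this example, it only looks for "/ActualText" followed by a simple literal string, and it only sees uncompressed data. Files whose structure tree is stored in compressed object streams, or whose text is stored as a hex or UTF-16 string, will show no matches, and a proper PDF library would be needed for those.

```python
import re

# Matches a dictionary entry of the form "/ActualText (literal string)".
# Assumption: the value is a simple literal string; escaped characters such as
# "\(" are tolerated, but unescaped nested parentheses are not handled.
ACTUAL_TEXT_RE = re.compile(rb"/ActualText\s*\(((?:\\.|[^\\()])*)\)")


def find_actual_text(pdf_bytes: bytes) -> list:
    """Return the /ActualText literal strings found in uncompressed PDF data."""
    results = []
    for match in ACTUAL_TEXT_RE.finditer(pdf_bytes):
        raw = match.group(1)
        # Undo the escaping of parentheses and backslashes only.
        raw = re.sub(rb"\\([()\\])", rb"\1", raw)
        results.append(raw.decode("latin-1"))
    return results


# Hypothetical structure element, written the way it might appear in a file.
sample = b"<< /Type /StructElem /S /H2 /ActualText (Ye olde list of things) /K 1 >>"
print(find_actual_text(sample))
```

For a real file you would pass in the file contents, for example `find_actual_text(open("test.pdf", "rb").read())`, where the file name is a placeholder. An empty result does not prove the entry is missing, for the reasons given above, so the Edit tag check in Acrobat remains the more reliable test.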

Report · Nov 09, 2021

Hi there, many thanks for looking; however, the issue is still present. Despite the tag having the correct "actual text" set, it's still read incorrectly by screen readers.

For example:

1. The H1 of the page shows the content "<Ye Ofde <Printed (J)ocument" rather than the actual text (see my image in the original post).

2. When I read the PDF in a screen reader (tested in Mac VoiceOver and Windows NVDA) the heading is read out as "Less Ye Of de less printed bracket J ocument" rather than the value of the actual text input/property.

The same thing also happens with the H2 on the page, which is read by screen readers as "Ye olae fistof things" despite having the actual text set as "Ye olde list of things".

See attached screenshot showing Mac VoiceOver output when reading the H1. Setting Actual text doesn't seem to correct the misunderstood text.

I guess my question is: when OCR misinterprets text, is there no way to correct it for screen readers other than re-typing the text (a problem when you may not have the original font)?

Report · Nov 09, 2021

Actually, this appears to be a Mac VoiceOver bug. I've just tested your updated document again, taking care to make sure the actual text was actually added to the tag, and it reads as expected in Windows NVDA and JAWS: the screen reader reads the actual text rather than the misunderstood scanned text.

However, on a Mac, when using VoiceOver to read the PDF in either Adobe Acrobat Reader or Preview, the actual text is not read out; the misunderstood scanned text is read out instead.

Thanks for your help with this!

Report · Nov 10, 2021

Hi,

Interesting, as I am on OS X and the actual text is read for me. Which OS X version are you testing on?

Report · Dec 01, 2021

10.14.6

Report · Dec 06, 2021

Hi,

I am on 11.6. I will see if I can get access to a 10.14.6 machine to test this on.