Acrobat Pro 2020.005.30467 (had an update this week)
Windows 10 Pro 19045.2728
Experience: Just a user; no programming.
I finalize reports that need to pass the basic 508 requirement (not for the web).
With these reports, I usually have to insert a scanned signature page (scanned by various people using various scanning equipment), and then OCR it so that I can apply tags. The signature page is prepared with a wet (not digital) signature. Most of the time after applying OCR, the signatures get tagged as figures with a description that they are approvals/signatures. No problem. No issues.
Some reports are required to have a Quality Control (QC) statement. These are usually sent as digitally signed Adobe Sign documents that are made accessible after applying OCR.
Included in the reports are appendices of various documents (shipping docs, invoices, data sheets, other reports, etc.) that have to be scanned, OCR'd, and made accessible. Sometimes these documents are sent electronically as pictures, but they still have to be OCR'd to be made accessible. All important information is tagged as text and not figures.
For some reason, Acrobat is frequently creating character encoding errors after the OCR or applying tags to the digitally signed QC statements. The majority of character encoding errors are with the signatures of the scanned pages.
In the past, this happened occasionally with math equations or scientific formulas created in MS Word (Word to PDF), but now it's happening a lot with signatures and other items.
I've tried all sorts of things: saving as a picture and then back to a PDF to tag it; saving as a PDF; saving to a PDF using the PDF printer driver; popping it into Adobe Illustrator and back to a PDF. I tried using the Preflight tool. I looked at the fonts installed on my system. I can't even remember all of the things I have tried.
(I do not have Adobe Photoshop.)
In many cases, the source document is the document being used and has to be OCRd.
The important question is: why am I getting all of these character encoding errors now when using Acrobat's OCR feature?
Thank you for any help or direction.
What do you mean by "character encoding errors"? Do you mean that the OCR produces the wrong text, rather than an error message? Acrobat can't OCR handwriting, so it won't be able to accurately OCR handwritten signatures in particular, which are often illegible even to humans.
Hi kga-rti-kga,
Thank you for reaching out and reporting this.
Please share the following information with us to investigate this issue:
- Did you start experiencing this after the recent update?
- Share the screenshot of the error message.
- Share the screen recording of the workflow for a better understanding.
It will help us to replicate the behavior at our end.
Thanks,
Meenakshi
Some background info...
The accessibility checkers are constantly being updated and, consequently, are finding more errors with our PDFs.
Which checker did you use? The accessibility checker built in Acrobat? Or a third-party checker like PAC or CommonLook? If so, which version of the checker?
Character encoding errors are usually caused by one of these situations:
I'm concerned that this is happening to OCR'd content. The original scan doesn't have any fonts at all because it's graphical text (printed or scanned in). The OCR of the scan uses Adobe's default built-in fonts to create the invisible OCR text (the live OCR'd text is hidden and we see only the original printed/scanned text, but assistive technologies access the live hidden text).
To the best of my knowledge, the only way you could have a character error with OCR'd text is when that special built-in Acrobat font is missing on your computer. Maybe it was deactivated or uninstalled?
Signatures usually are OCR'd as figures, not text, so there shouldn't be a text error at all unless the OCR utility found some text-like elements in it, like maybe a printed name underneath the wet signature.
It would help if you can post some screen captures of the text that is being flagged and the error message.
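For anyone comfortable with a little scripting, here is a rough sketch of the kind of rule I believe sits behind a character encoding failure: a font is a problem when its characters can't be mapped reliably to Unicode. This is a simplified illustration, not Acrobat's actual logic. The `FontInfo` record, the function names, and the sample fonts are all hypothetical, and the rule itself (a ToUnicode map or one of the predefined encodings counts as "reliable") is my assumption based on how PDF text extraction generally works.

```python
from dataclasses import dataclass
from typing import List, Optional

# Encodings assumed here to give a reliable character-to-Unicode mapping
# without a ToUnicode map. This is a simplification for illustration.
PREDEFINED_ENCODINGS = {"WinAnsiEncoding", "MacRomanEncoding", "MacExpertEncoding"}


@dataclass
class FontInfo:
    """Hypothetical, simplified description of one font used in a PDF."""
    name: str
    has_to_unicode: bool      # True if the font carries a ToUnicode map
    encoding: Optional[str]   # encoding name, or None if unknown/custom


def may_fail_encoding_check(font: FontInfo) -> bool:
    """Return True if the font has no obvious route to Unicode."""
    if font.has_to_unicode:
        return False
    return font.encoding not in PREDEFINED_ENCODINGS


def flag_fonts(fonts: List[FontInfo]) -> List[str]:
    """Names of fonts that would be worth inspecting first."""
    return [f.name for f in fonts if may_fail_encoding_check(f)]


# Made-up sample data, not taken from any real document.
example_fonts = [
    FontInfo("Arial", has_to_unicode=False, encoding="WinAnsiEncoding"),
    FontInfo("HiddenOCRText", has_to_unicode=True, encoding=None),
    FontInfo("ABCDEF+LogoSymbols", has_to_unicode=False, encoding="Identity-H"),
]

print(flag_fonts(example_fonts))
```

In this made-up example only the subsetted symbol font is flagged, which is the sort of font I would expect behind a logo or formula that was saved as live text rather than as a picture.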
Character Encoding - Failed
Using the OCR built in Acrobat. Using the accessibility checker built in Acrobat.
No recent updates. Started happening in 2020-2022.
Yes, signatures are OCR'd as figures, not text. And yes, there is a printed name underneath the wet signature, but that hasn't changed. I select the whole wet signature and printed name and make it into a figure with alt text, but the "Character encoding - Failed" error will often present itself.
I have no idea if any fonts have been deleted. I haven't deleted any. The error is intermittent, so that is odd too. I will get it on one PDF but not another. It doesn't matter how many times I redo (even with a fresh file) the PDF with the error, I still get the error.
I'm wondering if there is some kind of glitch happening because of some Windows conflict. IDK.
I'm unable to share the current document, because it is confidential.
Thank you for your kind assistance.
Bevi, thank you for posting the Acrobat Accessibility Series | Adobe Document Cloud.
Below is an instance where the graphs originated in Excel and were pasted as pictures into Word. In the Accessibility Checker in Acrobat Pro 2020, you can see that the encoding error is picking up text from the bottom picture. I confirmed it is a picture, just like the graph at the top. Why is this happening? This page was converted from Word to PDF; no scanning or OCR was involved.
Thank you.
Have there been any updates on this issue, or on encoding errors in general?
I'm trying to make a scanned data sheet with wet signatures 508 compliant. The company logo symbol, the chemical structure, and other items are generating encoding errors. I've tried editing the Acrobat file to make sure it is using Arial or Times New Roman (TNR). So frustrating. Acrobat versions below.