Hello,
I am trying to edit the tags of a very long document. To create a heading 2 (<H2>), I highlighted the heading text and clicked "Create tag from selection". The tag was created, but the words inside it were oddly split up:
<H2>
Break
o
ut
Session 2 11:30 AM
-
12:20 PM
What type of error am I facing, and how do I recombine the text or keep this from happening?
Hi,
Can you share the document, or another one that shows the problem?
Wow. Your sample shows some very unusual tagging (and order in the Order panel) for what should be a simple, straightforward table.
That points to one or more issues that could be causing the problems:
1. Poor formatting and construction techniques of the original Word document.
Often, the split yellow content containers (shown in your original screen capture above) are created when there is some underlying coding or formatting put into the Word file by the author. Manual formatting (rather than using Paragraph and Character Styles) is a main culprit.
2. Outdated software
I can see from File / Properties / Description that the PDF was created from MS Word, and that the PDF Producer (the version of Adobe PDF Maker that converted it to PDF) is 19.10.123.
Microsoft released a substantial update for Office in Sept 2021, and I would suggest installing it if you haven't done so already.
Adobe also released a major update to PDF Maker a couple of weeks ago that resolves a lot of table issues. PDF Maker is now at version 22.1.17 (March 2022), and I suggest updating Acrobat, which will also update the PDF Maker plug-in for MS Office at the same time.
3. How the table was constructed in MS Word before it was exported.
Something is definitely wrong with this table. Many <TD>/<TH> cells throughout aren't tagged inside their <Table> tag.
But at this time, splitting words into different content containers isn't an official accessibility issue (it will be in the future, however). Screen readers and text-to-voice technologies will still voice the word, but it might be chopped up or stilted with very slight pauses or mis-pronunciations of the word fragments. Most times the voicing is OK, and sometimes it's not so great.
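For anyone who wants to see the table rule above in concrete terms, here is a minimal sketch. It assumes the tag tree has been written out by hand as nested (tag, children) tuples, which is a hypothetical stand-in and not anything Acrobat exports or exposes. It flags every <TD> or <TH> whose parent is not a <TR>, or that has no <Table> ancestor.

```python
# Hypothetical stand-in for a PDF tag tree: (tag_name, [children]).
# This is NOT Acrobat's API, just nested tuples for illustration.

CELL_TAGS = {"TD", "TH"}


def find_misplaced_cells(node, ancestors=()):
    """Return the tag path of every TD/TH that is not inside Table > ... > TR."""
    tag, children = node
    problems = []
    if tag in CELL_TAGS:
        parent = ancestors[-1] if ancestors else None
        # A cell must sit directly in a TR, and that TR must be within a Table.
        if parent != "TR" or "Table" not in ancestors:
            problems.append("/".join(ancestors + (tag,)))
    for child in children:
        problems.extend(find_misplaced_cells(child, ancestors + (tag,)))
    return problems


sample = ("Document", [
    ("Table", [
        ("TR", [("TH", []), ("TD", [])]),   # correctly nested cells
    ]),
    ("TD", []),                             # cell tagged outside any table
    ("Sect", [("TR", [("TD", [])])]),       # row with no enclosing Table
])

print(find_misplaced_cells(sample))
# ['Document/TD', 'Document/Sect/TR/TD']
```

In the Tags panel the same check is done by eye: expand each <Table>, confirm every <TR> sits under it, and confirm every <TH>/<TD> sits under a <TR>.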
Solution:
Of course, you want the maximum accessibility possible for your audience, so a suggestion is to reformat the original Word source file, and export a new PDF using the latest version of PDF Maker. That would clean up the fragmented yellow content containers as well as produce correctly structured and tagged tables.
As stated above, this document, even if it's long, is really made up of very simple tables. With just a small amount of effort, the original Word file can be corrected to produce a nearly flawless PDF that doesn't need much tweaking afterwards.
Fixing poorly made PDFs is a really bad, time-consuming, and costly workflow. Doing it right from the start is so much easier for everyone. Training and templates can help.
Thank you very much!
I was able to fix the table tags. Yes, this document had some weird formatting. I think part of the problem was that it was originally built in Google Docs, converted into a Word document, and then into a PDF, so there were likely some issues prior to PDF conversion. I will look into updating my Adobe software as well.
I had to go back and redo many of the <TD> and <TH> cells. Acrobat kept either missing content or identifying too many rows. I'll pay better attention to the Word formatting in the future!
Gah!
Google Docs!
Google has 0% accessibility, so there's always some strange hidden coding in the file after it's converted to a Word .docx file. I think part of that particular problem is that a Google Doc usually has two or more authors and they all format it slightly differently.
Also, Google Docs has only very rudimentary formatting styles, and styles are central to an accessible document.
In my Accessible Word classes, I teach what we do here at my shop:
1. In Google Docs, apply the few paragraph formatting styles to the content.
2. Bring the Google Doc into MS Word.
3. Keep the paragraph formatting if possible.
4. Strip the text of all manual formatting.
5. Reformat correctly with Word Paragraph and Character styles, correct the hyperlinks, make a TOC, and perform all the other tasks needed for an accessible Word document.
6. Export to a new, more accessible PDF.
And they say that Google Docs is such a time-saver. Ha ha ha ha ha!
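Before stripping manual formatting, it can help to see how much of it a converted file carries. The following is a rough sketch, not a definitive tool: it reads word/document.xml from a .docx (which is a ZIP archive) and lists the text runs whose run properties contain anything other than a character-style reference. The function names and the sample XML are made up for illustration, and the heuristic is an assumption: some run properties, such as language settings, are harmless, so treat the output as a list of places to look at, not a list of errors.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used inside word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def runs_with_manual_formatting(document_xml):
    """Return the text of each run that carries direct (manual) formatting.

    A run's <w:rPr> may point at a character style via <w:rStyle>; any other
    child (bold, font size, colour, ...) is treated here as manual formatting.
    """
    root = ET.fromstring(document_xml)
    flagged = []
    for run in root.iter(f"{{{W}}}r"):
        props = run.find("w:rPr", NS)
        if props is None:
            continue
        direct = [child for child in props if child.tag != f"{{{W}}}rStyle"]
        if direct:
            text = "".join(t.text or "" for t in run.findall("w:t", NS))
            flagged.append(text)
    return flagged


def check_docx(path):
    """Read word/document.xml from a .docx file and scan its runs."""
    with zipfile.ZipFile(path) as archive:
        return runs_with_manual_formatting(archive.read("word/document.xml"))


# Illustrative paragraph: one word split into three runs, the middle one bolded by hand.
sample_xml = (
    f'<w:document xmlns:w="{W}"><w:body><w:p>'
    '<w:r><w:t>Break</w:t></w:r>'
    '<w:r><w:rPr><w:b/></w:rPr><w:t>o</w:t></w:r>'
    '<w:r><w:rPr><w:rStyle w:val="Strong"/></w:rPr><w:t>ut</w:t></w:r>'
    '</w:p></w:body></w:document>'
)

print(runs_with_manual_formatting(sample_xml))
# ['o']
```

Calling check_docx on a real file (the path is whatever your converted document is called) returns the same kind of list. A long list of one- or two-letter fragments suggests the file still has a lot of leftover manual formatting to strip.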