Word 365 to Acrobat: PDFMaker generates wrong tags

Explorer, Aug 31, 2020

I used the Save As Adobe PDF button in Word 365 to convert a file to Acrobat. When I open the PDF in Acrobat DC Pro, the tags are incorrect.

 

The first item in the TOC is tagged <TOCI> in the Tags Pane, but the rest have heading tags (e.g., H1, H2, H3). In the Page View, the tags say <Reference>. (The items with heading tags are not inside the <TOC> tag.) 

 

Random paragraphs are tagged as <TOCI> in the Tags Pane when they should be <P>; a few headings also show the <TOCI> tag. However, in the Page View, they show as <P> (or the correct heading tag).

 

Some figure captions are tagged <H2> in the Tags Pane when they normally come across as <P>. In the Page View, they show <P>. 

 

Figures come across tagged as <P> in both the Tags Pane and the Page View. When I try to change them to <Figure> (either by selecting and clicking Figure in the TURO, or by changing directly in the Tags Pane), Acrobat DC won't change them. 

 

What's going on, and how do I correct it? 

TOPICS
Create PDFs, Edit and convert PDFs, Standards and accessibility

Community Expert, Aug 31, 2020

When you save as PDF, are you using the function built into Word, or are you using Acrobat's PDFMaker (the Acrobat ribbon in Word)? If it's Word, then unfortunately this is not the right place to get answers; you will have to talk to Microsoft. If it's Acrobat, then we need to dig further to see what's going on.

Explorer, Aug 31, 2020

Fair question. I tried it both ways, with the same result. 

 

Guy

 

Community Expert, Aug 31, 2020

That's interesting, and it points to a problem in Word rather than in the PDF generation. The two PDF generators you've tried don't share any code, so if both produce the same tag tree, the problem lies in the information the Word document provides when the file is exported to PDF.

Unless somebody here can help you with a Word problem, I would suggest asking this question in a forum about MS Word. I am not familiar with all the details of tagging in Word; all I know is that the outline level is used to determine which tags to use. My more in-depth tagging experience is limited to Adobe InDesign.

As far as general troubleshooting goes, I would check whether this happens with all documents or just with one or a few. If not all documents are affected, I would look into recreating at least part of the document from scratch, including the paragraph styles, to see if something in the document got messed up.

Community Expert, Aug 31, 2020

Ah, I see you now have Bevi's attention; she is the expert when it comes to accessibility.

Community Expert, Aug 31, 2020

Well, I had to "like" that comment, Karl! (Who's no chump change, himself.)

 

|    Bevi Chagnon   |  Designer, Trainer, & Technologist for Accessible Documents |
|    PubCom |    Classes & Books for Accessible InDesign, PDFs & MS Office |

Community Expert, Aug 31, 2020

QUOTE: "I used the Save As Adobe PDF button in Word 365 to convert a file to Acrobat. When I open the PDF in Acrobat DC Pro, the tags are incorrect."

 

I'm assuming you're on a Windows computer, and that this command was under the File menu.

 

Can you try making a PDF using the Acrobat Ribbon?

First, check the Preferences in the ribbon and make sure your accessibility settings are correct.

[Screenshot: Export Preferences for Accessible tagged PDF from MS Word / Windows.]

 

Then, Create PDF.

[Screenshot: Acrobat PDF Maker ribbon in MS Word / Windows.]

 

|    Bevi Chagnon   |  Designer, Trainer, & Technologist for Accessible Documents |
|    PubCom |    Classes & Books for Accessible InDesign, PDFs & MS Office |

Community Expert, Aug 31, 2020

@inkedguy, after checking your export settings, drill down into your Word file and make sure the items below were done correctly. Given the hodgepodge of errors you're having, I suspect the cause is how your Word document was formatted.

 

One key rule: paragraph formatting styles must be applied to all of your text. The styles trigger the correct tags in the exported PDF.

 

Check these items in Word:

 

  1. Confirm that the TOC was created with Word's TOC utility and not made by hand: References ribbon/tab | Table of Contents icon. For now, choose one of the defaults: Automatic Table 1 or Automatic Table 2.
  2. Check which paragraph formatting styles were applied to your headings (and other text, if you have time).
    Open the Styles Pane and, from it, open the Styles Inspector. As you click inside each heading or paragraph of text, verify that the Heading 1 style was applied to each heading you want tagged <H1> in the PDF. The same goes for the remaining headings: Heading 2 style = <H2>, Heading 3 style = <H3>, and so on.
  3. You might need to clear out any residual formatting on those paragraphs in order to get the tags to come out correctly. If that's the case, select the paragraph of text, and click the Clear All formatting button from either the Styles Pane or the Styles Inspector. Then reapply the correct paragraph style to the paragraph.

 

Notes: In order to generate a TOC with the correct tags <TOC> | <TOCI> plus the accessible links tags, you must:

  • Use the correct heading paragraph styles to format your document's headings,
  • Use Word's TOC utility to generate the TOC, and
  • Don't manually edit the TOC after it is created. It's a generated part of the file and you don't want to mess with it.

 

In order to generate the correct heading tags in a PDF <H1>, <H2>, etc., you must use the corresponding Heading 1, Heading 2, etc. paragraph formatting styles to format the heading paragraphs. There are no exceptions.
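
The style-to-tag correspondence described above can be written as a small lookup. This is only a Python sketch of the rule as stated in this thread, not PDFMaker's actual logic; the function name `expected_tag` is made up for illustration, and the H1 to H6 range is an assumption based on the standard PDF heading tags.

```python
import re


def expected_tag(style_name):
    """Return the PDF tag this thread says a Word paragraph style should produce.

    Covers only the styles discussed here: "Heading N" and the generated
    "TOC N" entry styles. Returns None for anything else.
    """
    name = style_name.strip().lower()
    heading = re.fullmatch(r"heading ([1-6])", name)
    if heading:
        return "H" + heading.group(1)
    if re.fullmatch(r"toc [1-9]", name):
        return "TOCI"  # TOC entries belong inside the <TOC> tag as <TOCI>
    return None


for style in ("Heading 1", "Heading 3", "TOC 2", "Normal"):
    print(style, "->", expected_tag(style))
```

Comparing the output of a lookup like this against the Tags Pane is one way to spot paragraphs whose tag does not match the applied style.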

 

|    Bevi Chagnon   |  Designer, Trainer, & Technologist for Accessible Documents |
|    PubCom |    Classes & Books for Accessible InDesign, PDFs & MS Office |

Explorer, Aug 31, 2020

Hello again, Bevi. 

 

Yes, it's a Windows computer (government issue). I ended up with the same result the second time around, using the Acrobat tab in the ribbon. 

 

I've been seeing the heading tags in the TOC all along (like for the past year, both on my old laptop and this new one). The random TOCI tags are new. Does it matter that it says <TOCI> in the Tags Pane, but shows <P> in the Page View? Which one is the 'real' tag? 

 

This is not the first file where I couldn't get the <Figure> tag to stick; some of the others came from other people, and I don't know how they generated their files. I just don't know why it happens. 

 

For the record, I am using styles in Word 365, with all heading styles applied correctly where needed. The TOC is generated, not hand-typed, using TOC styles.

 

Karl may be right that it's a Word problem, but I'm certainly all ears to hear what you think, too. 

 

Guy

Community Expert, Aug 31, 2020

Most likely there's something wrong with your Word file. Word itself is fairly stable and accurate (albeit not perfect, and you could have either an outdated version or a corrupted version). We've been govt consultants for decades and have found the biggest culprits are:

  • User error (lack of knowledge of how to correctly construct a Word document for accessibility).
  • Code crud or outdated formatting left in the document from previous versions of Word software. This crud prevents the PDF-export utility from interpreting the document and tagging it correctly. More later about this...
  • Outdated software, either Word or PDF Maker.

 

How about a test to help diagnose where the problem lies? With the file, or with Word, or even with PDF Maker?

 

If possible, download these test Word files and matching PDFs from our students' resource website: https://www.pubcom.com/testfiles/

 

  1. Open the first base.docx file in your version of Word and export it to PDF.
    1. Did you get the correct heading tags?
  2. Then, add a TOC to it with Word's TOC utility, and export this new version to PDF.
    1. Did you get the correct TOC/TOCI tags?

 

QUOTE: "Does it matter that it says <TOCI> in the Tags Pane, but shows <P> in the Page View? Which one is the 'real' tag?"

Page View = Thumbnails/Pages pane, and I don't think that's what you mean.

Do you mean the Order panel (aka, architectural/construction order, or Z-order)?

 

Only the Tags Tree is required to meet PDF/UA-1 compliance, and its tags and reading order are primary.

 

The tags you see in the Arch/Const Order are usually not correct. It's an Acrobat bug, but since the Tags Tree supersedes everything for accessibility, it doesn't affect your compliance. I recommend that you change the options in the Order panel to show the numbered order rather than the tags.

Note, you do want to ensure that the Arch/Const Order has a decent reading order because many assistive technologies, as well as commonly used tech, use it rather than the Tags Tree. But this is not required for PDF/UA-1 compliance, just a really smart best practice to ensure your government documents don't leave anyone out of the loop. See our recent blog post about this, "The 4 Reading Orders in Accessible PDFs."

[Screenshot: Tags Tree from base Word.docx exported to PDF.]

 

[Screenshot: Order (Architectural/Construction Order) panel.]

 

Let us know what you find out with the test above.

 

Also open the PDF's Properties panel and see which version of PDF Maker your system used.

File | Properties | Description Tab | PDF Producer.

 

|    Bevi Chagnon   |  Designer, Trainer, & Technologist for Accessible Documents |
|    PubCom |    Classes & Books for Accessible InDesign, PDFs & MS Office |

Explorer, Sep 01, 2020

Good morning, Bevi.

 

As previously stated, the heading styles (Heading 1 through Heading 5) were used appropriately and the TOC was generated, not hand-typed. The TOC was Custom, not Automatic 1, so I deleted it and regenerated it using Automatic 1. Then I generated a new PDF. The new PDF still had the random TOCI tags, as well as <H2> and <H3> tags in the table of contents, rather than <TOCI> tags. 

 

This is where it gets interesting. I decided to generate the PDF again, but left out the TOC pages. This time, the PDF didn't have the random TOCI tags in the text! However, it did still have the problem where a figure has a <P> tag on it and won't accept being changed to <Figure>; it just stays <P>.

 

RE: tags in the tags pane vs. tags in the page view... I was sure I was calling it the wrong thing. I meant the actual view of the page, when you have the TURO tool open and can see the tags next to each paragraph. See screen shot.

 

[Screenshot: Tags in Tags Pane vs. tags displayed on paragraph.]

 

The tags in the Tags Pane say one thing, but the tags on the paragraph say something else. That's why I wondered which one was "real."

 

I will download those sample files from your link and see what happens when I PDF them. Will get back to you...

 

Guy Ivie

Technical Writer

US Army Corps of Engineers

Explorer, Sep 01, 2020

Latest update: I created PDFs from the test files. The TOCs look fine. However, the order of tags in paragraphs that contained links didn't match. See screen shot:

[Screenshot: Tags in your PDF vs. tags in mine.]

 

Then it occurred to me that I hadn't checked the Acrobat preferences in Word. I found only one difference there: mine had Enable Advanced Tagging checked. I unchecked it, re-generated the PDF, and voila! The tags looked the same as yours.

 

I also inserted a graphic into the test Word file, adding alt text and a caption. In the PDF, the figure had a <Figure> tag, although it was inside a <P> tag. (This was after changing the preferences.)

 

I regenerated the PDF from my work file, with the correct preferences set in the Acrobat tab. The random <TOCI> tags were gone. Figures appeared the same as the one I inserted into your test file (image file inside a <Figure> tag, which was inside a <P> tag). And I got no argument when changing the <Figure> tag to something else and back again.

 

But I'm still getting heading tags instead of <TOCI> tags. See screenshot:

 

[Screenshot: <H> tags instead of <TOCI> tags.]

 

This was a freshly generated TOC, using the Automatic 2 TOC style. When I look at the heading styles, the inspector shows the heading style in the Paragraph Formatting box, plus <none> in the box below it, Default Paragraph in the Text Level Formatting box, and plus <none> in the box below that. In the generated TOC, those boxes show TOC 1 (or 2 or 3), plus <none>, Hyperlink, and plus <none>.

 

I am... baffled.

 

Guy Ivie

Technical Writer

US Army Corps of Engineers

New Here, Dec 09, 2022

@Guy_I It's been over two years since you started this thread, but thank you. Perhaps my follow-up will also help someone out there.

 

I could not, for the life of me, figure out why heading and other basic tags were not exporting properly from Word to PDF when I had used heading styles to format the headings in Word. I had also used the Create PDF function using the Acrobat ribbon, as @Bevi Chagnon - PubCom.com indicated above.

 

After running some tests, I discovered the issue only occurred when there was a TOC in my document (one that was generated with Word's TOC utility). If I took out the TOC, the heading tags exported without issue. This was when I found your thread, and followed your cue to uncheck the "Enable advanced tagging" box in the Acrobat preferences. The heading tags exported properly after that with the TOC intact. There were still some oddities, but for the most part the proper heading and paragraph tags exported, which made a world of difference for a 150-page document.

 

Community Expert, Dec 09, 2022

QUOTE (by @Ya-Yin): "This was when I found your thread, and followed your cue to uncheck the 'Enable advanced tagging' box in the Acrobat preferences. The heading tags exported properly after that with the TOC intact."

 

Gah! That "advanced tagging" option is a disaster. Never check that option.

Glad you figured it out!

 

|    Bevi Chagnon   |  Designer, Trainer, & Technologist for Accessible Documents |
|    PubCom |    Classes & Books for Accessible InDesign, PDFs & MS Office |

Community Beginner, Aug 28, 2024

Four years later, and I finally figured out what was happening with the TOC items being tagged as headings! 

In the style definitions for the TOC levels in Word, there is a Paragraph setting where you can set the outline level of the TOC style. My TOC 1 was set to Body Text. My TOC 2 through TOC 5 were set to the corresponding heading level of 2 through 5... and those were the items in the TOC that came across to the PDF tagged as headings. Making their outline level Body Text made the problem go away, and all items in the TOC are tagged as <TOCI> in the PDF! Whew! 
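
For anyone who wants to check a document for this condition without clicking through each style, here is a rough Python sketch that reads the style definitions stored inside a .docx file. It rests on several assumptions: that the built-in TOC styles are named "toc 1", "toc 2", and so on in word/styles.xml, that the outline level is stored as a 0-based `w:outlineLvl` value where 9 means Body Text, and that the level is set directly on the TOC style rather than inherited. The function names are invented for this example.

```python
import re
import xml.etree.ElementTree as ET
import zipfile

W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
BODY_TEXT_LEVEL = 9  # assumed: w:outlineLvl value 9 means "Body Text"


def toc_styles_with_outline_level(styles_xml):
    """Return {style name: outline level 1-9} for TOC styles whose own
    definition sets an outline level other than Body Text."""
    root = ET.fromstring(styles_xml)
    flagged = {}
    for style in root.iter(W + "style"):
        name_el = style.find(W + "name")
        name = name_el.get(W + "val", "") if name_el is not None else ""
        if not re.fullmatch(r"toc \d", name.lower()):
            continue  # only the generated TOC entry styles
        lvl_el = style.find(W + "pPr/" + W + "outlineLvl")
        if lvl_el is None:
            continue  # no outline level set on this style itself
        raw = int(lvl_el.get(W + "val", str(BODY_TEXT_LEVEL)))
        if raw < BODY_TEXT_LEVEL:
            flagged[name] = raw + 1  # stored 0-based; shown as Level 1-9
    return flagged


def check_docx(path):
    """Run the check on a .docx file, which is a zip archive."""
    with zipfile.ZipFile(path) as archive:
        return toc_styles_with_outline_level(archive.read("word/styles.xml"))


# Hand-written sample mimicking the situation described above:
# TOC 1 is Body Text, TOC 2 carries outline level 2.
SAMPLE = """<w:styles xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:style w:type="paragraph" w:styleId="TOC1">
    <w:name w:val="toc 1"/>
    <w:pPr><w:outlineLvl w:val="9"/></w:pPr>
  </w:style>
  <w:style w:type="paragraph" w:styleId="TOC2">
    <w:name w:val="toc 2"/>
    <w:pPr><w:outlineLvl w:val="1"/></w:pPr>
  </w:style>
</w:styles>"""

print(toc_styles_with_outline_level(SAMPLE))  # {'toc 2': 2}
```

Any style this reports is a candidate for the fix described above: set its outline level to Body Text in the style's Paragraph settings, then regenerate the PDF.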

 

Guy Ivie

Technical Writer

US Army Corps of Engineers

Community Beginner, Aug 28, 2024

Almost forgot: I'm retiring from the Corps at the end of October, and switching careers to massage therapy. Beginning in November, future responses to this thread won't reach me. 

 

Guy Ivie

Technical Writer

US Army Corps of Engineers
