Language property of tags is lost when merging LibreOffice-generated PDF documents
- May 11, 2021
Hi,
My organization uses LibreOffice to create technically oriented multilingual documents. We have used Acrobat to merge these LibreOffice-generated PDF documents into other PDF documents. We have now found that merging documents with Acrobat removes the documents' language tagging.
LibreOffice writes the language information as part of the document structure tagging. This is properly recognized by screen readers, and it passes Acrobat's accessibility check. However, after merging documents with Acrobat Pro DC, all of the language information is stripped from the document. The document structure tagging is otherwise preserved, but the language properties are removed in the process.
This problem does not happen when merging PDF documents generated with Word 2019. As I understand it, Word 2019 uses a different technical approach and marks the language as part of the content stream. However, to my understanding the LibreOffice approach is completely valid with regard to the PDF standard, and you can even manually language-tag text in Acrobat this way.
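For anyone who wants to reproduce the check without Acrobat, here is a rough Python sketch that counts the `/Lang` entries in a PDF file so the source files and the merged file can be compared. It is only a diagnostic under several assumptions: the function name `lang_values` is my own, it looks only for literal-string values such as `/Lang (en-US)`, it assumes streams are either uncompressed or Flate-compressed, and it does not handle encrypted files or hex-encoded strings. It also does not tell structure-tree entries apart from content-stream entries, since structure elements may themselves be stored inside compressed object streams.

```python
import re
import sys
import zlib
from collections import Counter

# Matches literal-string language entries such as "/Lang (en-US)".
LANG_RE = re.compile(rb"/Lang\s*\(([^)]*)\)")
# Matches the raw bytes between "stream" and "endstream".
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def lang_values(pdf_bytes):
    """Count /Lang values found anywhere in the file (rough diagnostic)."""
    # Start with everything outside streams (plain dictionaries).
    chunks = [STREAM_RE.sub(b"", pdf_bytes)]
    for raw in STREAM_RE.findall(pdf_bytes):
        try:
            # Assumption: the stream is Flate-compressed.
            chunks.append(zlib.decompressobj().decompress(raw))
        except zlib.error:
            # Not Flate data, so search the raw bytes instead.
            chunks.append(raw)
    found = Counter()
    for chunk in chunks:
        for value in LANG_RE.findall(chunk):
            found[value.decode("latin-1")] += 1
    return found


if __name__ == "__main__":
    # Usage: python lang_check.py source1.pdf source2.pdf merged.pdf
    for path in sys.argv[1:]:
        with open(path, "rb") as handle:
            print(path, dict(lang_values(handle.read())))
```

If the counts for the merged file are lower than the sum of the counts for the source files, language entries were dropped somewhere during the merge.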
Is there any way to prevent the language properties in the document tag structure from being stripped when merging documents with Acrobat?
I have attached two example PDF files generated with LibreOffice and the resulting merged document generated by Acrobat.
