Creating InDesign Tagged Text via XSLT

Adobe Community Professional ,
Jun 02, 2020

Hi, I am on Windows and using Oxygen XML Editor to convert XML to tagged text. When I place the text in InDesign, I get the Text Import Options dialog box.

 

[Screenshot: Text Import Options dialog box]


I am specifying UTF-8 and using <UNICODE-WIN> as the first line of the file. If I copy and paste the content into a "reverse-engineered" tagged text file, I get the correct Import dialog box. I would like to be able to do this without having to copy/paste every time.

 

If I view the files in a hex editor, there is definitely a difference:

Oxygen generated:

[Screenshot: hex dump]


Exported from InDesign as Tagged Text:

[Screenshot: hex dump]

Here is a partial view of my XSLT stylesheet. Any suggestions would be appreciated.

[Screenshot: partial XSLT stylesheet]

Correct answer by Jongware | Most Valuable Participant

That Tagged Text export from InDesign is not UTF-8; it is UTF-16.

I'm actually surprised you are able to re-import that file into InDesign even if you tell it this "is" UTF-8. InDesign must be smarter than that.

Setting your XSLT to export UTF-16 should work. I use encoding="utf-16le" and it has worked fine for me on both Mac and Windows (never mind that "UNICODE-WIN" line at the top: I just checked, and I happened to have "UNICODE-MAC", and InDesign does not bat an eye even on Windows).
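
For anyone landing here later, the suggestion above might look roughly like the stylesheet below. This is only a sketch, not the original poster's stylesheet (which was posted as a screenshot): the `para` element, the `Body` paragraph style, the CRLF line endings, and `version="2.0"` are assumptions for illustration. Only the `encoding` value and the `<UNICODE-MAC>` header line come from this thread. Whether a byte order mark is written depends on the processor, so check the output in a hex editor.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Plain-text output, serialized as UTF-16 little-endian instead of UTF-8 -->
  <xsl:output method="text" encoding="UTF-16LE"/>

  <!-- Drop whitespace-only text nodes so indentation in the source XML
       does not leak into the tagged text -->
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <!-- Header line of the tagged text file; the thread reports that
         UNICODE-MAC imports cleanly even on Windows -->
    <xsl:text>&lt;UNICODE-MAC&gt;&#13;&#10;</xsl:text>
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Hypothetical source element and paragraph style name -->
  <xsl:template match="para">
    <xsl:text>&lt;ParaStyle:Body&gt;</xsl:text>
    <xsl:value-of select="."/>
    <xsl:text>&#13;&#10;</xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

The tag delimiters are written as `&lt;` and `&gt;` inside `xsl:text` so the stylesheet itself stays well-formed XML; with `method="text"` they are emitted as literal `<` and `>`.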

Adobe Community Professional ,
Jun 02, 2020


Thank you! UTF-16LE and UNICODE-MAC are working for me. UNICODE-WIN doesn't work, but as long as I get the correct results, I am happy. I appreciate all of you InDesign wire-heads.

Advisor ,
Jun 02, 2020


You are using UTF-8 in both the UI and the XSLT, but that second hex dump appears to be UTF-16LE.

When matching it, a look at the BOM may also help, e.g. to explain why the file is accepted even with your import options, but unfortunately you cropped the first bytes out of your screenshot.

Edit: After a second look, there is no BOM.
