Issues with Data Merge and non-latin alphabets

New Here ,
Oct 06, 2020

I have a document consisting of one paragraph of text in 26 languages, including Russian and Greek. I placed the unformatted text in a clean Excel file and saved it from there as a tab-delimited txt file (I need it to be tab delimited because of specific layout settings in InDesign).

The texts in Russian and Greek show without issues in Excel, and the font set I'm using in InDesign has the full range of characters for 26 languages. When I straight copy-paste the text in InDesign there are no issues.

When I import the tab-delimited txt file into InDesign, the error message tells me that the document contains characters that cannot be encoded. In the preview, all the text appears correctly except the text in non-Latin alphabets, which appears as all underscores. The same happens with a number of (but not all) characters with accents on or under them.

 

I'd appreciate any tips or pointers to solve the issue, if anyone knows.
Thanks!

Correct answer by Test Screen Name | Most Valuable Participant

There is a lot of discussion here, but I think the point is a very simple one. Since you need tab-delimited UTF-16/UCS-2 from Excel, I suggest you save as "Unicode text (*.TXT)". This works for me.
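
To illustrate why the encoding matters, here is a minimal Python sketch. It assumes (the thread does not confirm this) that Excel's plain "tab delimited text" export uses a legacy 8-bit code page; cp1252 and mac_roman are only stand-ins for whatever Excel actually uses, and the sample row is made up.

```python
# Sample row: Latin, Russian, and Greek cells separated by tabs (made-up data).
row = "Hello\tПривет\tΓειά σου\n"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text can be written in the encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(can_encode(row, "utf-16"))     # True: UTF-16 covers Cyrillic and Greek
print(can_encode(row, "cp1252"))     # False: legacy Western European code page
print(can_encode(row, "mac_roman"))  # False: legacy Mac code page

# A round trip through UTF-16 keeps both the tabs and the text intact.
assert row.encode("utf-16").decode("utf-16") == row
```

If the assumption holds, the characters are already lost when Excel writes the file, which would explain why the font and InDesign itself are not at fault.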

TOPICS
How to, Import and export, Scripting, Type


Explorer ,
Oct 06, 2020

Hi,
if the txt file is imported with the .place method, its first line should be <UNICODE-WIN>, at least for Russian.

New Here ,
Oct 06, 2020

The text file is placed by importing it as the data source (data merge). Each paragraph is called with the <<reference>> code in the proper places in the document.

Explorer ,
Oct 06, 2020

Sorry, I missed the Excel part ))
Try saving the Excel file as Unicode text; you will not lose the tab delimiter.
You can also share your Excel file, maybe someone will test it.

Participant ,
Oct 06, 2020

I don't know how Excel saves the text files, but my guess is you need UTF-16 encoded files to have the full range of characters.
In LibreOffice there's a setting for this in the Save As dialog.
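
Rather than guessing, the encoding of an exported file can often be checked from its first bytes. A rough Python sketch, assuming the file starts with a byte order mark (BOM); files without one cannot be identified this way, and the file name in the usage note is hypothetical.

```python
import codecs


def sniff_bom(data: bytes) -> str:
    """Guess the encoding from a leading BOM; returns 'unknown' if none is found."""
    # UTF-32 is checked first because its little-endian BOM starts with the UTF-16 LE BOM.
    if data.startswith((codecs.BOM_UTF32_LE, codecs.BOM_UTF32_BE)):
        return "utf-32"
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first four bytes of the file and inspect them."""
    with open(path, "rb") as handle:
        return sniff_bom(handle.read(4))
```

For example, sniff_file("datasource.txt") returning "unknown" would suggest the file is not BOM-marked UTF-16 and may be UTF-8 without a BOM or a legacy code page.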

Adobe Community Professional ,
Oct 06, 2020

Hi Arno,

UTF-16 or UCS-2 encoding of the data source is necessary, yes! Jens is definitely right on this.

But even with UTF-16 something could go wrong. Not because you are doing something wrong, but because InDesign's data merge feature appears to have a bug. We recently had a strange case, possibly a bug, when merging QR codes that contain Czech characters like:

 

Ř

LATIN CAPITAL LETTER R WITH CARON

 

See this thread:

DataMerge with special characters and QR code
badbernburg, Sep 25, 2020
https://community.adobe.com/t5/indesign/datamerge-with-special-characters-and-qr-code/td-p/11460803

 

FWIW: I have not heard anything new from the OP, badbernburg.

Don't think this issue is resolved.

 

Regards,
Uwe Laubender

( ACP )

 

New Here ,
Oct 06, 2020

Thank you everyone for your response.
Regarding the UTF-16 or UCS-2 encoding... I don't have that option when I save the file as tab-delimited text, so I'm assuming it defaults to UTF-8.
Would any of you know an alternative way to save the file with which I have the full range of characters, but which also honours the column division in Excel, so that I can data merge each column in a place of my choosing?
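
One possible workaround outside Excel, sketched in Python: export the sheet as UTF-8 CSV and re-save it as tab-delimited UTF-16. This assumes your Excel version offers a UTF-8 CSV export, which is not confirmed in this thread; the function name and file names are hypothetical.

```python
import csv


def csv_to_utf16_tsv(src_path: str, dst_path: str) -> None:
    """Re-save a UTF-8 CSV file as tab-delimited UTF-16 with a BOM."""
    # "utf-8-sig" reads UTF-8 with or without a leading BOM.
    # Python's "utf-16" codec writes a BOM followed by native byte order.
    with open(src_path, newline="", encoding="utf-8-sig") as src, \
         open(dst_path, "w", newline="", encoding="utf-16") as dst:
        reader = csv.reader(src)
        # One tab per column boundary, so the Excel columns stay separate fields.
        writer = csv.writer(dst, delimiter="\t")
        writer.writerows(reader)


if __name__ == "__main__":
    # Hypothetical file names; replace with your own export.
    csv_to_utf16_tsv("languages.csv", "languages-utf16.txt")
```

Note that the csv module puts quotes around any cell that itself contains a tab, a quote, or a line break; whether Data Merge handles such cells well is something to test on your own data.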

Adobe Community Professional ,
Oct 06, 2020

Hi Arno,

what's your operating system?

 

On Mac OS X, Apple's TextEdit app can save a text file as UTF-16.

On Windows 10, Notepad++ can convert and save a text file as UCS-2 LE BOM.

 

Hm, even InDesign itself can export selected text to Unicode UTF-16.

From my German InDesign on Windows 10:

[Screenshot: TextExportOptions-of-SelectedText-PC-Unicode-UTF16.PNG]

 

Regards,
Uwe Laubender

( ACP )

 

 

New Here ,
Oct 06, 2020

The problem is not in InDesign, I fear, nor in the text editor. I can set the text editor to UTF-16, but when I import the text into Excel, it converts back to UTF-8. It seems there is no option in the Save As panel, nor anywhere in the preferences that I can find, where I can set a default for UTF-16.
I need to save the text out of Excel as a tab-delimited txt file (where the tabs represent each column I made in Excel for the different paragraphs of information).

 

I'm on a Mac running Sierra, the latest CC and Excel 16.
