
Text Encoding changing from Japanese to gibberish

Explorer, Mar 03, 2025

I'm running a script that renames localized swatches from Japanese to English, but the Japanese text is turning into gibberish. It works OK when run by itself, but it breaks when run from within other functions. Is there an overall encoding for files that I should be using? Or a way to convert the string back so that it reads properly?
For example, "White":
ホワイト
converts to:
ホワイト
when used as the key in the alert, so I'm guessing it's not finding the swatch either.

        var swConvert = {
            "CMYK レッド": "CMYK Red",
            "CMYK イエロー": "CMYK Yellow",
            "CMYK グリーン": "CMYK Green",
            "CMYK シアン": "CMYK Cyan",
            "CMYK ブルー": "CMYK Blue",
            "CMYK マゼンタ": "CMYK Magenta",
            "ホワイト、ブラック": "White, Black",
            "オレンジ、イエロー": "Orange, Yellow",
            "色あせた空": "Fading Sky",
            "スーパーソフトブラックビネット": "Super Soft Black Vignette",
            "木の葉": "Foliage",
            "ポンパドール": "Pompadour",
            "ホワイト": "White",
            "ブラック": "Black",
            "[レジストレーション]": "[Registration]"
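        };
        for (var key in swConvert) {
            // Show each Japanese key next to its English replacement.
            alert(key + " - " + swConvert[key]);
            // Rename the swatch; a missing swatch throws, so skip it silently.
            try { aD.swatches[key].name = swConvert[key]; } catch (e) {}
        }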
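
On the question of converting the string back: the garbled sample above matches what the UTF-8 bytes of ホワイト look like when they are read as Windows-1252, so in principle the damage can be reversed. The following is only a minimal sketch under that assumption. `repairMojibake` and `CP1252_TO_BYTE` are hypothetical names, not part of any Adobe API, and the sketch relies on the standard `decodeURIComponent` function being available in the host.

```javascript
// Windows-1252 characters in the 0x80-0x9F range, keyed by Unicode code point.
// Assumption: the text was UTF-8 that got decoded as Windows-1252.
var CP1252_TO_BYTE = {
    0x20AC: 0x80, 0x201A: 0x82, 0x0192: 0x83, 0x201E: 0x84, 0x2026: 0x85,
    0x2020: 0x86, 0x2021: 0x87, 0x02C6: 0x88, 0x2030: 0x89, 0x0160: 0x8A,
    0x2039: 0x8B, 0x0152: 0x8C, 0x017D: 0x8E, 0x2018: 0x91, 0x2019: 0x92,
    0x201C: 0x93, 0x201D: 0x94, 0x2022: 0x95, 0x2013: 0x96, 0x2014: 0x97,
    0x02DC: 0x98, 0x2122: 0x99, 0x0161: 0x9A, 0x203A: 0x9B, 0x0153: 0x9C,
    0x017E: 0x9E, 0x0178: 0x9F
};

// Hypothetical helper: returns the repaired string, or the input unchanged
// when it does not look like UTF-8 that was misread as Windows-1252.
function repairMojibake(text) {
    var encoded = "";
    for (var i = 0; i < text.length; i++) {
        var code = text.charCodeAt(i);
        var byteValue;
        if (code <= 0xFF) {
            // ASCII and Latin-1 characters map straight back to one byte.
            byteValue = code;
        } else if (CP1252_TO_BYTE.hasOwnProperty(code)) {
            byteValue = CP1252_TO_BYTE[code];
        } else {
            // A character that no Windows-1252 byte produces: leave the text alone.
            return text;
        }
        var hex = byteValue.toString(16).toUpperCase();
        encoded += "%" + (hex.length < 2 ? "0" + hex : hex);
    }
    try {
        // Percent-encoded bytes are decoded as UTF-8.
        return decodeURIComponent(encoded);
    } catch (e) {
        // The bytes were not valid UTF-8, so the input was probably fine already.
        return text;
    }
}
```

Under these assumptions, `repairMojibake("ホワイト")` returns `"ホワイト"`, while a string that is already correct Japanese is returned unchanged. Fixing the encoding at the source is still preferable to repairing strings after the fact.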

 

TOPICS
Bug, Scripting


2 Correct answers

Community Expert, Mar 04, 2025

Hi @wckdTall-2, aside from ensuring the script file is encoded as UTF-8 (given that I can save and open it and the Japanese characters are fine, I assume this is adequate), you can also add this to the start of your script and see if it helps:

$.appEncoding = 'UTF-8';

- Mark

Explorer, Mar 05, 2025

Thanks @m1b!
At a global script level, $.appEncoding = 'UTF-8'; did not work. Placed directly inside the function, it fixes it!

I'm using Visual Studio Code, and tried to resave my JSX, but there is no option for encoding there. Is there a header I should include in my file instead?
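
The fix reported above, setting `$.appEncoding` inside the function rather than at the top of the script, could be sketched roughly as follows. `renameSwatches` is a hypothetical name, and `doc` stands in for the `aD` document object from the question. Restoring the previous encoding afterwards is an assumption added here to keep the change scoped to this one function; it is not something the thread confirms is necessary. The `typeof $` guard only exists so the sketch can also run outside ExtendScript.

```javascript
// Hypothetical wrapper around the rename loop from the question.
// Returns the list of keys whose swatches were found and renamed.
function renameSwatches(doc, nameMap) {
    // $ is the ExtendScript helper object; it does not exist in other hosts.
    var hasDollar = typeof $ !== "undefined";
    var previousEncoding = hasDollar ? $.appEncoding : null;
    if (hasDollar) {
        // Set the encoding inside the function, as reported to work above.
        $.appEncoding = "UTF-8";
    }

    var renamed = [];
    try {
        for (var key in nameMap) {
            // Skip anything inherited from a modified Object.prototype.
            if (!nameMap.hasOwnProperty(key)) continue;
            try {
                doc.swatches[key].name = nameMap[key];
                renamed.push(key);
            } catch (e) {
                // Swatch not present in this document; nothing to rename.
            }
        }
    } finally {
        // Assumption: put the encoding back so other functions are unaffected.
        if (hasDollar) {
            $.appEncoding = previousEncoding;
        }
    }
    return renamed;
}
```

Returning the renamed keys makes it easier to see which lookups actually matched than relying on an `alert` for every entry.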

Community Expert, Mar 05, 2025

@wckdTall-2 That's interesting. Actually, when I look at my code, I do always set it inside a function, but I'm surprised it can't be set anywhere.

In VS Code, you just need to set the encoding on the file. The easiest way is to click the bottom right corner, where it shows the current encoding, and choose "Save with Encoding" in the command bar at the top.

I'm pretty sure there was a case where I had to save a .csv file as "UTF-8 with BOM" for something, maybe to link to an InDesign file? The BOM (byte order mark) is a few bytes at the start of the file. I doubt you need to use it here, but I thought I'd mention it.

- Mark
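
For reference, the UTF-8 BOM mentioned above is the three bytes EF BB BF, which decode to the single character U+FEFF. A minimal sketch of how a script might detect or drop it; `hasUtf8Bom` and `stripBom` are hypothetical helper names, and how the bytes or text are read from disk is left out.

```javascript
// True when a byte array (plain numbers 0-255) starts with the UTF-8 BOM.
function hasUtf8Bom(bytes) {
    return bytes.length >= 3 &&
        bytes[0] === 0xEF && bytes[1] === 0xBB && bytes[2] === 0xBF;
}

// Once a file has been decoded to text, the BOM shows up as U+FEFF
// at index 0; remove it so it does not end up in the first value.
function stripBom(text) {
    return text.charCodeAt(0) === 0xFEFF ? text.substring(1) : text;
}
```

A leftover U+FEFF is invisible in most editors, so a check like this is mainly useful when the first key or first CSV field unexpectedly fails to match.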

 

 

[Attached screenshots: Screenshot 2025-03-06 at 09.32.23.png, Screenshot 2025-03-06 at 09.32.32.png, Screenshot 2025-03-06 at 09.32.47.png]

Enthusiast, Mar 05, 2025

Thanks! I did find that my file is already UTF-8. It's odd that the encoding isn't offered in the Save As dialog, but I suppose that makes it harder to accidentally nuke a file. I tried UTF-16 instead, but it doesn't handle return characters the same way, so I'm going to stick with "$.appEncoding".

Scope-wise, I treat the file that holds all these functions as a kind of engine for various functionality across programs (4500 lines), so it's quite possible that some glitch is switching the encoding partway through. As it stands, in ExtendScript, adding things like indexOf via the prototype tends to break things.

Community Expert, Mar 05, 2025

@wckdtall I rarely see the benefit of overloading built-in methods. It is so easy to include an `indexOf` function that takes an array and another object, and the code is just as clear.

- Mark
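
A standalone helper of the kind described above might look like the following. This is a minimal sketch, not the code either poster actually used. It compares with strict equality and leaves `Array.prototype` untouched, so `for...in` loops elsewhere in a large script do not pick up an extra enumerable property.

```javascript
// Returns the index of the first element strictly equal to item, or -1.
function indexOf(list, item) {
    for (var i = 0; i < list.length; i++) {
        if (list[i] === item) {
            return i;
        }
    }
    return -1;
}
```

The call site changes from `names.indexOf("White")` to `indexOf(names, "White")`, which costs a few characters but avoids modifying a built-in.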

Enthusiast, Mar 05, 2025

That's the route I ended up taking with indexOf: I just rewrote it as a function. I do like the "shorthand" of it, but the errors it created simply weren't worth it.

Community Expert, Mar 04, 2025

I'm a bit confused by the Japanese in the script list. The Katakana characters "ホワイト" are a sort of phonetic version of the English word "white". Hiragana would be used to phonetically sound out Japanese words. The actual Japanese word for white is shiro. Google's translate app shows a Kanji glyph: 白 (or the adjective shiroi, 白い).

Explorer, Mar 05, 2025

That’s very insightful. The Japanese names were pulled from Illustrator files I've received internationally. Perhaps these are legacy localized terms? Or, to your point about it being more phonetic, maybe just a “simplified” language. It doesn’t affect the script issue, but I’d be interested to find out.

Community Expert, Mar 05, 2025

I lived in Japan for three years when I was a kid (I grew up a Marine Corps brat). I quickly learned that the Katakana alphabet was used to sound out syllables of foreign-language words or things like written sound effects in comic books. Schools in Japan put a big priority on students learning English. American music and movies have long been popular in Japan. Our Latin alphabet is used over there in all sorts of things, but people there tend to learn Kana and then Kanji before learning the Latin alphabet. A person who never picked up reading and writing English may only understand Japanese glyphs. So the Katakana character set is used to solve that problem. An English word like "white" is sounded out using "ホワイト", but those glyphs really say "ho-wa-i-to". Katakana has a limited overlap of syllables possible with the English language.

Community Expert, Mar 05, 2025

@Bobby Henderson I think this reflects the Japanese tendency to use katakana for borrowed or technical terms in modern contexts like software. I'm no expert, but the English phonetic spelling here might convey a sense of context in which "shiro" would actually *feel* out of place. That last part is just speculation. Actually, we can ask an expert, if not too busy: @sttk3, am I on the right track?

- Mark

Community Expert, Mar 05, 2025

It's simpler than that. In Japan, if they want to write out something like an English word or an American name in Japanese, they use Katakana to do so. For technical terms, if there is a Japanese word for it, the word will be written in Kanji or Hiragana.

Community Expert, Mar 05, 2025

Yes, it is the right thinking. Since we will frequently refer to English resources in the context of software, it is easier to cross-convert the meaning of words if we use katakana for the sounds.

 

If we were to use a Japanese word, it would be difficult to understand the meaning unless kanji is used for the parts of the word where kanji can be used, such as “白”.

 

Expressions like “shiro” are called “roma-ji” notation. Some people, especially old people, use this notation for variable names, etc., but it is generally regarded as a bad practice.

 

In addition, concepts used mainly in Japanese are defined in “roma-ji”. For example, “mojikumi” and “kinsoku” in Illustrator fall into this category.

Community Expert, Mar 05, 2025

Thanks @sttk3 I found that very interesting. In English we incorporate words from other languages all the time (not least from Japanese!) but what you describe is something for which we have no analogue that I can imagine.

 

> Some people, especially old people, use this notation for variable names, etc., but it is generally regarded as a bad practice.

 

May I ask what is good practice nowadays? I assume just normal written Japanese for variable names?

- Mark

Community Expert, Mar 05, 2025

In the source code, English words are used as they are in English. This is because most programming languages are developed based on English.

 

If I were writing a programming language like なでしこ, which is based on Japanese, I would probably use Japanese words as they are.

Community Expert, Mar 05, 2025

Thanks again. There are benefits of standardization. Even I—a native English writer—have made changes to my language in the service of standardization. In Australia, we write "colour", but in code I always write "color" because it becomes too confusing using APIs that use "Color" and "ColorSpace", etc.
