_wckdTall_
Inspiring
March 3, 2025
Answered

Text Encoding changing from Japanese to gibberish

I have a script that renames localized swatches, but the Japanese text in it is turning into gibberish. It works OK when run by itself, but it breaks when run from within other functions. Is there an overall file encoding I should be using? Or a way to convert the string back so that it reads properly?
Example White:
ホワイト
Converts to:
ホワイト
That is what I see when the key is shown in the alert, so I'm guessing it's not finding the swatch either.

        var swConvert = {
            "CMYK レッド": "CMYK Red",
            "CMYK イエロー": "CMYK Yellow",
            "CMYK グリーン": "CMYK Green",
            "CMYK シアン": "CMYK Cyan",
            "CMYK ブルー": "CMYK Blue",
            "CMYK マゼンタ": "CMYK Magenta",
            "ホワイト、ブラック": "White, Black",
            "オレンジ、イエロー": "Orange, Yellow",
            "色あせた空": "Fading Sky",
            "スーパーソフトブラックビネット": "Super Soft Black Vignette",
            "木の葉": "Foliage",
            "ポンパドール": "Pompadour",
            "ホワイト": "White",
            "ブラック": "Black",
            "[レジストレーション]": "[Registration]"
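        };
        // aD is assumed to be the active document (e.g. app.activeDocument).
        for (var key in swConvert) {
            alert(key + " - " + swConvert[key]);
            // Rename the swatch if it exists; skip it silently if it does not.
            try { aD.swatches[key].name = swConvert[key]; } catch (e) {}
        }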
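
On the "convert the string back" part of the question, here is a minimal sketch of how this kind of garbling can arise and be undone. It assumes the gibberish is the key's UTF-8 bytes being read one byte per character (Latin-1 style); if the bytes were misread as some other encoding, this will not recover them. The helper names `toByteString` and `fromByteString` are made up for illustration.

```javascript
// Sketch only: assumes UTF-8 bytes were read as one character per byte.

// Hypothetical helper: view a string's UTF-8 bytes as one character per byte.
function toByteString(text) {
    return unescape(encodeURIComponent(text));
}

// Hypothetical helper: reinterpret a one-character-per-byte string as UTF-8.
function fromByteString(byteString) {
    try {
        return decodeURIComponent(escape(byteString));
    } catch (e) {
        return byteString; // not valid UTF-8 bytes, so leave it untouched
    }
}

var garbled = toByteString("ホワイト");   // one character per UTF-8 byte
var repaired = fromByteString(garbled);  // back to the original katakana
```

If that assumption holds, passing a garbled key through `fromByteString` before the swatch lookup should give back the readable name; a string that is already correct is returned unchanged.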

 

2 replies

Community Expert
March 5, 2025

I'm a bit confused by the Japanese in the script list. The Katakana characters "ホワイト" are a sort of phonetic version of the English word "white". Hiragana would be used to phonetically sound out Japanese words. The actual Japanese word for white is shiro. Google's translate app shows a Kanji glyph: 白 (or the adjective shiroi, 白い).

_wckdTall_
Inspiring
March 5, 2025
That’s very insightful. The Kanji have been pulled from Illustrator files I've received internationally. Perhaps these are legacy localized terms? Or to your point of it being more phonetic, maybe just a “simplified” language. Doesn’t impact the script issue, but I’d be interested to find out.
Community Expert
March 5, 2025

I lived in Japan for three years when I was a kid (grew up a Marine Corps brat). I quickly learned the Katakana alphabet was used to sound out syllables of foreign language words or things like written sound effects in comic books. Schools in Japan put a big priority on students learning English. American music and movies have long been popular in Japan. Our Latin alphabet is used over there in all sorts of things, but people there tend to learn Kana and then Kanji before learning the Latin alphabet. A person who never picked up reading and writing English may only be able to read Japanese glyphs, so the Katakana character set is used to solve that problem. An English word like "white" is sounded out using "ホワイト", but those glyphs really say "ho-wa-i-to". Katakana covers only a limited set of the syllables possible in English.

m1b
Community Expert
March 5, 2025

Hi @_wckdTall_, aside from ensuring the script file is encoded as UTF-8 (I can save and open it with the Japanese characters intact, so I assume it already is), you can also add this to the start of your script and see if it helps:

$.appEncoding = 'UTF-8';

- Mark

_wckdTall_
Author, Correct answer
Inspiring
March 5, 2025

Thanks @m1b !
At the global script level, $.appEncoding = 'UTF-8'; did not work. Placed directly inside the function, it fixes the problem!

I'm using Visual Studio Code and tried to resave my JSX, but there is no encoding option in the save dialog. Is there a header I should include in my file instead?
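
For anyone landing here later, a rough sketch of what "inside the function" can look like. `renameSwatches` is a hypothetical name, the document is passed in rather than read from a global, and the `typeof $` guard is only there so the sketch also runs outside ExtendScript. The bracket lookup on `swatches` mirrors the original snippet.

```javascript
// Sketch: set the encoding inside the function that does the renaming.
function renameSwatches(doc, nameMap) {
    // `$` only exists inside ExtendScript, so guard the assignment.
    if (typeof $ !== "undefined") {
        $.appEncoding = "UTF-8";
    }
    var renamed = [];
    for (var key in nameMap) {
        if (!nameMap.hasOwnProperty(key)) { continue; }
        try {
            doc.swatches[key].name = nameMap[key];
            renamed.push(key); // remember which keys were actually found
        } catch (e) {
            // Swatch not present in this document; skip it.
        }
    }
    return renamed;
}
```

Called as `renameSwatches(aD, swConvert)`, the returned list shows which Japanese keys matched a swatch, which is easier to check than a series of alerts.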

wckdtall
Inspiring
March 6, 2025

@_wckdTall_ That's interesting. When I look at my own code, I do always set it inside a function, but I'm surprised it can't be set just anywhere.

 

About VS Code: you just need to set the encoding on the file. The easiest way is to click the bottom right corner where it shows the current encoding, and choose "Save with Encoding" in the command bar at the top.
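
A possible alternative that does not depend on the file encoding at all, not something tested in this thread: write the Japanese keys as `\uXXXX` escapes so the script file is plain ASCII. The helper below is a hypothetical one-off generator for those escapes; it only handles non-ASCII characters and does not escape quotes or backslashes.

```javascript
// Hypothetical helper: turn a string into an ASCII-only \uXXXX literal body.
function toUnicodeEscapes(text) {
    var out = "";
    for (var i = 0; i < text.length; i++) {
        var code = text.charCodeAt(i);
        if (code < 0x80) {
            out += text.charAt(i); // plain ASCII stays readable
        } else {
            var hex = code.toString(16).toUpperCase();
            while (hex.length < 4) { hex = "0" + hex; }
            out += "\\u" + hex;
        }
    }
    return out;
}

var escapedKey = toUnicodeEscapes("ホワイト");
// The escaped form can be pasted into the script as a string literal:
var sameKey = "\u30DB\u30EF\u30A4\u30C8";
```

The trade-off is readability: the lookup table becomes hard to scan, so a comment with the readable name next to each key helps.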

 

I'm pretty sure there was a case where I had to save a .csv file as "UTF-8 with BOM" for something, maybe to link it to an InDesign file. The BOM (byte order mark) is a few bytes at the start of the file, I think. I doubt you need it here, but I thought I'd mention it.
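
For reference, a small sketch of what the BOM looks like in practice. In UTF-8 it is the three bytes EF BB BF, and once the file is decoded it appears as the single character U+FEFF. The helper names `hasUtf8Bom` and `stripBom` are made up for illustration, and `bytes` is assumed to be an array of byte values.

```javascript
// A UTF-8 BOM is the three bytes EF BB BF at the very start of a file.
function hasUtf8Bom(bytes) {
    return bytes.length >= 3 &&
        bytes[0] === 0xEF && bytes[1] === 0xBB && bytes[2] === 0xBF;
}

// Once decoded, the BOM shows up as the single character U+FEFF.
function stripBom(text) {
    return text.charCodeAt(0) === 0xFEFF ? text.substring(1) : text;
}
```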

- Mark

 

 


Thanks! I did find that my doc is UTF-8. It's weird that the encoding isn't offered during Save As, but that does make it harder to accidentally nuke a file. I tried UTF-16 instead, and it doesn't handle return characters the same way, so I'm going to stick with "$.appEncoding".

Scope-wise, the file holding all these functions is a kind of engine for various functionality across programs (about 4,500 lines), so it's very possible that some glitch is switching the encoding along the way. As it stands, adding things via prototype in ExtendScript, like indexOf, tends to break things.
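
On the prototype point, one likely cause (an assumption, nothing in this thread confirms it) is that a method assigned to Array.prototype is enumerable, so it shows up in every for...in loop over an array. A standalone helper avoids that; `indexOfValue` is a hypothetical name.

```javascript
// Hypothetical helper: search a list without touching Array.prototype.
function indexOfValue(list, value) {
    for (var i = 0; i < list.length; i++) {
        if (list[i] === value) { return i; }
    }
    return -1; // same "not found" convention as Array.prototype.indexOf
}
```

Usage would be `indexOfValue(names, "White")` in place of `names.indexOf("White")`.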