Hello!
This line:
msg = urldecode(msg,"utf-8");
changes the value of msg from %u20AC (euro sign) to %u2AC, which is a problem, because after that I can't decode it on the JavaScript side.
unescape('%u20AC') = '€'
unescape('%u2AC') = '%u2AC'
instead of '%u20AC', so every time I expect the euro sign to be shown I receive '%u2AC'.
Can I modify urldecode or use something different?
Can you please post your code as you run it?
I run
<cfset msg = "%u20AC">
<cfdump var="#urldecode(msg,"utf-8")#">
<cfoutput>#urldecode(msg,"utf-8")#</cfoutput>
and it looks good to me.
What ColdFusion version are you on?
Maybe I am doing something wrong. I have this code inside:
strResult.append(msg);
msg = urldecode(msg,"utf-8");
strResult.append(msg);
and it reveals that some utf-8 characters are not decoded properly.
It is ColdFusion 8.
CF8 might be the problem. CF is up to version 12 (CF2016) now, and CF2018 is on its way. You are way behind in your CF Server version. You should upgrade to at least CF10 (later is better) for security reasons alone.
HTH,
^ _ ^
I found running installations of CF8 with Java 6 and CF9 with Java 7, and both show the same (wrong) behaviour.
No idea how to help, though.
I mean: you can still write EUR or &euro;
Wolf, just to clarify one point you make: CF2018 came out last month:
https://www.carehart.org/blog/client/index.cfm/2018/7/19/whats_new_in_CF2018
I've been very busy, as of late, and didn't see the announcement. (shrug)
V/r,
^ _ ^
Hi @wladekarek, does it help when you replace "%u20AC" with "%E2%82%AC"?
Test this
<cfscript>
msg="It costs %E2%82%AC 5.";
msg = urldecode(msg, "UTF-8");
writeoutput(msg);
</cfscript>
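If the %u20AC string is produced on the JavaScript side, one possible way to get %E2%82%AC in the first place is to encode with encodeURIComponent() rather than the deprecated escape(). This is only a sketch; it assumes the value is being encoded by your own JavaScript, which the thread does not confirm.

```javascript
// The euro sign, written as an escape so the file encoding does not matter.
const euro = "\u20AC";

// escape() is deprecated and emits the non-standard %uXXXX form.
const legacy = escape(euro);

// encodeURIComponent() emits the percent-encoded UTF-8 bytes instead.
const utf8 = encodeURIComponent(euro);

// decodeURIComponent() reverses encodeURIComponent().
const roundTrip = decodeURIComponent(utf8);

console.log(legacy, utf8, roundTrip === euro); // %u20AC %E2%82%AC true
```

On the receiving page, decodeURIComponent() would then take the place of unescape().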
For details see, for example, the StackOverflow post on how to convert from Unicode to UTF-8 Hex. In your particular case, the steps are as follows:
Unicode value = 20AC
Binary value = 10000010101100
The Unicode value U+20AC lies in the range 0x0800 - 0xFFFF, so its UTF-8 encoding uses three bytes of the form:
1110xxxx 10xxxxxx 10xxxxxx
where the x represents digits taken in order from the binary value, progressing from right to left. Starting therefore with the rightmost, and filling the 6 x positions with the digits, we get
10101100 (note: 101100 are the rightmost 6 digits of the binary value, 10000010101100)
Next, the representation in the middle. It is
10000010 (note: 000010 are the next 6 digits of the binary value, 10000010101100, going from right to left)
Lastly, the leftmost representation. It is
1110xx10 (note: 10 are the remaining digits of the binary value, 10000010101100, going from right to left).
This becomes 11100010, as the rules also say that we must replace any remaining x with 0.
The final binary representation is therefore
11100010 10000010 10101100
Converting each byte from binary to hex (E2, 82, AC) and prefixing each with %, we get
%E2%82%AC
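The steps above can be sketched in JavaScript. Both function names below (threeByteUtf8 and percentUToUtf8) are hypothetical helpers made up for illustration, not part of ColdFusion or of any library. The first follows the bit-filling steps by hand for the three-byte range only; the second rewrites every %uXXXX escape in a string using the built-in encodeURIComponent().

```javascript
// Hypothetical helper: encode one code point from the three-byte range
// (0x0800 - 0xFFFF, excluding surrogates) as percent-encoded UTF-8.
function threeByteUtf8(codePoint) {
  const isSurrogate = codePoint >= 0xD800 && codePoint <= 0xDFFF;
  if (codePoint < 0x0800 || codePoint > 0xFFFF || isSurrogate) {
    throw new RangeError("code point is outside the three-byte UTF-8 range");
  }
  const bytes = [
    0xE0 | (codePoint >> 12),         // 1110xxxx: the leftmost remaining bits
    0x80 | ((codePoint >> 6) & 0x3F), // 10xxxxxx: the middle 6 bits
    0x80 | (codePoint & 0x3F),        // 10xxxxxx: the rightmost 6 bits
  ];
  return bytes
    .map((b) => "%" + b.toString(16).toUpperCase().padStart(2, "0"))
    .join("");
}

// Hypothetical helper: rewrite runs of %uXXXX escapes as percent-encoded UTF-8.
// Runs are converted together so that surrogate pairs stay intact.
function percentUToUtf8(text) {
  return text.replace(/(?:%u[0-9a-fA-F]{4})+/g, (run) => {
    const chars = run
      .split("%u")
      .slice(1)
      .map((hex) => String.fromCharCode(parseInt(hex, 16)))
      .join("");
    return encodeURIComponent(chars);
  });
}

console.log(threeByteUtf8(0x20AC));                // %E2%82%AC
console.log(percentUToUtf8("It costs %u20AC 5.")); // It costs %E2%82%AC 5.
```

Running the second helper on the string before it reaches urldecode would give the server a value in the %E2%82%AC form shown above.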