urldecode codes euro sign as %u2AC instead of %u20AC

New Here, Aug 23, 2018

Hello!

This line:

msg = urldecode(msg,"utf-8");

changes the value of msg from %u20AC (the euro sign) to %u2AC, which is a problem because I then can't decode it on the JavaScript side.

unescape('%u20AC') = '€'

unescape('%u2AC') = '%u2AC'

I get '%u2AC' instead of '%u20AC', so every time I expect the euro sign to be shown, I see '%u2AC'.

Can I modify urldecode or use something different?
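
For reference, the JavaScript side behaves this way because `unescape` only recognises `%u` when it is followed by exactly four hex digits; anything else is passed through unchanged. A small check, runnable in Node.js or a browser console (the variable names are only illustrative):

```javascript
// unescape() treats "%u" + exactly four hex digits as one UTF-16 code unit.
const good = unescape('%u20AC');   // well-formed escape
const bad = unescape('%u2AC');     // only three hex digits follow "%u"

console.log(good === '\u20AC');    // true: decoded to the euro sign
console.log(bad === '%u2AC');      // true: left untouched
```

So once the `0` is lost on the server, the client has no way to recover the original character from `%u2AC`.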

Participant, Aug 23, 2018

Can you please post your code as you run it?

I run

<cfset msg = "%u20AC">
<cfdump var="#urldecode(msg,"utf-8")#">
<cfoutput>#urldecode(msg,"utf-8")#</cfoutput>

and it looks good to me.

What ColdFusion version are you on?

New Here, Aug 24, 2018

Maybe I am doing something wrong. I have this code inside:

strResult.append(msg);

msg = urldecode(msg,"utf-8");

strResult.append(msg);

and it reveals that some utf-8 characters are not decoded properly.

It is ColdFusion 8.

LEGEND, Aug 24, 2018

CF8 might be the problem. CF is now up to version 12 (CF2016), and CF2018 is on its way. Your CF server version is far behind. You should upgrade to at least CF10 (later is better), for security reasons alone.

HTH,

^ _ ^

Participant, Aug 24, 2018

I found running installations of CF8 with Java 6 and CF9 with Java 7, and both show the same (wrong) behaviour.

No idea how to help, though.

I mean, you can still write EUR or &euro;

Community Expert, Aug 24, 2018

Wolf, just to clarify one point you make: CF2018 came out last month:

https://www.carehart.org/blog/client/index.cfm/2018/7/19/whats_new_in_CF2018


/Charlie (troubleshooter, carehart.org)

LEGEND, Aug 24, 2018

I've been very busy, as of late, and didn't see the announcement. (shrug)

V/r,

^ _ ^

Community Expert, Aug 25, 2018

Hi @wladekarek, does it help when you replace "%u20AC" with "%E2%82%AC"?

Test this:

<cfscript>

msg="It costs %E2%82%AC 5.";

msg = urldecode(msg, "UTF-8");

writeoutput(msg);

</cfscript>
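
If the `%u20AC` form is being produced in the browser by the legacy `escape()` function (an assumption; the thread does not say how msg is built), one option is to switch the client to `encodeURIComponent()`, which emits the UTF-8 form used above. A minimal sketch in JavaScript, with illustrative variable names:

```javascript
// escape() is a legacy function that emits the non-standard %uXXXX form for
// characters above U+00FF; encodeURIComponent() emits UTF-8 percent-encoding.
const legacy = escape('It costs \u20AC 5.');
const utf8 = encodeURIComponent('It costs \u20AC 5.');

console.log(legacy); // It%20costs%20%u20AC%205.
console.log(utf8);   // It%20costs%20%E2%82%AC%205.

// The UTF-8 form round-trips with the standard decoder.
console.log(decodeURIComponent(utf8) === 'It costs \u20AC 5.'); // true
```

On the way back, `decodeURIComponent()` rather than `unescape()` would be the matching decoder for the UTF-8 form.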

For details see, for example, the StackOverflow post on how to convert from Unicode to UTF-8 Hex. In your particular case, the steps are as follows:

Unicode value = 20AC

Binary value = 10000010101100

The Unicode value U+20AC lies in the range 0x00000800 - 0x0000FFFF, so its UTF-8 representation is three bytes of the form:

   1110xxxx 10xxxxxx 10xxxxxx

where each x is a digit taken from the binary value, working from right to left. Starting with the rightmost byte and filling its 6 x positions, we get

10101100 (note: 101100 are the rightmost 6 digits of the binary value, 10000010101100)

Next, the representation in the middle. It is

10000010 (note: 000010 are the next 6 digits of the binary value, 10000010101100, going from right to left)

Lastly, the leftmost representation. It is

1110xx10 (note: 10 are the remaining digits of the binary value, 10000010101100, going from right to left).

This becomes 11100010, as the rules also say that we must replace any remaining x with 0.

The final binary representation is therefore

11100010   10000010  10101100

Converting each byte to hex and prefixing it with %, we get

%E2%82%AC
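
The steps above can be sketched in JavaScript as follows. `toUtf8Percent` is a hypothetical helper name, not a built-in, and this sketch only covers the three-byte range discussed here (it does not reject surrogate code points):

```javascript
// Encode one code point in the range U+0800..U+FFFF as three UTF-8 bytes,
// following the 1110xxxx 10xxxxxx 10xxxxxx layout, then percent-encode them.
function toUtf8Percent(codePoint) {
  if (codePoint < 0x0800 || codePoint > 0xFFFF) {
    throw new RangeError('this sketch only handles U+0800..U+FFFF');
  }
  const bytes = [
    0xE0 | (codePoint >> 12),          // 1110xxxx: remaining top bits
    0x80 | ((codePoint >> 6) & 0x3F),  // 10xxxxxx: middle 6 bits
    0x80 | (codePoint & 0x3F),         // 10xxxxxx: rightmost 6 bits
  ];
  return bytes
    .map((b) => '%' + b.toString(16).toUpperCase().padStart(2, '0'))
    .join('');
}

console.log(toUtf8Percent(0x20AC)); // %E2%82%AC
```

In practice `encodeURIComponent('\u20AC')` gives the same result; the helper is only there to make the bit manipulation visible.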
