urldecode codes euro sign as %u2AC instead of %u20AC

New Here ,
Aug 23, 2018

Hello!

This line:

msg = urldecode(msg,"utf-8");

changes the value of msg from %u20AC (the euro sign) to %u2AC, which is a problem because after that I can't decode it on the JavaScript side.

unescape('%u20AC') = '€'

unescape('%u2AC') = '%u2AC'

The string should have stayed '%u20AC', so every time I expect the euro sign to be shown I receive '%u2AC' instead.

Can I modify urldecode or use something different?

Participant ,
Aug 23, 2018

Can you please post your code as you run it?

I run

<cfset msg = "%u20AC">
<cfdump var="#urldecode(msg,"utf-8")#">
<cfoutput>#urldecode(msg,"utf-8")#</cfoutput>

and it looks good to me.

What ColdFusion version are you on?

New Here ,
Aug 24, 2018

Maybe I am doing something wrong. This is the code I have:

strResult.append(msg);

msg = urldecode(msg,"utf-8");

strResult.append(msg);

and it reveals that some utf-8 characters are not decoded properly.

It is ColdFusion 8.

LEGEND ,
Aug 24, 2018

CF8 might be the problem. CF is now up to version 12 (CF2016), and CF2018 is on its way. You are way behind in your CF server version. You should upgrade to at least CF10 (later is better) for security reasons alone.

HTH,

^ _ ^

Participant ,
Aug 24, 2018

I found running installations of CF8 with Java 6 and CF9 with Java 7, and both show the same wrong behaviour.

No idea how to help, though.

I mean: you can still write EUR or &euro;

Adobe Community Professional ,
Aug 24, 2018

Wolf, just to clarify one point you make: CF2018 came out last month:

https://www.carehart.org/blog/client/index.cfm/2018/7/19/whats_new_in_CF2018

/Charlie (server troubleshooter, carehart.org)

LEGEND ,
Aug 24, 2018

I've been very busy as of late and didn't see the announcement. (shrug)

V/r,

^ _ ^

BKBK
Adobe Community Professional ,
Aug 25, 2018

Hi @wladekarek, does it help when you replace "%u20AC" with "%E2%82%AC"?

Test this

<cfscript>

msg="It costs %E2%82%AC 5.";

msg = urldecode(msg, "UTF-8");

writeoutput(msg);

</cfscript>
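
If the string is built in the browser, the two forms come from different JavaScript functions. The sketch below assumes the value is encoded on the JavaScript side, which the thread does not confirm; it only shows that the legacy escape() produces the %uXXXX form while encodeURIComponent() produces the UTF-8 form suggested above.

```javascript
// "\u20AC" is the euro sign.
const euroSign = "\u20AC";

// Legacy escape() emits the non-standard %uXXXX form for this character.
const legacyForm = escape(euroSign);
console.log(legacyForm); // %u20AC

// encodeURIComponent() emits the percent-encoded UTF-8 bytes instead.
const utf8Form = encodeURIComponent(euroSign);
console.log(utf8Form); // %E2%82%AC

// The UTF-8 form round-trips through decodeURIComponent().
const roundTrip = decodeURIComponent(utf8Form);
console.log(roundTrip === euroSign); // true

// unescape() needs four hex digits after %u, so the damaged value is left as-is.
const damaged = unescape("%u2AC");
console.log(damaged); // %u2AC
```

Whether switching the encoder is an option depends on where msg originates, so treat this as one possible direction rather than a confirmed fix.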

For details see, for example, the StackOverflow post on how to convert from Unicode to UTF-8 Hex. In your particular case, the steps are as follows:

Unicode value = 20AC

Binary value = 10000010101100

The Unicode value U+20AC lies in the range 0x0800 - 0xFFFF, so its UTF-8 encoding takes three bytes of the form:

   1110xxxx 10xxxxxx 10xxxxxx

where each x is a digit taken from the binary value, working from right to left. Starting with the rightmost byte and filling its 6 x positions, we get

10101100 (note: 101100 are the rightmost 6 digits of the binary value, 10000010101100)

Next, the representation in the middle. It is

10000010 (note: 000010 are the next 6 digits of the binary value, 10000010101100, going from right to left)

Lastly, the leftmost representation. It is

1110xx10 (note: 10 are the remaining digits of the binary value, 10000010101100, going from right to left).

This becomes 11100010, as the rules also say that we must replace any remaining x with 0.

The final binary representation is therefore

11100010   10000010  10101100

Converting each byte to hex and prefixing it with %, we get

%E2%82%AC
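
The same steps can be sketched in JavaScript with bit shifts. The function name percentEncodeThreeByte is hypothetical, made up for this illustration, and it only covers the three-byte range discussed above; the result is checked against the built-in encodeURIComponent().

```javascript
// Percent-encode one code point from the three-byte UTF-8 range (U+0800 to U+FFFF).
// Hypothetical helper for illustration; surrogates (U+D800 to U+DFFF) are not handled.
function percentEncodeThreeByte(codePoint) {
  if (codePoint < 0x0800 || codePoint > 0xFFFF) {
    throw new RangeError("code point is outside the three-byte UTF-8 range");
  }
  const bytes = [
    0xE0 | (codePoint >> 12),         // 1110xxxx: the remaining high bits
    0x80 | ((codePoint >> 6) & 0x3F), // 10xxxxxx: the next 6 bits
    0x80 | (codePoint & 0x3F),        // 10xxxxxx: the rightmost 6 bits
  ];
  // Write each byte as two uppercase hex digits with a leading %.
  return bytes
    .map((b) => "%" + b.toString(16).toUpperCase().padStart(2, "0"))
    .join("");
}

const encodedEuro = percentEncodeThreeByte(0x20AC);
console.log(encodedEuro); // %E2%82%AC
console.log(encodedEuro === encodeURIComponent("\u20AC")); // true
```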
