I'm using Canonicalize to check URLs and have run into a problem. For some reason, &ne always seems to be translated as ≠, even in situations where that is neither expected nor appropriate.
<cfset varURL = "www.mySite.com/myPage.cfm?someVar=abc&newVar=1" />
<cfset varCheck = Canonicalize(varURL,true,true)/>
The value of varCheck will be "www.mySite.com/myPage.cfm?someVar=abc≠wVar=1", not "www.mySite.com/myPage.cfm?someVar=abc&newVar=1" as I'm expecting.
It seems to make this translation whenever a URL variable (other than the first one following the ?) starts with "ne".
Other than renaming the URL variables that start with "ne", is there a fix for this problem?
I'm running CF2016 Update 6.
Thanks, WolfShade, I had not reported it yet. Will do so.
Quite likely a bug indeed, the main reason being that canonicalize is evaluating the HTML entities even though they lack ";" at the end.
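To see the missing-semicolon behaviour side by side, a small repro along these lines may help. It is only a sketch: the second URL and the variable names samples, sample and decoded are made up for illustration, and the output is not guaranteed to be the same on every ColdFusion version or update level.

```cfml
<!--- Hypothetical repro: only the first URL comes from the original post. --->
<cfset samples = [
    "www.mySite.com/myPage.cfm?someVar=abc&newVar=1",
    "www.mySite.com/myPage.cfm?someVar=abc&pageVar=1"
] />
<cfoutput>
<cfloop array="#samples#" index="sample">
    <!--- Both restrict flags are false so the input is not rejected for multiple or mixed encoding. --->
    <cfset decoded = Canonicalize(sample, false, false) />
    <!--- encodeForHTML keeps the browser from interpreting "&ne" itself when the strings are displayed. --->
    #encodeForHTML(sample)# -> #encodeForHTML(decoded)#<br>
</cfloop>
</cfoutput>
```

Comparing the two lines of output shows whether only parameter names that begin with an entity name (such as "ne") are affected.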
That said, I still cannot see the use-case for applying canonicalize to URLs. It implies the URL contains HTML entities.
With URLs the functions to use usually go in the other direction. That is, encodeForHTML, encodeForURL and URLEncodedFormat, which practically "uncanonicalize" the input.
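As a rough illustration of that direction, the sketch below encodes a parameter value when building a link and encodes the finished link when displaying it. The values and the variable names paramValue and link are hypothetical.

```cfml
<!--- Hypothetical parameter value containing characters that are special in a query string. --->
<cfset paramValue = "a&b=c d" />
<!--- encodeForURL is applied to the value only, not to the whole URL. --->
<cfset link = "myPage.cfm?someVar=" & encodeForURL(paramValue) & "&newVar=1" />
<cfoutput>
    <!--- encodeForHTML is applied at output time, when the link is shown as text on a page. --->
    #encodeForHTML(link)#
</cfoutput>
```

The encoding is applied at the point where the value enters a URL or a page, rather than decoding the whole URL up front.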
Why are you using canonicalize on the URL anyway? There is apparently no reason for it. The function you need is encodeForHTML.
In any case, the behaviour you observe is not unexpected. With canonicalize, when ColdFusion sees &, it does its best to convert any HTML entity it finds. I would therefore expect it to convert such substrings as &gt;, &lt;, &amp; and &nbsp; respectively to >, <, & and the space character.
@BKBK, I'm not outputting it. I know to use EncodeFor when appropriate. I was using it as a first check to see if the URL was valid, but it was failing on what appeared to be valid URLs.
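For a first validity check that does not decode entities, one option might be isValid with the "url" type. This is a minimal sketch, not a confirmed fix for the Canonicalize behaviour: it assumes the URL includes a scheme (the "http://" prefix is added here for illustration and is not in the original example), and it is worth testing against your own URLs before relying on it.

```cfml
<!--- Assumption: the URL carries a scheme; "http://" is added here for illustration only. --->
<cfset varURL = "http://www.mySite.com/myPage.cfm?someVar=abc&newVar=1" />
<cfif isValid("url", varURL)>
    <cfoutput>URL looks well formed</cfoutput>
<cfelse>
    <cfoutput>URL rejected</cfoutput>
</cfif>
```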