
CF8 - XmlFormat not escaping High ASCII characters

New Here, Apr 14, 2008
In CF8, we have a problem where XmlFormat is not escaping high ASCII characters. This was working just fine on our CF7 instance, but in CF8 it is not escaping all characters. I am aware of the long-standing problem with escaping Windows-1252 characters, but now we are seeing an issue with basic high ASCII characters, like chr(233) (é) and chr(244) (ô). Is anyone else experiencing this issue? We have not installed Update 1 to CF8 yet. I don't see a fix for this in the release notes. Does anyone know whether the updater fixes it?

Here is a test to demonstrate the issue:

<cfset myString = "The Islamic Republic of Mauritania's (République Islamique de Mauritanie) 2007 estimated population is 3,270,000. Cote d'Ivoire and Côte d'Ivoire">

<cfset myNewString = XmlFormat(myString)>

<cfoutput>#myNewString#</cfoutput>

New Here, Apr 18, 2008
Anyone else having this problem?

Participant, Apr 18, 2008
Honestly not trying to be funny, but this is what XMLFormat is supposed to do. These get changed into their "escaped" characters:

<
>
'
"
&

And most painfully (the docs say - this problem kicked my butt before): "High ASCII characters in the range of 128-255".

Hope this helps. Why are you using XMLFormat, might I ask? If you can post what you are trying to do, maybe someone can give you a hand.

- Mike

New Here, Apr 19, 2008
Michael,

A bad choice of words on my part... I did not mean physically "remove" those characters, but in fact, escape them...

Since moving to CF8, we are finding that XmlFormat is not "escaping" all characters in the High ASCII range of 128-255.

Here is an example. If you run this in CF8, and this is actually a bug, you should see that the é and the ô are not being escaped. I am just trying to find out whether others are experiencing the same problem, or whether this is in fact a new bug.

<cfset myString = "The Islamic Republic of Mauritania's (République Islamique de Mauritanie) 2007 estimated population is 3,270,000. Also check Côte d'Ivoire">

<cfset myNewString = XmlFormat(myString)>

<cfoutput>#myNewString#</cfoutput>
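
One thing that may be worth ruling out with a test like this: a browser renders a numeric character reference as the character it stands for, so the page can look the same whether or not XmlFormat escaped anything. Below is a minimal sketch, assuming the page is viewed in a browser. It builds the accented characters with Chr() so the result does not depend on how the .cfm file was saved, and it passes the result through HTMLEditFormat so the raw escaped text is visible. The shortened test string is only for illustration.

```cfml
<!--- Build the string from code points so file encoding is not a factor --->
<cfset myString = "R" & Chr(233) & "publique Islamique de Mauritanie, C" & Chr(244) & "te d'Ivoire">
<cfset myNewString = XmlFormat(myString)>

<cfoutput>
<!--- The browser decodes any character references here --->
Rendered by the browser: #myNewString#<br>
<!--- HTMLEditFormat escapes the ampersands, so references show as plain text --->
Raw escaped text: #HTMLEditFormat(myNewString)#
</cfoutput>
```

If the second line still shows a literal é or ô, the character was not escaped. If it shows a reference beginning with an ampersand and a number sign, it was. Viewing the page source of the first line should show the same thing.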

Community Expert, Apr 20, 2008
Mike touched on the point. It has to do with a requirement for high ASCII characters in the range 128-255. In fact, the function xmlFormat does escape those characters in MX7 as well as in CF8, but there is a catch.

The documentation gives a clue when it says the character is "replaced by unicode escape sequence". You should therefore ensure that the page encoding is Unicode. One way to do so is, for example:

<cfprocessingdirective pageencoding="utf-8">

<cfset myString = "The Islamic Republic of Mauritania's (République Islamique de Mauritanie) 2007 estimated population is 3,270,000. Also check Côte d'Ivoire">

<cfset myNewString = XmlFormat(myString)>

<cfoutput>#myNewString#</cfoutput>
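
To check whether the page encoding really is the variable here, it may help to look at the code points ColdFusion actually sees in the string. Below is a rough sketch, assuming the file is saved as UTF-8. It lists every character above 127 together with its Asc() value. The loop variables i and ch are just illustrative names.

```cfml
<cfprocessingdirective pageencoding="utf-8">

<cfset myString = "République Islamique de Mauritanie, Côte d'Ivoire">

<cfoutput>
<cfloop from="1" to="#Len(myString)#" index="i">
    <cfset ch = Mid(myString, i, 1)>
    <!--- Report only characters outside the 7-bit ASCII range --->
    <cfif Asc(ch) GT 127>
        Position #i#: code point #Asc(ch)#<br>
    </cfif>
</cfloop>
</cfoutput>
```

When the file is read correctly, é should come out as a single character with value 233. A pair such as 195 followed by 169 would suggest the UTF-8 bytes were read under a single-byte encoding. Running the same page with the cfprocessingdirective line removed gives a comparison.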

New Here, Apr 20, 2008
BKBK,

Thanks for the info. Adding the processingdirective does help show that these characters are being escaped. However, the behavior has changed somewhat between CF7 and CF8: we were not using a processingdirective in CF7, and this was working as advertised.

Where this is giving us a problem is after we create an XML document using CFXML, (ensuring that we XmlFormat any strings), we then validate that document against a schema, and we are all of a sudden getting errors during validation for invalid characters within the XML. We are using ToString() after creating the XML document with CFXML, and our process is the same as we were using in CF7. That is why I was curious if anyone else was having this same issue... because something definitely changed between CF7 and CF8 with XML processing.
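
For anyone trying to reproduce this, the process described above might look roughly like the sketch below. It is an assumption about the general shape, not our actual code: the country and name elements, the countryName variable, and the country.xsd schema file are all hypothetical.

```cfml
<!--- Hypothetical value; Chr(244) is the o with circumflex --->
<cfset countryName = "C" & Chr(244) & "te d'Ivoire">

<!--- Build the XML document, escaping the string with XmlFormat --->
<cfxml variable="xmlDoc">
<cfoutput><country><name>#XmlFormat(countryName)#</name></country></cfoutput>
</cfxml>

<!--- Serialize the document object back to text --->
<cfset xmlText = ToString(xmlDoc)>

<!--- Validate against a schema file (country.xsd is a placeholder name) --->
<cfset result = XmlValidate(xmlText, ExpandPath("country.xsd"))>

<cfoutput>
Valid: #result.status#<br>
<cfloop array="#result.errors#" index="msg">Error: #HTMLEditFormat(msg)#<br></cfloop>
<cfloop array="#result.fatalErrors#" index="msg">Fatal: #HTMLEditFormat(msg)#<br></cfloop>
</cfoutput>
```

Dumping xmlText through HTMLEditFormat before validating may show whether the accented characters are already damaged at the ToString() step or only at validation.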

Community Expert, Apr 20, 2008
Milpool2000 wrote:
"Adding the processingdirective does help show that these characters are being escaped, however, the behavior has changed somewhat between CF7 and CF8, as we were not using a processingdirective in CF7, and this was working as advertised."

I see your point. Both MX7 and CF8 are supposed to use UTF-8 encoding by default to return text.

Milpool2000 wrote:
"We are using ToString() after creating the XML document with CFXML, and our process is the same as we were using in CF7."

Can't say. ToString() has something to do with encoding, and encoding has something to do with the Java version: MX7 uses Java 1.4, whereas ColdFusion 8 uses Java 6. Could there be something there?

New Here, Apr 21, 2008
Using ToString() does set the encoding of the XML file to UTF-8, but there could be some difference in how Java 6 processes the XML... not sure. We installed CF8 Update 1 and the update to the JVM, but the same issue persists. We can work around it, but it is annoying at best.
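
For illustration only, one possible shape for such a workaround is a small helper that replaces anything above 127 with a decimal character reference after XmlFormat has run. This is a sketch under assumptions, not a confirmed fix: escapeNonAscii is a hypothetical name, and it only helps where the XML text is assembled as a string, since a parser turns the references back into characters.

```cfml
<cffunction name="escapeNonAscii" returntype="string" output="false">
    <cfargument name="text" type="string" required="true">
    <cfset var result = "">
    <cfset var i = 0>
    <cfset var ch = "">
    <cfset var code = 0>
    <cfloop from="1" to="#Len(arguments.text)#" index="i">
        <cfset ch = Mid(arguments.text, i, 1)>
        <cfset code = Asc(ch)>
        <cfif code GT 127>
            <!--- "##" is a literal number sign inside a CFML string --->
            <cfset result = result & "&##" & code & ";">
        <cfelse>
            <cfset result = result & ch>
        </cfif>
    </cfloop>
    <cfreturn result>
</cffunction>

<!--- Run XmlFormat first so the ampersands added here are not escaped again --->
<cfset safeString = escapeNonAscii(XmlFormat(myString))>
```

The order matters: calling XmlFormat on the helper's output would escape the ampersands of the new references. Characters outside the Basic Multilingual Plane are not handled by this sketch, because it works one UTF-16 unit at a time.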