The error is:
XML Parsing Error: not well-formed
This is caused by a hidden/invisible character in some text that comes from a database text field and is used when creating an XML file.
The problem text/white space character is whatever is between the "d" and the word "shell" in the following text: "d shell d"
On this forum it has copied in as a visible space, but when it is viewed in the Firefox page source or in the Firefox error message it does not appear and the text looks like this: "dshelld". However, it still takes 2 cursor movements to go between the "d" and the "s" at the start, and between the "l" and the "d" at the end of the word, so whatever it is, it is still there but hidden.
Is there any way I can find out what this white space actually is? It does not show up when I display hidden characters in DW, MS Word, or Open Office Write.
It is not recognised as a space (i.e. " ") by replace(), but it is removed by reReplaceNoCase(fieldReturn, "[[:space:]]", "", "all"); unfortunately, so is other essential white space.
So it is something in the character set "[[:space:]]" but what?
<cfset aString= "Something I want to investigate">
<cfoutput>
<cfloop from="1" to="#len(aString)#" index="char">
#mid(aString,char,1)#=#asc(mid(aString,char,1))#<br>
</cfloop>
</cfoutput>
This will output the code of each character. See what the code is and work from there.
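When the string is long, it can help to report only the suspicious characters. The following is a minimal sketch of the same idea, not a definitive implementation: the sample string is hypothetical (it embeds chr(28) and chr(29) between the visible letters), and the cut-off of 32 is an assumption that the culprit is a control character.

```cfml
<cfscript>
// Hypothetical sample: chr(28) and chr(29) stand in for the invisible characters.
aString = "d" & chr(28) & "shell" & chr(29) & "d";

for (i = 1; i <= len(aString); i++) {
    code = asc(mid(aString, i, 1));
    // Report only characters with a code below 32 (the control range).
    if (code < 32) {
        writeOutput("Position #i#: decimal #code#, hex #formatBaseN(code, 16)#<br>");
    }
}
</cfscript>
```

The hex value is printed alongside the decimal one because XML parser errors usually quote the character in hex.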
So it is something in the character set "[[:space:]]" but what?
It could be one or more of tab, line-feed or carriage-return, that is, one or more of chr(9), chr(10) or chr(13). Browsers do not display them as such; when rendering HTML they collapse runs of such white space into a single space. See for yourself.
<cfset testString= "d
shell
d">
<cfoutput>#testString#</cfoutput>
Thanks for your suggestions. The code from Ian will help in other situations, but I found the answer to this after receiving the following error:
An invalid XML character (Unicode: 0x1c)
This led me to this post and the code below:
http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:51172#274704
<cfset newxml = rereplace(oldxml, "[\x00-\x1f]", " ", "All")>
Then, by working through all the Unicode control codes, I eventually found the problem codes to be x1c, x1d and x0c, and I replaced each of them like this:
fieldReturn = rereplace(fieldReturn, "\x1c", " ", "All");
... etc.
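Note that the pattern "[\x00-\x1f]" quoted above also replaces tab (x09), line feed (x0a) and carriage return (x0d), which XML 1.0 does allow. As a rough sketch, assuming you want to keep those three and clean everything else in that range in one pass (the sample value of fieldReturn here is hypothetical):

```cfml
<cfscript>
// Hypothetical field value: three problem characters plus a legitimate line feed.
fieldReturn = "d" & chr(28) & "shell" & chr(29) & "d" & chr(12) & chr(10) & "next line";

// Replace every control character in x00-x1f except tab (x09),
// line feed (x0a) and carriage return (x0d) with a single space.
fieldReturn = reReplace(fieldReturn, "[\x00-\x08\x0b\x0c\x0e-\x1f]", " ", "all");
</cfscript>
```

This avoids listing the codes one by one, at the cost of also replacing control characters you have not seen in the data yet, which may or may not be what you want.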
Ian's test would have picked them up, too.