Inspiring
June 12, 2008
Question

Coldfusion generate XML from recordset

As it says on the tin, I am trying to generate an XML document from my MySQL recordset, but I keep getting errors saying that pound or amp have been referenced. The characters involved are &, £, (R) and (TM).

I have read that this is because of XML's use of the & symbol.

Is there any way I can use this data in an XML document?

Any help appreciated.

Thanks,
Paul

    3 replies

    BKBK
    Community Expert
    June 19, 2008
    An ASCII table tells me 0x13 [=chr(19)] is the Device Control Three or DC3 character. It is not a printable character, but an instruction to a device. Much like 0xD [=chr(13)] stands for Carriage Return.

    Since it doesn't say anything to you, you probably don't need it. You can delete it from the XML content using code like

    <cfset xmlContent = replace(xmlContent,chr(19),"","all")>
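If other control characters turn up, the same idea can be widened. The following is a minimal sketch, assuming the content is already held in a variable named xmlContent as in the line above; the sample string and the loop variable name `code` are only for illustration. It removes every character code from 1 to 31 except tab (9), line feed (10) and carriage return (13), which are the only ones in that range that XML 1.0 permits.

```cfml
<!--- Sample content containing a DC3 character, chr(19); illustrative only --->
<cfset xmlContent = "<title>Fish" & chr(19) & " and chips</title>">

<!--- Remove the control characters that XML 1.0 does not allow --->
<cfloop from="1" to="31" index="code">
    <cfif code NEQ 9 AND code NEQ 10 AND code NEQ 13>
        <cfset xmlContent = replace(xmlContent, chr(code), "", "all")>
    </cfif>
</cfloop>
```

It is probably still worth finding out how the stray character got into the database, since stripping it only hides the symptom.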

    Incidentally, your DTD shouldn't contain the word item from my example unless, of course, your XML, too, contains a root-element of that name. For example, you started off with rss.



    BKBK
    Community Expert
    June 17, 2008
    Simon's advice is solid. You strayed from it in two ways. Your code will work when you make the corrections.

    First, the doctype line should come directly after the XML declaration. Secondly, there is an extraneous > symbol at the end of the doctype line. An illustration follows
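The original illustration is not present in this copy of the thread. As a stand-in, here is a rough sketch of the layout described, not BKBK's own example. It assumes a root element named rss and the two entity declarations discussed elsewhere in the thread; the variable names xmlText and doc and the sample title are hypothetical.

```cfml
<!--- The XML declaration comes first, the DOCTYPE directly after it,
      and the DOCTYPE closes with a single "]>" --->
<cfsavecontent variable="xmlText"><?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE rss [
<!ENTITY pound "&#163;">
<!ENTITY reg "&#174;">
]>
<rss version="2.0">
  <channel>
    <title>Special offer: &pound;10</title>
  </channel>
</rss>
</cfsavecontent>

<!--- trim() guards against stray whitespace around the captured text --->
<cfset doc = xmlParse(trim(xmlText))>
```

The DOCTYPE name has to match the root element, so it is rss here. If the same markup is placed inside a cfoutput block, each # in the entity declarations would need to be doubled (##), since ColdFusion treats # as an expression delimiter there.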


    Inspiring
    June 17, 2008
    Cheers for the pointer BKBK,

    This is my first real foray into XML, let alone XML from ColdFusion.

    I have amended the script as you suggested, but am now getting a message saying:
    An invalid XML character (Unicode: 0x13) was found in the element content of the document.

    I have gone through all my database fields and have made sure that any references to &, £ and (R) have been replaced with &amp;, &pound; and &reg; respectively. I also added the following into my XML Doctype:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE item [
    <!ENTITY pound "&#163;">
    <!ENTITY reg "&#174;">
    <!ENTITY amp "&#38;">
    ]>

    Can you give me a clue as to what this Unicode: 0x13 is? A Google search for it just seems to return lists of various quotes and accents.

    Thanks,
    Paul
    Inspiring
    June 12, 2008
    try this

    http://www.xmlnews.org/docs/xml-basics.html

    section 6. Text

    If you are working with 8-bit characters, you can usually type printing characters from the 7-bit (non-accented) US-ASCII character set directly into an XML document, except for the special characters “<” and “&”, and sometimes, “>” (it's best to escape it as well just to be safe). Whenever you need to include one of these three characters in the text of an XML document, simply escape it using an entity reference as described in the References section:

    <formula>x &lt; (x + 1)</formula>

    For “<”, use “&lt;”, for “&”, use “&amp;”, and for “>”, use “&gt;”.
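In ColdFusion, the built-in xmlFormat() function applies this escaping, so it does not have to be done by hand. Below is a minimal sketch of building XML from a recordset, assuming the columns hold raw text (not text that already contains entities such as &amp;, which would end up escaped twice). The query `products`, its `title` column and the element names are hypothetical stand-ins for a real MySQL result.

```cfml
<!--- Hypothetical recordset standing in for the MySQL query result --->
<cfset products = queryNew("title")>
<cfset queryAddRow(products)>
<cfset querySetCell(products, "title", "Fish & Chips < 5")>

<!--- xmlFormat() escapes &, <, > and quotes in each value --->
<cfsavecontent variable="xmlText">
<items>
<cfoutput query="products">  <item><title>#xmlFormat(products.title)#</title></item>
</cfoutput></items>
</cfsavecontent>

<!--- If the escaping worked, the result parses without error --->
<cfset doc = xmlParse(trim(xmlText))>
```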

    Above character position 127, things become a little trickier on some systems, because by default XML uses UTF-8 for 8-bit character encoding rather than ISO-8859-1 (Latin Alphabet # 1), which HTML and many computer operating systems use by default. UTF-8 and ISO-8859-1 are both essentially identical with US-ASCII up to position 127; for higher characters (those with accents), UTF-8 uses multi-byte escape sequences.

    That means that in a UTF-8 XML document, you cannot simply use a single byte with decimal value 233 to represent “é” (and there is no predefined &eacute; entity as there is in HTML); instead, you must either enter the UTF-8 multi-byte escape sequence, or use a special kind of XML reference called a character reference:

    <p>That is everyone's favourite caf&#233;.</p>

    When your text consists primarily of unaccented Roman characters, this is often the easiest way to escape the occasional accented or non-Roman character. Since “é” appears at position 233 in Unicode (as in ISO-8859-1), the XML parser will read the string correctly as “That is everyone's favourite café.”
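The character-reference approach can be automated in ColdFusion. The sketch below is one possible way to do it, not something taken from the quoted article: it walks a string and rewrites every character above position 127 as a numeric character reference. The variable names input, output, i and ch are illustrative.

```cfml
<!--- Sample text ending in e-acute, chr(233); illustrative only --->
<cfset input = "caf" & chr(233)>
<cfset output = "">

<cfloop from="1" to="#len(input)#" index="i">
    <cfset ch = mid(input, i, 1)>
    <cfif asc(ch) GT 127>
        <!--- "##" produces a literal # inside a CFML string --->
        <cfset output = output & "&##" & asc(ch) & ";">
    <cfelse>
        <cfset output = output & ch>
    </cfif>
</cfloop>
```

For the sample above, output ends up holding caf&#233; which matches the form used in the article's example.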
    Inspiring
    June 12, 2008
    Thanks for your reply,

    I am not sure if I have misunderstood what you have said, or if you have misunderstood me. My database holds &amp; and &pound; (not & and £), but when I try to put this into an XML document, it says:

    An error occured while Parsing an XML document.
    The entity "pound" was referenced, but not declared.

    Thanks,
    Paul
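That error comes up because XML predefines only five named entities (&amp;, &lt;, &gt;, &quot; and &apos;); &pound; and &reg; exist in HTML but not in XML. One way around it, sketched below under the assumption that the stored text contains only those two HTML entities, is to swap them for numeric character references before building the document. The variable names fieldValue and doc, the item wrapper element and the sample text are hypothetical.

```cfml
<!--- Sample field value as stored in the database; illustrative only --->
<cfset fieldValue = "Tea &amp; cake, &pound;5 &reg;">

<!--- Swap HTML-only named entities for numeric character references.
      "##" is how a literal # is written inside a CFML string. --->
<cfset fieldValue = replace(fieldValue, "&pound;", "&##163;", "all")>
<cfset fieldValue = replace(fieldValue, "&reg;", "&##174;", "all")>

<!--- &amp; is predefined in XML, so it can stay as it is --->
<cfset doc = xmlParse("<item>" & fieldValue & "</item>")>
```

Numeric references need no declaration, so this route avoids having to maintain a DOCTYPE at all.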
    Inspiring
    June 14, 2008
    Anyone?