Inspiring
April 16, 2008
Answered

problem with xmlParse()

  • April 16, 2008
  • 13 replies
  • 2380 views
I can not get around this problem... I have tried:
<cfset model = xmlParse(#cfhttp.fileContent#)>
<cfset model = xmlParse(cfhttp.fileContent)>
<cfset model = xmlParse(ToString(cfhttp.fileContent))>
<cfset model = xmlParse(Trim(cfhttp.fileContent))>
<cfset model = xmlParse(Trim(ToString(cfhttp.fileContent)))>

All of these give me the error : 'content is not allowed in prolog'...

What is the deal??? What is the syntax for this?
This topic has been closed for replies.
13 replies

BKBK
Community Expert
May 4, 2008
PaulH,

I am aware of the requirement that an XML parser be able to determine two types of information from the BOM or even from the XML declaration. First, whether the storage type is big-endian or little-endian. Second, whether the document's encoding uses 1, 2 or 4 bytes per character. From there, the parser can get a good idea of the type of encoding.

What made me think it would be difficult is the fact that XMLParse also has the function of validating XML by means of an external DTD. I can see the point you make about UTF-16. It would be easy to redesign XMLParse for XML in UTF-16 encoding. As you said, the BOM is a required feature for XML documents in UTF-16 encoding as well as for their DTDs. However, for most other encodings, there is no requirement for either BOM or XML declaration in the DTD.

Inspiring
April 19, 2008
BKBK wrote:
> Difficult. BOM is related to unicode, but xmlParse doesn't have any attribute
> that relates to encoding.

no, not difficult, and nothing to do w/the developer knowing the encoding beforehand.

if there's a BOM & the *xml* is declaring itself to be utf-8 encoded, the parser
can toss the BOM, it has no relevance for parsing the xml (the BOM is only really
relevant if the encoding is utf-16 or i guess utf-32). if there's no xml
encoding declaration then the parser would have to look at & use the BOM to see
the encoding & which "endianness" to use.

if the parser can't handle BOMs (as far as i can tell from the W3C WG it's
supposed to) then any valid utf-16, etc. encoded xml docs will also bomb--the BOM is
*required* for those.
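
The BOM-to-encoding step described above could be sketched roughly like this. This is an illustration only, not how xmlParse() works internally: sniffBomEncoding is a made-up helper name, and it assumes the response was fetched as binary (for example with getAsBinary="yes" on cfhttp) so the raw bytes are still available.

```cfml
<!--- sketch only: sniffBomEncoding is a hypothetical helper, not a built-in --->
<cffunction name="sniffBomEncoding" returntype="string" output="false">
    <cfargument name="bytes" type="binary" required="true">
    <!--- hex-encode the bytes so the leading BOM can be compared as text --->
    <cfset var hex = ucase(binaryEncode(arguments.bytes, "hex"))>
    <!--- EF BB BF is the UTF-8 encoded form of U+FEFF --->
    <cfif left(hex, 6) eq "EFBBBF"><cfreturn "utf-8"></cfif>
    <!--- FE FF: UTF-16, big-endian --->
    <cfif left(hex, 4) eq "FEFF"><cfreturn "utf-16be"></cfif>
    <!--- FF FE: UTF-16, little-endian (a UTF-32 little-endian BOM also starts
          with these two bytes; this sketch does not tell them apart) --->
    <cfif left(hex, 4) eq "FFFE"><cfreturn "utf-16le"></cfif>
    <!--- no BOM found: fall back to the xml encoding declaration --->
    <cfreturn "">
</cffunction>
```

An empty return value here just means "no BOM", in which case the encoding declared in the xml itself is all there is to go on.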
BKBK
Community Expert
April 19, 2008
The REReplaceNoCase line in my code removes the BOM as well, so it may seem like a duplication, coming after PaulH's code. There was a delay in the post from the newsgroup. When I composed my last post, PaulH's post of 04/17/2008 06:40:57 AM wasn't yet in the forum.

PaulH wrote:
> after reading some W3C stuff on xml & BOMs i'm thinking
> that maybe xmlParse() should actually handle this.

Difficult. BOM is related to unicode, but xmlParse doesn't have any attribute that relates to encoding.

Inspiring
April 17, 2008
djc11 wrote:
> Thanks Paul, I looked for a BOM after reading another post, but didn't see one
> so i didn't even try it :-P Works perfect now. Thank you again!

well you're not supposed to see one, but if you need to check, write the content
out w/cffile & use iso-8859-1 as the "charset". if you open that file in notepad
you should see garbage (a question mark) before the 1st xml tag. if you use utf-8
as the "charset" you won't see anything but notepad will want to save that file as
utf-8.
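
A rough sketch of that check, in case it helps. The file name bomcheck.txt is made up for illustration, and toString() is only there on the assumption that fileContent might not come back as a plain string:

```cfml
<!--- sketch only: bomcheck.txt is a hypothetical file name --->
<cfset body = toString(cfhttp.fileContent)>

<!--- asc() returns the code of the first character;
      65279 (U+FEFF) means the content starts with a BOM --->
<cfoutput>first character code: #asc(body)#</cfoutput>

<!--- U+FEFF has no iso-8859-1 equivalent, so it should show up
      as "?" at the very start of this file --->
<cffile action="write"
        file="#expandPath('./bomcheck.txt')#"
        output="#body#"
        charset="iso-8859-1"
        addnewline="no">
```

The asc() line is probably the quicker of the two checks, since it does not depend on how an editor decides to display the file.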

after reading some W3C stuff on xml & BOMs i'm thinking that maybe xmlParse()
should actually handle this.
djc11Author
Inspiring
April 17, 2008
Thanks Paul, I looked for a BOM after reading another post, but didn't see one so i didn't even try it :-P Works perfect now. Thank you again!

BKBK- I was making sure that i was calling xmlParse in the correct fashion. Thanks for your input though.
Newsgroup_UserCorrect answer
Inspiring
April 17, 2008
that was harder than it should have been, should have realized this from the
get-go, sorry. it's adding a BOM to the xml. the BOM is optional for UTF-8 & many
s/w don't add it or even strip it out. frankly i don't know whether xml is supposed
to have this or not (and whether cf's xmlParse is doing the right thing).

in any case:

<cfset BOM=chr(65279)> <!--- utf-8 ONLY --->
<cfset z=replace(cfhttp.fileContent,BOM,"")>
<cfdump var="#xmlParse(z)#">
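
A slightly more defensive variant of the same idea, offered as a sketch rather than a tested fix: replace() removes the first BOM character wherever it happens to sit, while the version below only touches the string when the BOM is its very first character. The variable names raw and model are placeholders.

```cfml
<!--- sketch only: strip a BOM (U+FEFF, code 65279) when it is the first character --->
<cfset raw = toString(cfhttp.fileContent)>

<!--- asc() looks at the first character only and returns 0 for an empty string --->
<cfif asc(raw) eq 65279>
    <!--- drop exactly one character, starting at position 1 --->
    <cfset raw = removeChars(raw, 1, 1)>
</cfif>

<cfset model = xmlParse(raw)>
<cfdump var="#model#">
```

If the response has no BOM, the cfif is skipped and the content goes to xmlParse() unchanged.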
djc11Author
Inspiring
April 16, 2008
Here is a url: https://midas.jnet.us/dc/cf_aim/testarb.cfm
It is CF8.
Here is the source:
<cfxml variable="XMLSubscriptionRequest">
<?xml version="1.0" encoding="utf-8"?>
<ARBCreateSubscriptionRequest xmlns="AnetApi/xml/v1/schema/AnetApiSchema.xsd">
<merchantAuthentication>
<name>xasdfasdfpJD</name>
<transactionKey>xsdafdsafasdfjrx6</transactionKey>
</merchantAuthentication>
<refId>sample</refId>
<subscription>
<name>ProPlanner Subscription</name>
<paymentSchedule>
<interval>
<length>12</length>
<unit>months</unit>
</interval>
<startDate>2008-04-20</startDate>
<totalOccurrences>9999</totalOccurrences>
</paymentSchedule>
<amount>500.00</amount>
<payment>
<creditCard>
<cardNumber>4111111111111111</cardNumber>
<expirationDate>2008-12</expirationDate>
</creditCard>
</payment>
<customer>
<id>test0001</id>
<email>test@abc.com</email>
<phoneNumber>555-555-5555</phoneNumber>
</customer>
<billTo>
<firstName>Dustin</firstName>
<lastName>Chesterman</lastName>
<company>CFOTools.Net</company>
<address>123 Main St</address>
<city>Anytown</city>
<state>CA</state>
<zip>12345</zip>
</billTo>
</subscription>
</ARBCreateSubscriptionRequest>
</cfxml>

<cfhttp method="POST" url="https://apitest.authorize.net/xml/v1/request.api" >
<cfhttpparam type="Header" name="Accept-Encoding" value="deflate;q=0">
<cfhttpparam type="Header" name="TE" value="deflate;q=0">
<cfhttpparam name="body" type="xml" value="#ToString(XMLSubscriptionRequest)#" />
</cfhttp>

<cfdump var="#cfhttp#">
<cfset model = '#ToString(Trim(cfhttp.fileContent))#'>
<cfdump var="#model#">
<br />
<br />
<cfoutput>#ToString(model)#</cfoutput>
<cfset model1 = XmlParse(model)>
<cfdump var="#model1#">
Inspiring
April 16, 2008
djc11 wrote:
> If i dump cfhttp.fileContent it outputs the above xml. If i
> <cfoutput>#cfhttp.fileContent#</cfoutput>, it only outputs the text in the
> element nodes. That's what i would expect it to do but does it have anything
> to do with my problem?

no, if you view source you should see the xml.

i guess there's something going on w/the cfhttp. what version of cf? is there a
url i can test?

Inspiring
April 16, 2008
djc11 wrote:
> I am having no luck with this. Do you see any problem with the sample XML above?

i thought i replied to that already but i guess the xml in the posting scrambled
the email off to never-never land.

yes, that xml snippet got parsed fine but if there's any goop at the beginning
of the xml it won't parse, which is why i thought trim would work.
djc11Author
Inspiring
April 16, 2008
If i dump cfhttp.fileContent it outputs the above xml. If i <cfoutput>#cfhttp.fileContent#</cfoutput>, it only outputs the text in the element nodes. That's what i would expect it to do but does it have anything to do with my problem?