Writing XML and preserve comments

Report · Dec 11, 2020

Dear all,

Now that I have managed to read all sorts of information from an XML file, I want to go one step further and save created or modified data back to the XML file. While a function WriteXmlData (xData, sXmlFile) is quite simple and works for most situations (see below), I encounter these problems:

1. An XML file is by definition a human-readable file - why else would there be XML comments? When the XML file is read (parsed), comments are stripped by default. How can I preserve the comments? I do not understand how to use the XML class properties (BTW the term class is very misleading for simple minds like me). In particular, how do I use the property ignoreComments?
The following does not work at all:

function main () {
var oXmlSettings, xmlData;
  oXmlSettings = xmlData.defaultSettings(); // xml reader settings ?
  oXmlSettings.ignoreComments = false;      // are they kept ?
  xmlData.setSettings(oXmlSettings);        // are these the new settings?
}

2. Also, the chapter XML Object Reference of the JavaScript Tools Guide CC describes how to use JS variables in XML data. While this works perfectly within the script, trying to write the data out to the file clears the file!

function main () {
var nItems; 
var oXmlSettings, xmlData, xmlData2, sXmlFile = "WriteXml.xml"; // in same dir as this script;
var bWord = 1, bCase = 0, bBack = 0, bClone = 0, sSearchMode = "RegeX",
     sFindType = "Conditional Tag →", sName = "Experiment with JS variables";
var xmlData2 =
  <item>
    <name>{sName}</name>
    <info>Elements and attributes set via JS variables.</info>
    <options searchMode={sSearchMode} word={bWord} case={bCase} back={bBack} />
    <findtype>{sFindType}</findtype>
    <findstring>*something</findstring>
    <replmode>To Text:</replmode>
    <replstring>$1\t</replstring>
  </item>

  xmlData = GetXMLdata (sXmlFile);   // already contains the root and two items
// This works perfectly, because xmlData1 does not use JS variables
  nItems = xmlData.item.length();
  xmlData.item[nItems] = xmlData1;
  WriteXML (xmlData, sXmlFile);
// This, however clears the file!
  $.bp(true);
  nItems = xmlData.item.length();        // n
  xmlData.item[nItems] = xmlData2;       // xmlData correctly appended
  nItems = xmlData.item.length();        // n+1
  WriteXML (xmlData, sXmlFile);          // file is cleared!!!!
}

Please find the complete script and xml data file attached. AAAArgh, ZIP can not be attached. So find it here: https://daube.ch/zz_tests/WriteXML.zip

Report · Dec 11, 2020

Well, for point 2 (cleared file) I have found the reason:

sFindType = "Conditional Tag →"

The arrow must be replaced by an entity:

sFindType = "Conditional Tag &#x2192;", sName = "Experiment with JS variables";

This, however writes to the file:

<findtype>Conditional Tag &amp;#x2192;</findtype>

IMHO too much of a substitution... It creates errors, not on input but in further processing, because the resulting string is rubbish.

For correct input it is sufficient to write

<findtype>Conditional Tag &#x2192;</findtype>

or even this, which works perfectly on input. What else do we have UTF-8 for as the default encoding in XML?

<findtype>Conditional Tag →</findtype>

Only for this particular character could I use a named entity, &rarr;, but on output this would again produce something broken: &amp;rarr;
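If a file already contains the doubly escaped form (&amp;#x2192;), reading it back yields the literal text &#x2192; instead of the arrow. As a possible workaround (not the fix found later in this thread), such text can be decoded after reading. The helper name decodeNumericEntities is hypothetical, and the sketch sticks to plain ES3-style JavaScript on the assumption that this runs unchanged in ExtendScript:

```javascript
// Hypothetical helper: turns numeric character references such as
// "&#x2192;" or "&#8594;" in a string into the characters they stand for.
// Named entities (e.g. "&amp;") are deliberately left untouched.
function decodeNumericEntities(sText) {
  return sText.replace(/&#(x[0-9a-fA-F]+|[0-9]+);/g, function (sMatch, sCode) {
    var nCode = sCode.charAt(0) === "x"
      ? parseInt(sCode.substring(1), 16)   // hexadecimal form: &#x2192;
      : parseInt(sCode, 10);               // decimal form: &#8594;
    if (nCode > 0x10FFFF) {
      return sMatch;                       // not a valid code point, keep as is
    }
    if (nCode > 0xFFFF) {
      // Characters outside the BMP need a UTF-16 surrogate pair.
      nCode -= 0x10000;
      return String.fromCharCode(0xD800 + (nCode >> 10), 0xDC00 + (nCode & 0x3FF));
    }
    return String.fromCharCode(nCode);
  });
}
```

For example, decodeNumericEntities("Conditional Tag &#x2192;") gives back the string with the real arrow character.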

Report · Dec 11, 2020

Well, this seems to be a hornet's nest again:

Oxxyyd (https://forum.juce.com/t/xml-special-characters/11972/5) writes on 2013-12-02:

I would have expected the same result as ivanslo did, namely an xml file encoded with utf-8 with human readable characters, not character codes.

It doesn't really make sense to me to utilize a character coding as utf-8 that's mainly invented to allow storing and displaying all sorts of characters from non-english languages, and end up with a file that's hardly readable!

After all, it's hard to expect anyone to have a unicode translator at hand just to enter a file name like Größenmaßstäbe.txt in an xml-settings file. Or Möbelträgerfüße.doc. Let alone to be able to read such a file correctly.

And I, Klaus Daube, would also have expected this.

Report · Dec 11, 2020

The solution to problem #2 is given by Klaus Göbel in post https://community.adobe.com/t5/framemaker/xml-and-unicode-a-quarrelling-couple/td-p/11677704
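The linked post is not reproduced here. One common cause of an emptied file in ExtendScript is that the File object's encoding cannot represent a character such as →, so the write fails. Assuming that is what happened here, a minimal sketch of a writer that sets the encoding explicitly could look like this. The name writeXmlUtf8 is hypothetical, and oFile is expected to be an ExtendScript File object created by the caller:

```javascript
// Hypothetical helper, a sketch only: writes an E4X XML object to a file
// as UTF-8. Assumption: the cleared file was caused by the file's encoding
// not being able to store characters like "→".
function writeXmlUtf8(xData, oFile) {
  oFile.encoding = "UTF-8";                 // set before opening the file
  if (!oFile.open("w")) {
    return false;                           // could not open for writing
  }
  var bOk = oFile.write(xData.toXMLString());
  oFile.close();
  return bOk;                               // true if the write succeeded
}

// Usage in ExtendScript (not executed here):
//   writeXmlUtf8(xmlData, new File(sXmlFile));
```

With this, the variable can keep the plain arrow ("Conditional Tag →") and no entity is needed in the script.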

For problem #1 I found the solution just by chance (again: documentation requires examples, not just theory):

oXmlSettings = XML.defaultSettings();         // xml reader settings
oXmlSettings.ignoreComments = false;
XML.setSettings(oXmlSettings);                // The new settings

This really keeps the comments on input and puts them out correctly again.
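The difference from my first attempt is that defaultSettings() and setSettings() are called on the global XML object itself, not on a variable holding XML data. Wrapped into a small function, this might look as follows. The name parseXmlKeepingComments is hypothetical, and the sketch assumes the settings must be in place before the text is parsed:

```javascript
// Hypothetical helper for ExtendScript, where XML is the global E4X object.
// It switches off comment stripping and then parses the given XML text.
function parseXmlKeepingComments(sXmlText) {
  var oXmlSettings = XML.defaultSettings(); // static call on XML, not on an instance
  oXmlSettings.ignoreComments = false;      // keep <!-- ... --> nodes while parsing
  XML.setSettings(oXmlSettings);            // assumed to be needed before parsing
  return new XML(sXmlText);
}

// Usage in ExtendScript (not executed here):
//   var xmlData = parseXmlKeepingComments("<root><!-- note --><item/></root>");
//   xmlData.toXMLString();   // should still contain the comment
```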
