Reading information from an XML file in InDesign

Report · Feb 28, 2024

I am trying to use an InDesign jsx file to read an xfdf file which has been exported from a form in Adobe Acrobat. I want to be able to find nodes by name attribute and return the value as a variable.

I am following the xpath syntax in XPath Syntax (w3schools.com) which seems quite simple.

I am able to get a response from xpath using wildcards ("//*"), but as soon as I search for anything specific, such as "//field.value" or "//field[@name='File_1']/value", it returns nothing.

I am sure I am close but cannot work out why these queries are coming back empty.

Below is a condensed version of my code with the XML data included as a variable for brevity.

At the end are various xpath commands and my comment on the outcome of each.

They can be un-commented to test.

try{

//content of the xfdf file is reproduced here:

var xmlFile = '''<?xml version="1.0" encoding="UTF-8"?>

<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve"

><f href="Automation Form Test 1.pdf"

/><fields

><field name="File_1"

><value

>ABCDEF</value>

</field

><field name="File_2"

><value

>Some Text</value

></field

><field name="File_3"

><value

>Some more text</value

></field

></fields

><ids original="00054952DF51F14FB50D88332A014639" modified="B16CF70A22E4F341A0F65B6E7D370758"

/></xfdf

>''';

xmlFile.encoding ="UTF-8";

var xmlData = (XML(xmlFile));

//below returns all node values OK:

alert(xmlData.xpath("//*"));

//below returns a node value by index OK:

//alert(xmlData.xpath("//*[1]"));

//this gets all nodes and attributes OK:

//alert(xmlData.xpath("//node()"));

//this should get the value from all fields but returns nothing:

//alert(xmlData.xpath("//field.value"));

//this should get the value of the first 'field' node but returns nothing:

//alert(xmlData.xpath("//field[1].value"));

//below should return ABCDEF but returns nothing:

//alert(xmlData.xpath("//field[@name='File_1']/value"));

//tried an alternative method but it returns undefined is not an object:

//var xmlStuff = xmlData.xmlElements[0];

//var getData = xmlStuff.evaluateXPathExpression("//*");

//alert(getData);

}catch(error){

alert("Error" + error + "\nScript stopped.");

exit();

}

Report · Feb 28, 2024

Shouldn't it be:

//alert(xmlData.xpath("//field/value"));

https://www.w3schools.com/xml/xpath_syntax.asp

Report · Feb 29, 2024

Hi Robert,

Good pickup, yes, I had accidentally used a period instead of a forward slash in my examples.

However, I have tried both and still got no value back; see the updated code below:

try{

var xmlFile = '''<?xml version="1.0" encoding="UTF-8"?>

<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve"

><f href="Automation Form Test 1.pdf"

/><fields

><field name="File_1"

><value

>ABCDEF</value>

</field

><field name="File_2"

><value

>Some Text</value

></field

><field name="File_3"

><value

>Some more text</value

></field

></fields

><ids original="00054952DF51F14FB50D88332A014639" modified="B16CF70A22E4F341A0F65B6E7D370758"

/></xfdf

>''';

xmlFile.encoding ="UTF-8";

var xmlData = (XML(xmlFile));

//below returns all node values OK:

//alert(xmlData.xpath("//*"));

//below returns a node value by index OK:

//alert(xmlData.xpath("//*[1]"));

//this gets all nodes and attributes OK:

//alert(xmlData.xpath("//node()"));

//this should get the value from all fields but returns nothing:

//alert(xmlData.xpath("//field/value"));

//this should get the value of the first 'field' node but returns nothing:

//alert(xmlData.xpath("//field[1]/value"));

//below should return ABCDEF but returns nothing:

//alert(xmlData.xpath("//field[@name='File_1']/value"));

//tried an alternative method but it returns undefined is not an object:

//var xmlStuff = xmlData.xmlElements[0];

//var getData = xmlStuff.evaluateXPathExpression("//*");

//alert(getData);

}catch(error){

alert("Error" + error + "\nScript stopped.");

exit();

}

Report · Feb 29, 2024

I'm sorry, but I can't help you any further - I'm not a JS guy.

I've just used Google to find some info - but not to the point that I could troubleshoot your code.

Report · Feb 29, 2024

Thanks, no problem Robert

Report · Feb 29, 2024

Hi @immijp49056642

Welcome to the club, you are entering the ExtendScript XML Nightmare 😉

I am not skilled enough in this area (Dirk and/or Loïc could probably help you, should they come across this discussion), but I strongly suspect that the biggest part of your problem is that all the elements in your XML are namespaced, due to the xmlns declaration on the root node:

<xfdf xmlns="http://ns.adobe.com/xfdf/" …>

This means that in expressions like xmlData.xpath("//field/value"), ‘field’ and ‘value’ are not the actual node names (only their local names I guess), but xpath is so poorly implemented in ExtendScript that you risk spending entire nights before finding a syntax (if any!) that could get around this difficulty.

Maybe the best option would be to remove the whole xmlns stuff before entering the XML constructor (it's easy to detect and remove it at the string level), then to re-insert that very namespace if you need to output some XML file at the end of your process (?)
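
For what it's worth, the string-level removal could look something like the sketch below. It is only a rough illustration: the helper name stripDefaultNamespace is made up for this example, and the regex assumes the declaration appears as a plain xmlns="..." attribute (it would also hit that same pattern if it ever appeared inside text content). Prefixed declarations such as xmlns:foo="..." and attributes like xml:space are left alone.

```javascript
// Hypothetical helper (not part of ExtendScript or of the original script):
// removes default-namespace declarations such as
// xmlns="http://ns.adobe.com/xfdf/" from raw XML text.
// Prefixed declarations (xmlns:foo="...") are not matched, because the
// pattern requires "=" (optionally padded with spaces) right after "xmlns".
function stripDefaultNamespace(xmlText) {
    return xmlText.replace(/\s+xmlns\s*=\s*("[^"]*"|'[^']*')/g, "");
}

// Shortened version of the XFDF content from this thread.
var raw = '<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">' +
          '<fields><field name="File_1"><value>ABCDEF</value></field></fields>' +
          '</xfdf>';

var cleaned = stripDefaultNamespace(raw);

// In ExtendScript, the cleaned string would then be passed to the XML
// constructor before querying, for example:
//   var xmlData = new XML(cleaned);
//   xmlData.xpath("//field[@name='File_1']/value");
```

The idea is simply to run this once on the file contents before calling XML(); if the namespace is needed again for output, it would have to be re-inserted on the root element afterwards.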

There are certainly syntactic shortcuts to still access namespaced nodes, but I don't know them. If you can't bypass xmlns, the only solution I can see is the explicit and formal use of QName patches through the following scheme:

// Get the ns and create a dedicated QName function.
var ns = '' + myXML.namespace();
var f = function(nme){ return QName(ns,nme) };

// Then use f('nodeName') wherever needed.
// Boring? I know!

Now here are some tests you can run based on the above method:

// Content of your XML file
var xmlFile = '''<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
<f href="Automation Form Test 1.pdf"/>
<fields>
<field name="File_1"><value>ABCDEF</value></field>
<field name="File_2"><value>Some Text</value></field>
<field name="File_3"><value>Some more text</value></field>
</fields>
<ids original="00054952DF51F14FB50D88332A014639" modified="B16CF70A22E4F341A0F65B6E7D370758"/>
</xfdf>'''
;

var xmlData = XML(xmlFile);

// Create your dedicated QName function.
var ns = ''+xmlData.namespace();
var f = function(nme){ return QName(ns,nme) };
var x;

// Get the <value> from all fields
x = xmlData.descendants( f('value') );
alert( x.toXMLString() );         // show nodes
alert( x.text().toXMLString() );  // show inner texts

// Get the <value> node of the 2nd <field> i.e. index 1.
x = xmlData.descendants( f('field') )[1].descendants( f('value') );
alert( x.toXMLString() );         // show node
alert( x.text().toXMLString() );  // show inner text

// Select the <field> whose @name is 'File_3' and get its <value>.
x = xmlData.descendants( f('field') ).xpath("node()[@name='File_3']").descendants( f('value') );
alert( x.toXMLString() );         // show node
alert( x.text() );                // show inner text

No doubt much better approaches exist. Hopefully our colleagues will outline them below.

Best,

Marc

Report · Mar 01, 2024

Or, if the XML imports properly, just iterate through the XMLElements collection of the document?

Report · Mar 03, 2024

Hi Marc,

Thank you for the detailed response. You have hit the nail on the head in your first paragraph - as soon as I deleted the namespace from the XML file (xmlns="http://ns.adobe.com/xfdf/"), I was able to get the expected responses to all of my queries. Because I don't need to output XML from this script I think I will simply remove the namespace as a first step in the script, but will keep your secondary suggestion on hand in case. Thanks again!

Report · Mar 03, 2024

Hello,

The namespace was the right hint. The elements here are in the default namespace. Alternatively, instead of removing it, you can set it as the default XML namespace for the script:

setDefaultXMLNamespace(xmlData.name().uri);

and then reset it afterwards, e.g. in a finally block:

} finally {
	setDefaultXMLNamespace("");
}

Roland

Report · Mar 05, 2024

Thanks Roland, this might be neater than replacing the namespace after reading in the XML.

Cheers
