immijp49056642
Participating Frequently
February 28, 2024
Answered

Reading information from an XML file in InDesign


I am trying to use an InDesign jsx file to read an xfdf file which has been exported from a form in Adobe Acrobat. I want to be able to find nodes by name attribute and return the value as a variable.

I am following the xpath syntax in XPath Syntax (w3schools.com) which seems quite simple.

I am able to get a response from xpath using wildcards ("//*"), but as soon as I search for anything specific, such as "//field.value" or "//field[@name='File_1']/value", it returns nothing.

I am sure I am close but cannot work out why these queries are coming back empty.

 

Below is a condensed version of my code with the XML data included as a variable for brevity.

At the end are various xpath commands and my comment on the outcome of each.

They can be un-commented to test.

 

try{
 
//content of the xfdf file is reproduced here:
var xmlFile = '''<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve"
><f href="Automation Form Test 1.pdf"
/><fields
><field name="File_1"
><value
>ABCDEF</value>
</field
><field name="File_2"
><value
>Some Text</value
></field
><field name="File_3"
><value
>Some more text</value
></field
></fields
><ids original="00054952DF51F14FB50D88332A014639" modified="B16CF70A22E4F341A0F65B6E7D370758"
/></xfdf
>''';
 
xmlFile.encoding ="UTF-8";
var xmlData = (XML(xmlFile));
 
//below returns all node values OK:
alert(xmlData.xpath("//*"));

//below returns a node value by index OK:
//alert(xmlData.xpath("//*[1]"));
 
//this gets all nodes and attributes OK:
//alert(xmlData.xpath("//node()"));
 
//this should get the value from all fields but returns nothing:
//alert(xmlData.xpath("//field.value"));
 
//this should get the value of the first 'field' node but returns nothing:
//alert(xmlData.xpath("//field[1].value"));
 
//below should return ABCDEF but returns nothing:
//alert(xmlData.xpath("//field[@name='File_1']/value"));
 
//tried an alternative method but it returns undefined is not an object:
//var xmlStuff = xmlData.xmlElements[0];
//var getData = xmlStuff.evaluateXPathExpression("//*");
//alert(getData);
 
}catch(error){
alert("Error" + error + "\nScript stopped.");
exit();
}
This topic has been closed for replies.
Correct answer Marc Autret

Hi @immijp49056642 

 

Welcome to the club, you are entering the ExtendScript XML Nightmare 😉

 

I am not skilled enough in this area (Dirk and/or Loïc could probably help you, should they come across this discussion), but I strongly suspect that the biggest part of your problem is that all the elements in your XML are namespaced because of the default xmlns declaration on the root node:

 

    <xfdf xmlns="http://ns.adobe.com/xfdf/" …>

 

This means that in expressions like xmlData.xpath("//field/value"), ‘field’ and ‘value’ are not the actual node names (only their local names, I guess), so the query matches nothing. Unfortunately, xpath is so poorly implemented in ExtendScript that you risk spending entire nights before finding a syntax (if any!) that gets around this difficulty.

 

Maybe the best option would be to remove the xmlns declaration before passing the string to the XML constructor (it's easy to detect and remove at the string level), then to re-insert that same namespace if you need to output an XML file at the end of your process.
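
For illustration, here is a minimal sketch of that string-level approach, not tested in InDesign itself. The helper name stripDefaultNamespace is hypothetical, and the regular expression assumes a single default xmlns declaration on the root element, as in the XFDF sample. It would also hit a matching string inside a comment or CDATA section, so treat it as a starting point only.

```javascript
// Hypothetical helper (not an ExtendScript API): removes the first default
// namespace declaration (xmlns="...") from an XML string and reports its URI.
function stripDefaultNamespace(xmlString) {
    // Whitespace before "xmlns" and "=" right after it mean that prefixed
    // declarations (xmlns:foo="...") and xml:space="..." are left alone.
    var re = /\s+xmlns\s*=\s*("[^"]*"|'[^']*')/;
    var m = re.exec(xmlString);
    if (!m) {
        return { xml: xmlString, uri: null };
    }
    return {
        xml: xmlString.replace(re, ""), // first match only (no g flag)
        uri: m[1].slice(1, -1)          // drop the surrounding quotes
    };
}

// Possible usage in ExtendScript (assumption: xmlFile holds the XFDF text):
// var cleaned = stripDefaultNamespace(xmlFile);
// var xmlData = XML(cleaned.xml);
```

The returned uri can be kept aside in case the namespace has to be re-inserted on output. Once the declaration is gone, a query such as xmlData.xpath("//field[@name='File_1']/value") should be able to match the plain element names, though that part has to be checked in ExtendScript.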

 

There are certainly syntactic shortcuts to still access namespaced nodes, but I don't know them. If you can't bypass xmlns, the only solution I can see is the explicit and formal use of QName patches through the following scheme:

 

// Get the ns and create a dedicated QName function.
var ns = '' + myXML.namespace();
var f = function(nme){ return QName(ns,nme) };

// Then use f('nodeName') wherever needed.
// Boring? I know!

 

Now here are some tests you can run based on the above method:

 

// Content of your XML file
var xmlFile = '''<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
<f href="Automation Form Test 1.pdf"/>
<fields>
<field name="File_1"><value>ABCDEF</value></field>
<field name="File_2"><value>Some Text</value></field>
<field name="File_3"><value>Some more text</value></field>
</fields>
<ids original="00054952DF51F14FB50D88332A014639" modified="B16CF70A22E4F341A0F65B6E7D370758"/>
</xfdf>'''
;

var xmlData = XML(xmlFile);

// Create your dedicated QName function.
var ns = ''+xmlData.namespace();
var f = function(nme){ return QName(ns,nme) };
var x;

// Get the <value> from all fields
x = xmlData.descendants( f('value') );
alert( x.toXMLString() );         // show nodes
alert( x.text().toXMLString() );  // show inner texts

// Get the <value> node of the 2nd <field> i.e. index 1.
x = xmlData.descendants( f('field') )[1].descendants( f('value') );
alert( x.toXMLString() );         // show node
alert( x.text().toXMLString() );  // show inner text

// Select the <field> whose @name is 'File_3' and get its <value>.
x = xmlData.descendants( f('field') ).xpath("node()[@name='File_3']").descendants( f('value') );
alert( x.toXMLString() );         // show node
alert( x.text() );                // show inner text

 

No doubt much better approaches exist. Hopefully our colleagues will outline them below.

 

Best,

Marc

3 replies

Inspiring
March 3, 2024

Hello,

 

The namespace was the right hint. The elements here are in the default namespace declared on the root node. Alternatively, you can make it the default namespace for XML lookups:

 

setDefaultXMLNamespace(xmlData.name().uri);

 

and then reset it e.g. in finally:

 

} finally {
	setDefaultXMLNamespace("");
}
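
Put together, the two snippets might be wrapped like this. It is only a sketch: queryWithDefaultNamespace is a hypothetical helper name, and it assumes that xpath() honors the namespace set through setDefaultXMLNamespace, which should be verified in your InDesign version.

```javascript
// Hypothetical helper (not part of ExtendScript): runs an XPath query while
// the root element's namespace is the default, then always resets it.
function queryWithDefaultNamespace(xmlData, expression) {
    // xmlData.name() is the root's qualified name; .uri is its namespace URI.
    setDefaultXMLNamespace(xmlData.name().uri);
    try {
        return xmlData.xpath(expression);
    } finally {
        // Runs even if xpath() throws, so later XML code is unaffected.
        setDefaultXMLNamespace("");
    }
}

// Possible usage in ExtendScript (assumption: xmlData = XML(xmlFile)):
// alert(queryWithDefaultNamespace(xmlData, "//field[@name='File_1']/value"));
```

Keeping the reset in finally means one failed query does not leave the XFDF namespace active for the rest of the script.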

 

 
Roland
immijp49056642
Participating Frequently
March 6, 2024

Thanks Roland, this might be neater than replacing the namespace after reading in the XML.

Cheers

Marc Autret (correct answer, shown in full above)
Brainiac
March 1, 2024


Robert at ID-Tasker
Brainiac
March 1, 2024

Or, if the XML imports properly, just iterate through the XMLElements collection of the document?

 

Robert at ID-Tasker
Brainiac
February 29, 2024

Shouldn't it be:

//alert(xmlData.xpath("//field/value"));

 

https://www.w3schools.com/xml/xpath_syntax.asp

 

immijp49056642
Participating Frequently
February 29, 2024
Hi Robert,
 
Good pickup, yes, I had accidentally used a period instead of a forward slash in my examples.
However, I have tried both and still get no value back; see the updated code below:
 
try{
 
var xmlFile = '''<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve"
><f href="Automation Form Test 1.pdf"
/><fields
><field name="File_1"
><value
>ABCDEF</value>
</field
><field name="File_2"
><value
>Some Text</value
></field
><field name="File_3"
><value
>Some more text</value
></field
></fields
><ids original="00054952DF51F14FB50D88332A014639" modified="B16CF70A22E4F341A0F65B6E7D370758"
/></xfdf
>''';
 
xmlFile.encoding ="UTF-8";
var xmlData = (XML(xmlFile));
 
//below returns all node values OK:
//alert(xmlData.xpath("//*"));
 
//below returns a node value by index OK:
//alert(xmlData.xpath("//*[1]"));
 
//this gets all nodes and attributes OK:
//alert(xmlData.xpath("//node()"));
 
//this should get the value from all fields but returns nothing:
//alert(xmlData.xpath("//field/value"));
 
//this should get the value of the first 'field' node but returns nothing:
//alert(xmlData.xpath("//field[1]/value"));
 
//below should return ABCDEF but returns nothing:
//alert(xmlData.xpath("//field[@name='File_1']/value"));
 
//tried an alternative method but it returns undefined is not an object:
//var xmlStuff = xmlData.xmlElements[0];
//var getData = xmlStuff.evaluateXPathExpression("//*");
//alert(getData);
 
}catch(error){
alert("Error" + error + "\nScript stopped.");
exit();
}
Robert at ID-Tasker
Brainiac
February 29, 2024

I'm sorry, but I can't help you any further - I'm not a JS guy.

 

I've just used Google to find some info - but not to the point that I could troubleshoot your code.