I am trying to use an InDesign jsx file to read an xfdf file which has been exported from a form in Adobe Acrobat. I want to be able to find nodes by name attribute and return the value as a variable.
I am following the xpath syntax in XPath Syntax (w3schools.com) which seems quite simple.
I am able to get a response from xpath using wildcards ("//*"), but as soon as I try to search for anything specific, such as "//field.value" or "//field[@name='File_1']/value", it returns nothing.
I am sure I am close but cannot work out why these queries are coming back empty.
Below is a condensed version of my code with the XML data included as a variable for brevity.
At the end are various xpath commands and my comment on the outcome of each.
They can be un-commented to test.
Shouldn't it be:
//alert(xmlData.xpath("//field/value"));
https://www.w3schools.com/xml/xpath_syntax.asp
I'm sorry, but I can't help you any further. I'm not a JS guy.
I've just used Google to find some info - but not to the point that I could troubleshoot your code.
Thanks, no problem Robert
Welcome to the club, you are entering the ExtendScript XML Nightmare 😉
I am not skilled enough in this area—Dirk and/or Loïc could probably help you, should they come across this discussion— but I strongly suspect that the biggest part of your problem is that your XML has all its elements namespaced due to the xmlns instruction at the root node:
<xfdf xmlns="http://ns.adobe.com/xfdf/" …>
This means that in expressions like xmlData.xpath("//field/value"), ‘field’ and ‘value’ are not the actual node names (only their local names I guess), but xpath is so poorly implemented in ExtendScript that you risk spending entire nights before finding a syntax (if any!) that could get around this difficulty.
Maybe the best option would be to remove the whole xmlns stuff before entering the XML constructor (it's easy to detect and remove it at the string level), then to re-insert that very namespace if you need to output some XML file at the end of your process (?)
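As a rough illustration of the string-level removal suggested above, here is a minimal sketch. The helper name stripDefaultNamespace is hypothetical (it is not from this thread or from any Adobe API), and it assumes the XML text carries a single default xmlns="..." declaration. It is written with plain ES3 features so that it should also be usable in ExtendScript, though that is an assumption rather than something verified here.

```javascript
// Hypothetical helper, not from the thread: removes the first default
// xmlns="..." declaration from raw XML text and reports the URI it found.
function stripDefaultNamespace(xmlText) {
    var uri = null;
    // Matches xmlns="..." or xmlns='...' preceded by whitespace.
    // Prefixed declarations (xmlns:foo="...") and xml:space are not matched.
    var pattern = /\s+xmlns\s*=\s*(["'])([^"']*)\1/;
    var text = xmlText.replace(pattern, function (match, quote, value) {
        uri = value;   // remember the namespace in case it must be restored
        return "";     // drop the declaration from the text
    });
    return { text: text, uri: uri };
}

var sample = '<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">' +
    '<fields><field name="File_1"><value>ABCDEF</value></field></fields></xfdf>';
var result = stripDefaultNamespace(sample);

// In ExtendScript, the cleaned text would then go to the XML constructor:
//   var xmlData = new XML(result.text);
//   xmlData.xpath("//field[@name='File_1']/value");
```

The returned uri is kept only so the namespace can be put back if an XML file has to be written out at the end of the process.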
There are certainly syntactic shortcuts to still access namespaced nodes, but I don't know them. If you can't bypass xmlns, the only solution I can see is the explicit and formal use of QName patches through the following scheme:
// Get the ns and create a dedicated QName function.
var ns = '' + myXML.namespace();
var f = function(nme){ return QName(ns,nme) };
// Then use f('nodeName') wherever needed.
// Boring? I know!
Now here are some tests you can run based on the above method:
// Content of your XML file
var xmlFile = '''<?xml version="1.0" encoding="UTF-8"?>
<xfdf xmlns="http://ns.adobe.com/xfdf/" xml:space="preserve">
<f href="Automation Form Test 1.pdf"/>
<fields>
<field name="File_1"><value>ABCDEF</value></field>
<field name="File_2"><value>Some Text</value></field>
<field name="File_3"><value>Some more text</value></field>
</fields>
<ids original="00054952DF51F14FB50D88332A014639" modified="B16CF70A22E4F341A0F65B6E7D370758"/>
</xfdf>'''
;
var xmlData = XML(xmlFile);
// Create your dedicated QName function.
var ns = ''+xmlData.namespace();
var f = function(nme){ return QName(ns,nme) };
var x;
// Get the <value> from all fields
x = xmlData.descendants( f('value') );
alert( x.toXMLString() ); // show nodes
alert( x.text().toXMLString() ); // show inner texts
// Get the <value> node of the 2nd <field> i.e. index 1.
x = xmlData.descendants( f('field') )[1].descendants( f('value') );
alert( x.toXMLString() ); // show node
alert( x.text().toXMLString() ); // show inner text
// Select the <field> whose @name is 'File_3' and get its <value>.
x = xmlData.descendants( f('field') ).xpath("node()[@name='File_3']").descendants( f('value') );
alert( x.toXMLString() ); // show node
alert( x.text() ); // show inner text
No doubt much better approaches exist. Hopefully our colleagues will outline them below.
Best,
Marc
Or, if the XML imports properly, just iterate through the document's XMLElements collection?
Hi Marc,
Thank you for the detailed response. You have hit the nail on the head in your first paragraph - as soon as I deleted the namespace from the XML file (xmlns="http://ns.adobe.com/xfdf/"), I was able to get the expected responses to all of my queries. Because I don't need to output XML from this script I think I will simply remove the namespace as a first step in the script, but will keep your secondary suggestion on hand in case. Thanks again!
Hello,
The namespace was the right hint. The elements here are in the default namespace. Alternatively, you can set it as the default before running your queries:
setDefaultXMLNamespace(xmlData.name().uri);
and then reset it afterwards, e.g. in a finally block:
try {
    // ... run your queries here ...
} finally {
    setDefaultXMLNamespace("");
}
Thanks Roland, this might be neater than replacing the namespace after reading in the XML.
Cheers