As some of you may know, InDesign desktop struggles when importing large XML data sets (indesign-discussions/working-with-very-large-xml-data-sets). To get around this, I use a variation of the ExtractSubset script to import XML into separate files and copy the elements I need into the main file (Move a set of XML elements to another file). The script referenced in that discussion works: it creates a primary file and imports the first XML chunk, then creates a second temporary InDesign file, imports the second chunk of XML there, copies the sub-elements over into the primary file, and then closes the temporary file. So far so good.
However, when I try to loop the import process over a folder containing four XML files, the loop reaches the second XML file and then imports and copies that data forever!
What am I missing in my for loop code?
if (dataPath != null) {
// get XML files from the chosen folder
var myFiles = dataPath.getFiles("*.xml");
for (i = 0; i < myFiles.length; i++) {
var tmp = app.open(File(templateFolder + "/KPCO_ProviderListing.indt"));
var XMLDest = tmp;
setXMLPrefs();
XMLDest.xmlElements.item(0).importXML(File(myFiles[i].fullName));
var pl = app.open(File(pl_path));
var myRuleSet = new Array(new ExtractSpecialties);
var myMarkupTag = pl.xmlTags.itemByName("providerSpecialtiesList");
var myContainerElement = pl.xmlTags.itemByName("providerSpecialtiesList");
with (tmp) {
var elements = xmlElements;
__processRuleSet(elements.item(0), myRuleSet);
}
tmp.close(SaveOptions.NO);
// pl.close(SaveOptions.YES);
}
}
else {
alert("No matching folder found")
};
//![Extract subset - functions.]
function ExtractSpecialties() {
var pl = app.open(File(pl_path));
var myNewElement;
this.name = "ExtractSpecialties";
this.xpath = "/Root/providerRolesList/providerRoleEntry/providerSpecialtiesList/providerSpecialty";
this.apply = function (myElement, myRuleProcessor) {
with (myElement) {
if (myElement.isValid) {
myNewElement = myElement.duplicate();
myNewElement.move(LocationOptions.atEnd, pl.xmlElements.item(0).xmlElements.item(-1).xmlElements.item(-1).xmlElements.item(-1));
}
}
return true;
}
};
What is "pl_path" here:
var pl = app.open(File(pl_path))
var mainFolder = Folder.desktop + "/Indesign Automation POC";
var templateFolder = mainFolder + "/Template Files - CO";
var outputFolder = mainFolder + "/ID_ScriptOutput";
var bookFolder = new Folder(outputFolder + "/" + monthName + "-" + year + "_" + RGN + "-" + LOB + "_PD_MFX");
bookFolder.create();
var dataPath = Folder(mainFolder + "/Data/CO_DNB-XML_TEST_SPLIT");
var pl_path = bookFolder + pl_FileName;
//Create Provider Listing file
function plSetup() {
var new_pl = app.open(File(templateFolder + "/KPCO_ProviderListing.indt"));
//Set XMP Info
addMetaData();
new_pl.save(File(pl_path));
var XMLDest = new_pl;
setXMLPrefs();
XMLDest.xmlElements.item(0).importXML("C:/Users/O146962/OneDrive - Kaiser Permanente/Desktop/Indesign Automation POC/Data/CO_DNB-XML_TEST_SPLIT/CO_DNB_Providers_Test_1.xml");
//new_pl.close();
};
plSetup();
Ok, but now what is the pl_FileName in:
var pl_path = bookFolder + pl_FileName;
Sorry if I'm chasing the wrong lead, but your original code looks like it opens the correct XML file?
Can you describe, in your own words, what your code is supposed to do at each step?
@Robert at ID-Tasker no problem at all!
The folder vars at the top of the document contain the file paths to the primary, output, and input folders. The "pl_path" var contains the file path to the "provider list" file, which is the primary destination file for all of the XML.
The script creates a new_pl file by opening a template file, it then updates the metadata for that file, saves it, sets the xml preferences and then imports the first XML file.
The for loop SHOULD:
- Check to see if the data folder exists
- If it exists, get the xml files from the data folder and load them into an array called "myFiles"
Then, for each item in the "myFiles" array:
- Create a new InDesign File by opening a template file
- Call this file "tmp" (it will not be saved)
- Set the xml preferences on that file
- Import the XML file that matches the index of this loop sequence
- Open the pl file
- Extract the "provider specialty" elements from the tmp file
- Add the extracted elements to the end of the "providerSpecialtiesList" elements in the pl file
- Close the tmp file without saving
- Repeat for each file in the data folder
In VB I can do step-by-step execution. I'm not sure you can do that in JS now that ESTK is no longer "available", so maybe you can add extra logging and check the value of each variable at every step to see where it gets the wrong value?
Because to me, from what you are saying, after the 2nd file your script is getting wrong or old info for the 3rd and subsequent files?
So 4th or 5th step?
Unless you're modifying pl_path / pl_FileName at a later stage?
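A minimal sketch of the kind of logging suggested above, assuming a hypothetical helper named trace and a hypothetical traceLines array (neither is part of the original script). It records each step together with the values of interest, and writes through $.writeln when the ExtendScript console is available:

```javascript
// Hypothetical logging helper: collects trace lines and echoes them
// to whichever console the host provides.
var traceLines = [];

function trace(step, values) {
    var parts = [];
    for (var key in values) {
        if (values.hasOwnProperty(key)) {
            parts.push(key + "=" + String(values[key]));
        }
    }
    var line = step + ": " + parts.join(", ");
    traceLines.push(line);
    if (typeof $ !== "undefined" && $.writeln) {
        $.writeln(line);   // ExtendScript JavaScript console
    } else if (typeof console !== "undefined") {
        console.log(line); // any other JavaScript host
    }
    return line;
}

// Example call; the step name and values are placeholders.
trace("before import", { index: 0, file: "first.xml" });
```

Calling trace at the top of the loop body and again just before the temp file is closed would show whether the loop counter changes somewhere in between.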
I got it working!
It turns out that using "i" for the for loop was conflicting with the "glue code.jsx" file and causing the infinite loop. By using "x" instead, the script below runs perfectly!
// Need to run this mySnippet function for each NUMBERED XML File
// get XML files from the chosen folder
const myFiles = dataPath.getFiles("*.xml");
// using "x" because "i" conflicts with the gluecode and causes an infinite loop
for (x = 0; x<myFiles.length; x++){
var dataFile = myFiles[x].fullName
if (myFiles.length>1){
xmlImportLoop(dataFile);
alert("File " + dataFile + " [ " + x + " ]" + " processed");
}
else{
alert("Array only has " + myFiles.length + " entry");
break;
};
function xmlImportLoop() {
var tmp = app.open(File(templateFolder + "/KPCO_ProviderListing.indt"));
var XMLDest = tmp;
setXMLPrefs();
alert ("Current file is" + dataFile + " [ " + x + " ]");
XMLDest.xmlElements.item(0).importXML(File(dataFile));
var pl = app.open(File(pl_path));
var myRuleSet = new Array(new ExtractSpecialties);
var myMarkupTag = pl.xmlTags.itemByName("providerSpecialtiesList");
var myContainerElement = pl.xmlTags.itemByName("providerSpecialtiesList");
with (tmp) {
var elements = xmlElements;
__processRuleSet(elements.item(0), myRuleSet);
};
alert ("x = " + x);
tmp.close(SaveOptions.NO);
pl.save();
// pl.close(SaveOptions.YES);
alert("Loop " + x + " finished");
};
//![Extract subset - functions.]
function ExtractSpecialties() {
var pl = app.open(File(pl_path));
this.name = "ExtractSpecialties";
this.xpath = "/Root/providerRolesList/providerRoleEntry/providerSpecialtiesList/providerSpecialty";
this.apply = function (myElement, myRuleProcessor) {
with (myElement) {
if (myElement.isValid) {
myNewElement = myElement.duplicate();
myNewElement.move(LocationOptions.atEnd, pl.xmlElements.item(0).xmlElements.item(-1).xmlElements.item(-1).xmlElements.item(-1));
}
}
return true;
}
}};
That's great.
Hi @Ben Ross - KP, glad you got it working, but I don't think you got the real answer.
Just for learning, I wanted to point out that the problem you allude to is called polluting the global scope.
When you do
for (x = 0; x<myFiles.length; x++){
Because there is no var keyword, you are implicitly creating a global variable x. This means that if x was already used by something else, you may break that script by changing it, and that script can change x and wreck yours. It is also completely unnecessary, and is only "needed" because of an awkward design in your structure.
For example, changing your structure to something like this will help a lot:
function main() {
var myFiles; /* get myFiles */
var results = []; // must be an array before push() is called
for (var i = 0; i < myFiles.length; i++) {
var myFile = myFiles[i];
results.push(processFile(myFile));
}
};
app.doScript(main, ScriptLanguage.JAVASCRIPT, undefined, UndoModes.ENTIRE_SCRIPT, 'Process XML');
/**
* Does x, y, z to file.
* @param {File} file
* @returns {Number} - the result code.
*/
function processFile(file) {
/* do the processing */
return resultCode;
};
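As a rough, self-contained illustration of why this structure helps, here is a plain JavaScript sketch. All names and file names are hypothetical stand-ins, and the InDesign work is replaced by a placeholder. A counter declared with var inside main is untouched even when a helper loops on a global i:

```javascript
// Stand-in for a global counter that some other script also uses.
// It is declared explicitly here so the sketch runs in any JavaScript host.
var i = 0;

// Hypothetical helper that loops on the global i (no var in the for statement).
function noisyHelper(items) {
    for (i = 0; i < items.length; i++) {
        // pretend to do some work with items[i]
    }
}

// Hypothetical per-file routine; the returned value is only a placeholder.
function processFile(fileName) {
    noisyHelper([1]);       // overwrites the global i
    return fileName.length;
}

function main(myFiles) {
    var results = [];
    // This i is local to main, so noisyHelper cannot change it.
    for (var i = 0; i < myFiles.length; i++) {
        results.push(processFile(myFiles[i]));
    }
    return results;
}

var results = main(["a.xml", "bb.xml", "ccc.xml"]);
```

Every file is visited exactly once, because the loop in main reads and writes only its own local counter.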
Also, I have strong doubts about your diagnosis: glue code.jsx doesn't use the var i anywhere. So your infinite loop problem was probably something else.
You are doing an awesome job of getting your script working, and I am just giving some pointers along the way. Hope you don't mind.
- Mark
Thanks @m1b , I really appreciate the additional feedback and insight!
Regarding the glue code, here is where the conflict was occurring:
function __makeRuleProcessor(ruleSet, prefixMappingTable){
// Get the condition paths of all the rules.
var pathArray = new Array();
for (i=0; i<ruleSet.length; i++)
{
pathArray.push(ruleSet[i].xpath);
}
// the following call can throw an exception, in which case
// no rules are processed
try{
var ruleProcessor = app.xmlRuleProcessors.add(pathArray, prefixMappingTable);
}
catch(e){
throw e;
}
var rProcessor = new ruleProcessorObject(ruleSet, ruleProcessor);
return rProcessor;
}
That said, I am SURE you are correct about my poor variable scoping. So I definitely need to improve that, and I really appreciate your guidance and advice!
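For anyone following along, here is a minimal sketch of how a shared, undeclared counter can stall an outer loop. It is plain JavaScript with hypothetical names (makePathArray, files), not the real glue code, and the pass limit exists only so the demo terminates. Which file the loop gets stuck on depends on the value the helper leaves in i; with a single rule, this sketch lands on index 2 every time:

```javascript
// Stand-in for the implicit global that `for (i = 0; ...)` creates when i is
// never declared with var. Declared explicitly so the sketch runs in any host.
var i;

// Hypothetical helper that mimics the loop in the older glue code:
// it reuses the same i as the caller.
function makePathArray(ruleSet) {
    var pathArray = [];
    for (i = 0; i < ruleSet.length; i++) {
        pathArray.push(ruleSet[i].xpath);
    }
    return pathArray; // i is left equal to ruleSet.length
}

var files = ["a.xml", "b.xml", "c.xml", "d.xml"]; // hypothetical file names
var ruleSet = [{
    xpath: "/Root/providerRolesList/providerRoleEntry/providerSpecialtiesList/providerSpecialty"
}];

var visited = [];
var passes = 0;
for (i = 0; i < files.length; i++) {
    passes++;
    if (passes > 6) { break; } // safety limit, only for this demo
    visited.push(files[i]);
    makePathArray(ruleSet);    // sets i back to 1, so i++ always yields 2
}
```

Here visited ends up as a.xml followed by c.xml five times, and b.xml and d.xml are never reached. Without the safety limit the loop would never end.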
Hi @Ben Ross - KP, well the sample code you show here does have a globally scoped i, so *you* are correct! 🙂
But that might be an old version of the glue code. The one that I found in my InDesign 2023 scripts folder has this line instead:
for (var ruleSetIndex=0; ruleSetIndex<ruleSet.length; ruleSetIndex++)
so it looks like someone has cleaned it up.
- Mark
Thanks @m1b! If there is an updated glue code you can point me to that would be a big help! It was not included in the current Scripting SDK!
Copy link to clipboard
Copied
See attached. It should be in your current InDesign 2023 folder though.
Thanks @m1b !

