Inspiring
April 7, 2023
Question

Working with very large XML data sets

  • 4 replies
  • 2175 views

I am working on building documents from very large XML data sets consisting of close to 100k records, where each record has over a dozen nested data points. Is there a way to speed up InDesign's processing time for files like this? 

 

When I tested a file with 4k records, InDesign built the document in about 2 minutes. But with an 88k-record file it has been running for over 20 hours and shows as only halfway through!


4 replies

Inspiring
April 17, 2023

Thanks for all the feedback here! As I mentioned below, I ended up breaking the data into smaller chunks for import. During the process, it became clear that InDesign's import time does not scale linearly with the size of the data set: importing 1k records took about 2 min, 2k took 10 min, and 3k took 35 min.
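To put rough numbers on that non-linearity, here is a small Python sketch that fits a growth exponent to the three timings quoted above and estimates the total time for a chunked import. The helper names (`growth_exponent`, `chunked_minutes`) are made up for illustration, and the estimate assumes every chunk costs the same as the first one and ignores the time needed to combine the chunks afterwards.

```python
import math

# Timings reported in the post above: (records, minutes).
timings = [(1000, 2.0), (2000, 10.0), (3000, 35.0)]


def growth_exponent(p, q):
    """Exponent k such that time grows like records**k between two measurements."""
    (n1, t1), (n2, t2) = p, q
    return math.log(t2 / t1) / math.log(n2 / n1)


# One exponent per pair of consecutive measurements.
exponents = [growth_exponent(a, b) for a, b in zip(timings, timings[1:])]


def chunked_minutes(total_records, chunk_size, minutes_per_chunk):
    """Total import time if every chunk costs the same (an assumption)."""
    return math.ceil(total_records / chunk_size) * minutes_per_chunk


if __name__ == "__main__":
    for (a, b), k in zip(zip(timings, timings[1:]), exponents):
        print(f"{a[0]} -> {b[0]} records: exponent ~ {k:.2f}")
    print(chunked_minutes(88_000, 1000, 2.0), "minutes for 88 chunks of 1k")
```

With these three data points the exponent comes out at roughly 2.3 and then 3.1, so the cost grows faster than quadratically over this range. Under the equal-cost assumption, 88 chunks of 1k records would take about 176 minutes in total, which is why chunking pays off so heavily.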

After importing the chunks, I was able to combine all of the imported data into a single document. InDesign had no issues at all working with the final document containing all of the combined data. In fact, all told, it came out to only around 130 pages.

All of this tells me that the issue is with how InDesign handles the data import. If manually chunking the data fixes the issue, then InDesign should change its import process to do that chunking automatically.

Robert at ID-Tasker
Brainiac
April 17, 2023

Glad you were able to finish the job. 

 

I'd bet that the problem is the unlimited undo history. I'm sure someone can write you a small JavaScript script that first disables undo and then imports your data. Sorry, but I'm a VB6 & Win man 😉

 

Robert at ID-Tasker
Brainiac
April 10, 2023

What exactly are you trying to import? 

 

Maybe XML isn't the right approach in your case?

 

Inspiring
April 10, 2023

InDesign crashed and the import failed. I'm trying a couple of different approaches next: 1. Running the import on a beefier machine. 2. Breaking the dataset up into 10k chunks for import. I'll post here on how it goes.
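For anyone wanting to do the same split outside InDesign, here is a minimal Python sketch using only the standard library. It assumes each record is a direct child of the XML root element, which may not match your file. The function name `split_xml` and the `chunk_NNN.xml` file names are made up for this example. It loads the whole file into memory, and ElementTree may rewrite namespace prefixes, so check the output if your XML uses namespaces.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def split_xml(source, out_dir, chunk_size=10_000):
    """Split the root's direct children into files of at most chunk_size records.

    Assumes every record is a direct child of the root element.
    Returns the list of paths written, in order.
    """
    root = ET.parse(str(source)).getroot()
    records = list(root)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    paths = []
    for start in range(0, len(records), chunk_size):
        # Each chunk gets its own copy of the root tag and attributes.
        chunk_root = ET.Element(root.tag, dict(root.attrib))
        chunk_root.extend(records[start:start + chunk_size])
        path = out_dir / f"chunk_{start // chunk_size + 1:03d}.xml"
        ET.ElementTree(chunk_root).write(
            str(path), encoding="utf-8", xml_declaration=True
        )
        paths.append(path)
    return paths
```

Something like `split_xml("records.xml", "chunks", chunk_size=10_000)` (placeholder file names) would then give one file per 10k records, each importable on its own.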

James Gifford—NitroPress
Brainiac
April 10, 2023

Assuming a system is basically capable of running InDesign, I think the only parameter that might affect an import like this is available RAM. (That is, GPU would be irrelevant, and more CPU power is not likely to make much of a difference except perhaps in overall processing time. But RAM would be critical, and this assumes there's plenty of scratch disk space as well.) If the system that failed has 16GB or less, I'd look for a 32GB system to try above all other characteristics, and make sure there is plenty of free disk space as well; a free TB is probably not unreasonable.

 

And at that, the scale of the work may simply be past ID's capabilities, running into hard limits on the number of objects, etc. But more RAM is the right direction to try again.

 

James Gifford—NitroPress
Brainiac
April 7, 2023

I only have modest experience with XML in InDesign, but my first thought is that an XML file of that scale is simply beyond ID's grasp. InDesign does have limits on the number of component files, documents, pages, cross references and so forth, and while these limits are mostly quite generous (and accommodate even most advanced projects), I'd bet that such a huge multilayered structure is more than it can handle.

 

There's better expertise here, and either a better answer or a workable process might be forthcoming. But I'd start looking into ways to break the job into more manageable elements, perhaps by importing subsets of the XML file into individual ID chapters.