Iterating over all text?
I'd like to iterate over all the text in a document (inside groups, tables, etc., etc.) and not miss any bizarre corner cases.
I thought I had seen a script from Marc Autret that addressed this, but I couldn't find it (instead, I found the [JS][CS3] Getting Page Number but I thread, which basically goes the other way).
I recently discovered that the method I had been using misses text inside tables. And the version I wrote a while ago initially missed items inside groups. So I'm wondering if someone has a tried-and-true function that does this kind of thing.
Here's what I have -- this works fine without tables. Looks for "@@" in any textbox in the document:
// Assumes doc, lines (an array) and debug are defined elsewhere in the script.
var i, j;
for (i = 0; i < doc.pages.length; i++) {
    var p = doc.pages[i];
    for (j = 0; j < p.masterPageItems.length; j++)
        check_at("Master on p." + p.name, p.masterPageItems[j]);
    for (j = 0; j < p.pageItems.length; j++)
        check_at("On p." + p.name, p.pageItems[j]);
}

function check_at(name, pi) {
    if (debug) $.writeln(pi.constructor.name + " on " + name);
    if ('contents' in pi &&
        pi.contents.match("@@")) {
        // Record a short excerpt around the first "@@" for the report.
        var i = pi.contents.indexOf("@@");
        var s = Math.max(i - 23, 0);
        var e = Math.min(i + 23, s + 37);
        lines.push(name + ": "
            + pi.contents.substring(s, e).replace(/\r/g, "\\n"));
    }
    if ('pageItems' in pi) // recurse into groups
        for (var k = 0; k < pi.pageItems.length; k++)
            check_at(name + "", pi.pageItems[k]);
}
but this fails on tables, because a TextFrame's contents property does not return the contents of a table.
(I also realized today that the group handling could be ignored if I just used "allPageItems" instead of "pageItems").
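Roughly what I mean by that, as an untested sketch rather than a finished function. It assumes allPageItems really is a flat array that already includes the items nested inside groups. The names findMarker and label are just ones I made up for illustration, and the function only looks at the contents property, so it still has the table problem:

```javascript
// Sketch: scan a flat list of page items for a marker string.
// Inside InDesign the list is assumed to come from page.allPageItems;
// any array of objects works, so this can be tried outside InDesign too.
function findMarker(items, marker, label) {
    var hits = [];
    for (var k = 0; k < items.length; k++) {
        var item = items[k];
        // Skip items without text (or whose contents is not a plain string).
        if ('contents' in item && typeof item.contents === "string" &&
            item.contents.indexOf(marker) !== -1) {
            hits.push(label + ": " + item.contents.replace(/\r/g, "\\n"));
        }
    }
    return hits;
}

// Hypothetical usage inside InDesign (doc and lines as in my script above):
//   for (var i = 0; i < doc.pages.length; i++) {
//       var p = doc.pages[i];
//       lines = lines.concat(findMarker(p.allPageItems, "@@", "On p." + p.name));
//   }
```

No recursion needed this way, since there is no group handling left to do.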
Anyhow, I guess I could also iterate over pi.tables and for each one, check .contents.join("\n").match("@@"). Since the contents of a table are an array, that would be joining all the cells together into one string and searching that string.
But I'm worried this is insufficiently robust? And it certainly is ugly.
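For concreteness, the table idea would look something like the following. This is only a sketch: it assumes a table's contents comes back as an array with one entry per cell, and that frame.tables can be indexed like an array. textWithTables and hasMarker are made-up names.

```javascript
// Sketch: gather a frame's own text plus the text of its tables' cells.
function textWithTables(frame) {
    var parts = [];
    if (typeof frame.contents === "string") parts.push(frame.contents);
    if (frame.tables) {
        for (var t = 0; t < frame.tables.length; t++) {
            var cells = frame.tables[t].contents;
            // Assumption: cells is an array of cell contents; join them into one string.
            if (cells && typeof cells.join === "function") parts.push(cells.join("\n"));
        }
    }
    return parts.join("\n");
}

// True if the marker appears in the frame text or in any table cell.
function hasMarker(frame, marker) {
    return textWithTables(frame).indexOf(marker) !== -1;
}
```

As far as I can tell this would still miss a table nested inside a cell of another table, which is part of why I doubt it is robust.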
Any good experience on this sort of thing? Thanks.