CS3 InDesign: need DOM help finding string with particular character style

Community Beginner, Apr 16, 2021


I'm trying to write a simple filter in jsx to reduce handwork, but I'm apparently missing something.

 

I'm trying to find substrings embedded in paragraphs where the substring itself has a certain character style applied.  For example, the target might be "foo" with the character style "cite", surrounded by other text: "othertext othertext othertext foo othertext othertext...."  I've tried various ways to say, with DOM references, "find every substring with character style 'cite' and return the substring and the page number where it was found", but I can't figure out how to do that.

 

I also needed to find paragraph headers for the TOC, but that worked the way I expected.  But finding embedded substrings that have a certain char style doesn't seem to work the same way at all.

 

Any help greatly appreciated.

 

Margaret

TOPICS
How to, Scripting

Views

158

Community Guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Adobe Community Professional, Apr 16, 2021


I shouldn’t think this 14 year old version of InDesign would work with anything.

Community Beginner, Apr 17, 2021


Surprisingly, it does still work quite well.  Of course I'm running it under Win 7, so that undoubtedly helps.

Adobe Community Professional, Apr 16, 2021


Hi Margaret,

you could use the Find/Change GREP functionality of InDesign CS3.

 

It's documented for ExtendScript (JavaScript) scripting here:

http://jongware.mit.edu/indesigncs3jshelp/

 

For everything GREP, see:

http://jongware.mit.edu/indesigncs3jshelp/pc_Application.html

http://jongware.mit.edu/indesigncs3jshelp/pc_FindGrepPreference.html

 

Can you give a GREP pattern that would find "foo" inside this:

"othertext othertext othertext foo othertext othertext...." ?

 

Do you want to find all "foo"s with character style "cite"?

What would you like to do with the found text?

 

Regards,
Uwe Laubender

( ACP )

Community Beginner, Apr 17, 2021


Yes, I want to find all "foo"s with the style "cite", by page.   Here's the example text I'm using for testing:

 

This is a paragraph with Foo v Bar, 123 Mass. 500 (2021) with regular text before and after

 

What I'm trying to do is to find every string styled as "cite" (Foo v Bar, 123 Mass. 500 (2021) in the test doc) and then assemble them into an Authorities page or pages.  The Authorities page(s) look(s) like a TOC in format (the text of the cite on the left, the list of page numbers on the right) but is more like an index, since a cite can appear on potentially many pages.
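
A minimal sketch of the aggregation step described above, assuming the found cites have already been collected as plain objects with a text and a page name. The function name buildAuthorities and the {text, page} input shape are hypothetical, chosen only for illustration. It is written in ES3-style JavaScript so it should also be usable from ExtendScript, and it uses a tab between the cite and its page list and "\r" between entries, as InDesign text does.

```javascript
// Hypothetical helper: groups {text, page} hits into index-style lines.
// The input shape is an assumption, e.g. [{ text: "Foo v Bar", page: "3" }].
function buildAuthorities(hits) {
    var pagesByCite = {};   // cite text -> set of page names
    var cites = [];         // distinct cite texts
    var i, text, page;

    for (i = 0; i < hits.length; i++) {
        text = hits[i].text;
        page = String(hits[i].page);
        if (!pagesByCite.hasOwnProperty(text)) {
            pagesByCite[text] = {};
            cites.push(text);
        }
        // Used as a set, so repeated hits on one page collapse to one entry
        pagesByCite[text][page] = true;
    }

    cites.sort();   // plain string sort of the cite texts

    var lines = [];
    for (i = 0; i < cites.length; i++) {
        var pages = [];
        for (page in pagesByCite[cites[i]]) {
            if (pagesByCite[cites[i]].hasOwnProperty(page)) {
                pages.push(page);
            }
        }
        // Numeric order where both names parse as numbers,
        // otherwise string order (e.g. roman-numeral page names)
        pages.sort(function (a, b) {
            var na = parseInt(a, 10), nb = parseInt(b, 10);
            if (isNaN(na) || isNaN(nb)) {
                return a < b ? -1 : (a > b ? 1 : 0);
            }
            return na - nb;
        });
        lines.push(cites[i] + "\t" + pages.join(", "));
    }
    return lines.join("\r");
}
```

Because a cite can appear on many pages, the page names are kept in a set per cite rather than in a flat list, which is what makes the result behave like an index rather than a TOC.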

 

I'd like to search for the character style because the text to which it's applied (the body of the cite) will vary.  Being able to search on the character style rather than the cites themselves would be a more general solution.  When adjusting sections and lines in the original document, which I did by hand since I had no time to spend studying the DOM, re-creating the Authorities pages again and again almost drove me mad.

 

Brute-forcing it would work in this particular instance by just searching for the text substrings, since I know what the cites are and I can search in a per-page loop.  But it seems odd not to be able to search on char styles when I can search on paragraph styles.  So I've been supposing that I'm just making some kind of simple mistake due to how complex the DOM appears to be and how little documentation there is that covers finding and extracting data from existing InDesign documents.  I'd guess from past experience that few people want to create documents under program control, though that's what the Tutorial and Scripting Guide focus on.

 

Adobe Community Professional, Apr 18, 2021


I don’t think this is using anything that wasn’t in the CS3 API, so it should work:

 

 

var st = app.activeDocument.stories;
var cs = app.activeDocument.characterStyles.itemByName("cite");
var sw = "foo";
var str = "Foos Found:\r";
var w;
for (var i = 0; i < st.length; i++) {
    // Each text style range is a run of identically formatted text
    w = st[i].textStyleRanges;
    for (var j = 0; j < w.length; j++) {
        // Keep ranges styled "cite" that contain the search text
        if (w[j].appliedCharacterStyle == cs && textSearch(sw, w[j]).length > 0) {
            str += w[j].contents + "\tPage: " + w[j].parentTextFrames[0].parentPage.name + "\r";
        }
    }
}

alert(str);

// Returns an array of the matches of fnd inside the text object txt
function textSearch(fnd, txt) {
    // Reset all find/change settings left over from earlier searches
    app.findTextPreferences = NothingEnum.NOTHING;
    app.changeTextPreferences = NothingEnum.NOTHING;
    app.findTextPreferences.findWhat = fnd;
    return txt.findText();
}

 

 

 

 

Screen Shot 42.png

Community Beginner, Apr 18, 2021


Hi Uwe, Rob

 

Big thanks to both of you, and abject apologies for not returning sooner---the penny finally dropped for me last night and I've been scrambling to clean up the code, which looks a dreadful kludge at the moment.  I'll post the whole thing once I have it sorted in case anyone else wants to do something similar without necessarily reinventing the entire wheel.  

Adobe Community Professional, Apr 19, 2021


"Yes, I want to find all "foo"s with the style "cite", by page."

 

Hi Margaret,

that's easy with GREP Find/Change in ExtendScript scripting, just as you would do it in the user interface. The only thing you have to be aware of is that some parts of a GREP pattern require double escaping.
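
A small illustration of the double escaping, using an assumed example pattern (not one from this thread) that would match the volume and page part of a cite such as "123 Mass. 500". The backslash is also the escape character in JavaScript string literals, so each backslash of the GREP pattern has to be written twice in code.

```javascript
// Pattern as it would be typed in the Find/Change dialog (assumed example):
//     \d+ Mass\. \d+
// In a JavaScript string literal every backslash is doubled:
var grepPattern = "\\d+ Mass\\. \\d+";
// grepPattern now holds the 14 characters: \d+ Mass\. \d+

// With single backslashes JavaScript drops them ("\d" becomes "d"),
// so a different pattern would be handed to InDesign:
var wrongPattern = "\d+ Mass\. \d+";
// wrongPattern holds: d+ Mass. d+
```

In a script, the first string is the one to assign to app.findGrepPreferences.findWhat.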

 

Below is some minimal code that returns an array of the found formatted texts. In my example these are very likely word objects, so the result array contains e.g. [object Word],[object Word],[object Word] if three items of text are found.

 

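var resultArrayFormattedText = [];

// Reset any find settings left over from earlier searches
app.findGrepPreferences = null;

var doc = app.documents[0];
var grepPattern = "foo";
var charStyleToFind = doc.characterStyles.itemByName("cite");

// Search for the pattern, restricted to text with the character style applied
app.findGrepPreferences.properties =
{
	findWhat : grepPattern,
	appliedCharacterStyle : charStyleToFind
};

// findGrep() returns an array of the matching text objects
resultArrayFormattedText = doc.findGrep();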

 

After you have run the code, inspect the GREP tab of the Find/Change dialog in your UI.

It should look like this (from my German InDesign CS6):

FindGREP-foo-cite-CS6.PNG

 

That means, you could use the user interface to test a GREP pattern and read out its properties for ExtendScript with e.g.

app.findGrepPreferences.findWhat

to see what the findWhat string should look like in your code.

 

Regards,
Uwe Laubender

( ACP )

Adobe Community Professional, Apr 19, 2021


What's missing is to sort the result array by page.

InDesign CS3 has the parentTextFrames array for texts that you can use to get the text frame(s) of a found text.

But unfortunately InDesign CS3 is missing the parentPage property of a text frame. That was introduced with InDesign CS5. So you need to work around this and write your own function to get the page of a text frame.

For example, you could go through all text frames in the document page by page and compare each one with the text frame of a found text.
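
A minimal sketch of that page-by-page comparison, assuming the frames are compared by their id property. The function name pageOfTextFrame is hypothetical. As written it only looks at text frames that sit directly on a page, so frames inside groups, anchored frames, and frames on the pasteboard would not be found.

```javascript
// Hypothetical helper for versions without parentPage on text frames.
// Walks every page and compares each of its text frames with the target by id.
// Assumption: the frame sits directly on a page (not in a group, not anchored).
function pageOfTextFrame(doc, frame) {
    var p, f, frames;
    for (p = 0; p < doc.pages.length; p++) {
        frames = doc.pages[p].textFrames;
        for (f = 0; f < frames.length; f++) {
            if (frames[f].id === frame.id) {
                return doc.pages[p];
            }
        }
    }
    return null;   // not found on any page
}
```

For a found text, the call could look like pageOfTextFrame(doc, foundText.parentTextFrames[0]), with the page name then read from the returned page's name property.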

 

Regards,
Uwe Laubender

( ACP )

Adobe Community Professional, Apr 19, 2021


Here's a function, written originally by Dave Saunders, that returns a page item's parent page:

 

function findPage (o) {
  // Text objects have a baseline; start from their first containing text frame
  if (o.hasOwnProperty ('baseline')) {
    o = o.parentTextFrames[0];
  }
  // Walk up the object hierarchy until a page is reached
  while (o != null) {
    // CS5 and later: page items expose parentPage directly
    if (o.hasOwnProperty ('parentPage')) {
      return o.parentPage;
    }
    switch (o.constructor) {
      case Page : return o;
      case Character : o = o.parentTextFrames[0]; break;
      case Cell : o = o.insertionPoints[0].parentTextFrames[0]; break;
      case Note : ; case Footnote : o = o.storyOffset; break;
      case Application : return null; // reached the top without finding a page
    }
    if (!o) return null;
    o = o.parent;
  }
  return o;
}
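
A sketch of how a page-lookup function like this could be combined with a findGrep() result to get the text and page of every hit. The name collectHits, the pageOf parameter, and the "(no page)" label are hypothetical. The lookup is passed in as a parameter so that any function returning a page (or nothing) for a text object could be used.

```javascript
// Hypothetical helper: turns an array of found text objects into
// plain { text, page } records, using a caller-supplied page lookup.
function collectHits(found, pageOf) {
    var hits = [];
    var i, page;
    for (i = 0; i < found.length; i++) {
        page = pageOf(found[i]);
        hits.push({
            text: found[i].contents,
            // The lookup may return nothing, e.g. for overset text
            page: page ? page.name : "(no page)"
        });
    }
    return hits;
}

// Possible use inside InDesign, after setting app.findGrepPreferences:
//     var hits = collectHits(app.documents[0].findGrep(), findPage);
```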
