Participant
May 21, 2026
Answered

Illustrator and scripting inconsistent words[] when tabs used

Can someone explain the results when the following is run on an Illustrator document with one text frame?

var myDoc = app.activeDocument;
var myTextFrame = myDoc.textFrames[0];
myTextFrame.contents = "Lorem ipsum dolor sit amet\rLorem\tipsum\tdolor\tsit\tamet";

alert(myTextFrame.words.length);
alert(myTextFrame.lines[0].words.length);
alert(myTextFrame.lines[1].words.length);

// Show each line's words separated by "#"
for (var l = 0; l < 2; l++) {
    var concat = "#";
    for (var i = 0; i < myTextFrame.lines[l].words.length; i++) {
        concat += myTextFrame.lines[l].words[i].contents;
        concat += "#";
    }
    alert(concat);
}

The alerts show, in order:

14
5
9
#Lorem#ipsum#dolor#sit#amet#
#Lorem#Lorem#ipsum#ipsum#dolor#dolor#sit#sit#amet#

Why does separating the words by tabs seem to duplicate all but the last word in the line?

Is this a bug, and if not, what is the logic behind it?

I believe I have the latest version of Illustrator, downloaded via CC.

    Correct answer jduncan

    Tab seems to be considered a character/word when included with other characters (a-z, etc.).

    var myDoc = app.activeDocument;
    var myTextFrame = myDoc.textFrames[0];
    myTextFrame.contents = "a\t\t\t\tb";

    alert(myTextFrame.words.length);
    // result is 6

    Please note that when the text frame contains only tabs (as below), the tab characters are not counted as words...

    var myDoc = app.activeDocument;
    var myTextFrame = myDoc.textFrames[0];
    myTextFrame.contents = "\t\t\t\t";

    alert(myTextFrame.words.length);
    // result is 0

    Except if you separate them with spaces…

    var myDoc = app.activeDocument;
    var myTextFrame = myDoc.textFrames[0];
    myTextFrame.contents = "\t \t \t \t";

    alert(myTextFrame.words.length);
    // result is 3

    And I’m sure there are other weird edge cases. Unfortunately, the API docs don’t detail how Illustrator handles whitespace, so you would have to test all the edge cases yourself if you needed this to work correctly in production.

    You could count the words yourself with a custom function instead of relying on the `.words` collection.

    function countRealWords(tf) {
        var s = tf.contents || "";
        // A word is a run of letters/digits, optionally joined by apostrophes or hyphens
        var matches = s.match(/[A-Za-z0-9]+(?:['’-][A-Za-z0-9]+)*/g);
        return matches ? matches.length : 0;
    }
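
As a quick check of the regular expression used above, here is a string-only variant run against the strings from this thread. The name `countWordsInString` is made up for this sketch. It is plain ES3-style JavaScript and does not touch the Illustrator API, so it only shows what the pattern counts, not what `.words` returns.

```javascript
// Hypothetical helper: same regular expression as countRealWords, but takes a plain string.
function countWordsInString(s) {
    var matches = (s || "").match(/[A-Za-z0-9]+(?:['’-][A-Za-z0-9]+)*/g);
    return matches ? matches.length : 0;
}

// Test strings taken from the question and the answer above
var samples = [
    "Lorem ipsum dolor sit amet",
    "Lorem\tipsum\tdolor\tsit\tamet",
    "a\t\t\t\tb",
    "\t \t \t \t"
];

var counts = [];
for (var i = 0; i < samples.length; i++) {
    counts.push(countWordsInString(samples[i]));
}
// counts is [5, 5, 2, 0]
```

Because the pattern only matches letters and digits, tabs and spaces never contribute to the count, whichever one separates the words.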

     

    3 replies

    CarlosCanto
    Community Expert
    May 27, 2026

    Try this snippet:

    var myDoc = app.activeDocument;
    var myTextFrame = myDoc.textFrames[0];
    myTextFrame.contents = "Lorem ipsum dolor sit amet\rLorem\tipsum\tdolor\tsit\tamet";

    alert(getWordCount(myTextFrame.contents));
    alert(getWordCount(myTextFrame.lines[0].contents));
    alert(getWordCount(myTextFrame.lines[1].contents));


    function getWordCount(text) {
        // Strip leading and trailing whitespace so they do not produce empty entries
        var trimmed = text.replace(/^\s+|\s+$/g, "");
        if (trimmed === "") {
            return 0;
        }

        // Split on any run of whitespace characters (spaces, tabs, returns)
        var words = trimmed.split(/\s+/);

        // Show the words that were found
        alert(words.join("**\n"));

        return words.length;
    }

     

    Participant
    May 27, 2026

    The trouble is that the word count is just one example of the API model not working. What I actually want is to change the colour of various words on each line, depending on other words in that line. If the reported second word incorrectly shows up as a copy of the first, and the real second word shows up as the third and fourth, then even if I code around this, I have no confidence that my code will keep working in future versions.
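
One way to avoid depending on the `words` collection for this kind of task is to work from a line's `contents` string and compute each word's offset directly. The sketch below covers only the string part. The helper name `findWordSpans` is made up, and it assumes that offsets into `contents` line up with character positions in the line, which would need checking in Illustrator. Mapping the offsets back to characters in the text frame to recolour them is left out.

```javascript
// Hypothetical helper: returns the words of a line together with their character offsets.
function findWordSpans(text) {
    var re = /\S+/g; // a "word" here is any run of non-whitespace characters
    var spans = [];
    var m;
    while ((m = re.exec(text)) !== null) {
        spans.push({ text: m[0], start: m.index, length: m[0].length });
    }
    return spans;
}

// Second line from the question: five words separated by tabs
var spans = findWordSpans("Lorem\tipsum\tdolor\tsit\tamet");
// spans.length is 5; spans[1] is { text: "ipsum", start: 6, length: 5 }
```

Since the spans come from the string itself, each word appears once, regardless of how many entries the `words` collection reports for the same line.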

    jduncan
    Community Expert
    May 27, 2026

    As I iterate over the “words” from the API, I would check to see if the word contents were just whitespace and if so, skip/ignore them (or offset a counter). Without more details, I can’t really provide a concrete code example.
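
A minimal sketch of the skip-whitespace idea described above, written against a stand-in array rather than the real API. The function name `realWordIndices` and the sample data are invented for illustration. It assumes only that each entry has a string `contents` property and that the collection has a `length` and can be indexed.

```javascript
// Returns the indices of entries whose contents hold at least one non-whitespace character.
// `words` is any array-like of objects with a string `contents` property.
function realWordIndices(words) {
    var keep = [];
    for (var i = 0; i < words.length; i++) {
        if (/\S/.test(words[i].contents)) {
            keep.push(i);
        }
    }
    return keep;
}

// Stand-in data for illustration only; this is not actual Illustrator output.
var fakeWords = [{ contents: "a" }, { contents: "\t" }, { contents: " " }, { contents: "b" }];
var kept = realWordIndices(fakeWords);
// kept is [0, 3]
```

Note that in the question's output the extra entries repeat the previous word's text rather than showing a tab, so for that document this filter alone may not remove them and might need to be combined with another check.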

    jduncan
    Community Expert
    Correct answer (see above)
    May 22, 2026