testing for special characters? (JS Win CS5.5)

Explorer, Jul 25, 2016

Is there some sort of boolean or function to tell if a character is a member of the group of SpecialCharacters?

I need to test whether a range of text ends in a carriage return. I had used a loop that tested the last character of found items with myFoundItems.characters[-1].contents.charCodeAt() == 13, which was all well and good except when the last character happens to be a special character. Since special characters don't properly return a character code, the script errors out. I saw useful posts on filtering out and translating special characters to Unicode, but nothing on a straightforward test.

Maybe I'm still enough of a rube to miss it in Jongware's reference... my skills are such that it looks like I'd need to identify and test all possibilities (ugh!). It seems like this would come up often enough that someone might have a handy function lying around...

Thanks for any wisdom on this point.

TOPICS
Scripting

Explorer, Jul 26, 2016

I found a test, but I'd like to know if it is ill-advised. Calling .valueOf() on a special character's contents returns a 10-digit number, so this code just turns the number into a string and checks the length. Since normal text and low-order ASCII characters, like tabs or returns, are single-character strings, they evaluate to a length of 1.

for (var i = myFoundItems.length - 1; i > -1; i--) {
  // Special characters give a 10-digit number here; ordinary characters give a 1-character string.
  if ((myFoundItems[i].characters[-1].contents.valueOf() + "").length < 10) {
    if (myFoundItems[i].characters[-1].contents.charCodeAt() == 13) {
      // now safely determined this character's a return...
    }
  }
}

Guide, Jul 26, 2016 (marked as the correct answer)

Hi,

1. To check whether myChar.contents is a SpecialCharacters item, I think you can simply use the test

( 'object' == typeof myChar.contents )

since in any other case typeof will return 'string' (assuming myChar is a valid Character instance.)

2. Anyway, you can always and safely convert a single Character (or any Text) into a pure JS string using

myChar.texts[0].contents

This trick prevents the contents property from returning a SpecialCharacters item even if myChar is a 'special character.'
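The typeof test from point 1 can be wrapped in a small helper. This is only a sketch: `isSpecialCharacter` is a hypothetical name, and the two `fake…` objects are stand-ins for InDesign Character objects so the logic can be read outside InDesign. Inside a script you would pass real characters instead.

```javascript
// Hypothetical helper built on the typeof test from point 1.
function isSpecialCharacter(myChar) {
    return 'object' == typeof myChar.contents;
}

// Stand-ins for InDesign Character objects (assumption, for illustration only).
var fakeReturn = { contents: '\r' };
var fakeSpecial = {
    contents: { toString: function () { return 'FOOTNOTE_SYMBOL'; } }
};

var results = [isSpecialCharacter(fakeReturn), isSpecialCharacter(fakeSpecial)];
```

With these stand-ins, the first entry of `results` is false and the second is true.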

More on InDesign special characters and Unicode issues:

Indiscripts :: InDesign CS4/CS5 Special Characters [Update]

@+

Marc

Guide, Jul 26, 2016

Oops!

In CS4 the SpecialCharacters enumeration wasn't coercing items into objects yet. Numbers were used instead.

So in your case the test looks like

( 'number' == typeof myChar.contents )

But I guess the texts[0] trick was already valid.

@+

Marc

[Edited, thanks to Uwe Laubender]

Community Expert, Jul 26, 2016

dsackett,

do you want to detect these kinds of special characters?

Adobe InDesign CS5 (7.0) Object Model JS: SpecialCharacters
( There could be others as well. )

Hi Marc,

I just tested with my InDesign CS5.5 on Mac OSX 10.7.5.

var doc = app.documents[0];

var textFrame = doc.textFrames.add({geometricBounds : [0,0,"100mm","100mm"] , contents : "\n"});

textFrame.characters[0].contents

// Result: FORCED_LINE_BREAK

Best,
Uwe

Community Expert, Jul 26, 2016

And we can take that a bit further:

var doc = app.documents[0];

var textFrame = doc.textFrames.add({geometricBounds : [0,0,"100mm","100mm"] , contents : "\n"});

var result = textFrame.characters[0].contents;

result.constructor.name

// Result: Enumerator

Uwe

Guide, Jul 27, 2016

Thanks Uwe,

So my 'Oops' was wrong. [Should we cancel an 'oops' using !oops?]

Indeed I had lost track of the history of InDesign's Enumeration/Enumerator paradigm, which was introduced in CS5, so it is already available in CS5.5 and my original reply should be correct.

And yet we already discussed that specific question here:

Re: Search document for SpecialCharacters Enumerator

@+

Marc

Community Expert, Jul 27, 2016

Yes, we already discussed that. And with a reason.
The results for .contents can be a bit "ambiguous". It depends on how and what exactly we ask.

See the following example from InDesign CS5.5 on Mac OSX where a text frame of a threaded story is selected. One footnote involved, a quarter space, one column break and a frame break:

[Screenshot: StoryWithSomeSpecialCharacters.png]

Tested with code below. Results are documented with $.writeln() .

// Text or text frame selected
var story = app.selection[0].parentStory;

// Note: texts[0] is used:
var characters = story.texts[0].characters.everyItem().getElements();
var length = characters.length;
var constructorName = "Enumerator";

for (var n = 0; n < length; n++)
{
    // Pick the n-th Character from the resolved array
    var character = characters[n];

    if (character.contents.constructor.name == constructorName)
    {
        $.writeln
        (
            n
            + "\t" +
            /* Example result: 1399221837 */
            character.contents
            + "\t" +
            /* Example result: Enumerator */
            character.contents.constructor.name
            + "\t" +
            /* Example result: FOOTNOTE_SYMBOL */
            character.contents.toString()
        );
    }
}

Results in the JavaScript Console of the ESTK:

/*

427    1399221837    Enumerator    FOOTNOTE_SYMBOL

476    1397847379    Enumerator    QUARTER_SPACE

514    1396927554    Enumerator    COLUMN_BREAK

822    1397125698    Enumerator    FRAME_BREAK

*/

This time the value of contents is of type Enumerator, but it is delivering "numbers".
I had to use the toString() method to get to the "names" of the enumerators.

It may well be that $.writeln() plays its part in this kind of presentation.
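To make the two views of such a value explicit, one option is to call valueOf() and toString() directly rather than rely on implicit conversion. The following is a sketch only: `FakeEnumerator` is a hypothetical stand-in that carries a number and a name like the results listed above, and `describeContents` is an invented helper name.

```javascript
// Hypothetical stand-in for an InDesign Enumerator (assumption):
// it exposes a numeric value and a name.
function FakeEnumerator(value, name) {
    this.valueOf = function () { return value; };
    this.toString = function () { return name; };
}

// Hypothetical helper: report number and name explicitly.
function describeContents(contents) {
    if (typeof contents === 'string') {
        return 'string\t' + contents;
    }
    return contents.valueOf() + '\t' + contents.toString();
}

var line = describeContents(new FakeEnumerator(1399221837, 'FOOTNOTE_SYMBOL'));
// line is "1399221837\tFOOTNOTE_SYMBOL"
```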

Thanks,
Uwe

Guide, Jul 27, 2016

Hi Uwe,

Weird, isn't it? But there are good reasons that explain all you've seen.

1. First, the texts[0] trick works only if you use it just before calling the contents property. In your example, the line

  characters = story.texts[0].characters.everyItem().getElements();

returns an array of Character instances, regardless of the fact that texts[0] is used upstream. Then

  characters[n].contents

will still return an Enumerator instance (assuming we are on a special character). Your texts[0] has no effect because a Character specifier, …characters.everyItem(), has translated your stuff back into pure Character objects.

But if you change your line

  var character = characters[n];

into

  var character = characters[n].texts[0];

then the trick will work again, .contents will return strings and the Enumerator issue will vanish.
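The point about applying texts[0] at the moment of reading can be sketched as a one-line helper. `plainContents` is a hypothetical name, and `fakeCharacter` is a stand-in that only mimics the behaviour described above (an object from .contents, a string from .texts[0].contents); it is not InDesign's implementation. The forced line break is used because the thread shows "\n" being reported as FORCED_LINE_BREAK.

```javascript
// Hypothetical helper: apply texts[0] immediately before reading contents.
function plainContents(character) {
    return character.texts[0].contents;
}

// Stand-in for a Character sitting on a forced line break (assumption).
var fakeCharacter = {
    contents: { toString: function () { return 'FORCED_LINE_BREAK'; } },
    texts: [ { contents: '\n' } ]
};

var direct = typeof fakeCharacter.contents;          // 'object'
var viaTexts = typeof plainContents(fakeCharacter);  // 'string'
```

In the loop discussed above, this would correspond to reading characters[n].texts[0].contents for each element.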

2. Now, about the fact that the expression "blabla" + myEnumerator returns a string based on myEnumerator.valueOf(), i.e. the underlying Number, instead of using myEnumerator.toString(), i.e. the underlying String: this is not connected to the fact that you are using $.writeln(). No, that's a generic BUG of the Enumerator object, which hasn't been implemented with respect to ECMA's specification regarding type conversion.

Normally, given an Object myObj and a literal string "blabla", the string concatenation "blabla" + myObj should implicitly invoke myObj.toString() in order to achieve the proper type conversion before concatenation. By the way, in JavaScript we often use the syntax ""+myObj as a shortcut for myObj.toString(). That's a quick way of extracting the string representation of an object.

But the Enumerator type doesn't follow this rule. I suspect that the + operator hasn't been fully and properly implemented. Used as a unary operator (+myEnumerator) we get a Number, as required. Indeed, +myObj is usually a shortcut for myObj.valueOf(), where the valueOf method is supposed to return a Number. But, regarding the expression anything + myObj, the JS interpreter must select either an addition in case anything is a Number (and then myObj.valueOf() is the right guy), or a concatenation (in case anything is a String, and then myObj.toString() should be invoked).

The problem with ExtendScript is that +myEnumerator means myEnumerator.valueOf(), which is fine, but that myString+myEnumerator is wrongly interpreted as myString + myEnumerator.valueOf(), which finally leads to the string concatenation

  myString + (myEnumerator.valueOf()).toString()

instead of

  myString + myEnumerator.toString()

That's why we get the following (and surprising) result:

var specialChar = SpecialCharacters.BULLET_CHARACTER;

alert( specialChar.__class__ );    // => Enumerator

alert( specialChar );              // => BULLET_CHARACTER

alert( '' + specialChar );         // => 1396862068

The alert( specialChar ) call is OK. Since alert() expects a string, specialChar.toString() is implicitly invoked.

But the alert( '' + specialChar ) result is a bug, as explained above.
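For comparison, the same three conversions can be reproduced in plain JavaScript with an ordinary object that defines both methods. This is a sketch, not InDesign code, and `fakeEnum` is a hypothetical stand-in. One caveat worth checking against the specification: in standard ECMAScript the binary + converts an object operand with the default hint, which consults valueOf() before toString(), so an object like this one also yields its number under '' + obj in a conforming engine. String(obj) or obj.toString() is the explicit way to ask for the name.

```javascript
// Hypothetical stand-in with both conversions defined (assumption).
var fakeEnum = {
    valueOf: function () { return 1396862068; },
    toString: function () { return 'BULLET_CHARACTER'; }
};

var asNumber = +fakeEnum;           // unary plus uses valueOf()
var concatenated = '' + fakeEnum;   // binary plus consults valueOf() first
var explicit = String(fakeEnum);    // String() prefers toString()
```

Either way, the practical advice is the same: call toString() explicitly when the name is wanted.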

@+

Marc

Community Expert, Jul 27, 2016

Wow. Great explanation, Marc!
Thank you very much.
Uwe

Explorer, Jul 28, 2016

Marc & Uwe,

Wow--what a fascinating and edifying discussion--thank you so much!

I am putting the myChar.texts[0].contents trick to use. What I came up with worked OK, but it is most like the '' + specialChar example above. Given that it relies on a bug, how InDesign's JavaScript interpreter handles it could change, so it is probably not the best basis for code that could have a life beyond our current version. You probably averted some hair-pulling bug in the future. Thanks again.

Community Expert, Aug 01, 2023

@dsackett said:

"… used a loop that tested the last character of found items with myFoundItems.characters[-1].contents.charCodeAt() == 13 …"

"… Since special characters don't properly return a character code, the script errors out. …"

 

NOTE 1:

The following test for this case will work:

myFoundItems.characters[-1].contents == "\r";
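As a sketch of how that comparison might be packaged, the helper below checks the last character without calling charCodeAt(). `endsWithReturn` is a hypothetical name, the strict === comparison is a choice made here so that a non-string value can never match, and the `fake…` objects merely imitate the characters[-1] lookup for illustration.

```javascript
// Hypothetical helper: compare the last character's contents directly.
function endsWithReturn(textObj) {
    return textObj.characters[-1].contents === '\r';
}

// Stand-ins for found text objects (assumption); the key '-1' imitates
// InDesign's "last item" index.
var fakeEndingInReturn = { characters: { '-1': { contents: '\r' } } };
var fakeEndingInSpecial = {
    characters: { '-1': { contents: { toString: function () { return 'FRAME_BREAK'; } } } }
};

var checks = [endsWithReturn(fakeEndingInReturn), endsWithReturn(fakeEndingInSpecial)];
```

With these stand-ins, `checks` holds true for the first object and false for the second, and no error is raised for the special character.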

 

NOTE 2:

In case character.contents is of kind SpecialCharacters.* ( enumerator )

https://www.indesignjs.de/extendscriptAPI/indesign-latest/#SpecialCharacters.html

try {
    myFoundItems.characters[-1].contents.charCodeAt();
} catch (e) {
    alert( "ERROR: " + e.number + ", " + e.message );
}

 

charCodeAt() will throw this error:

"ERROR: 24, myFoundItems.characters[-1].contents.charCodeAt is not a function"

 

This part:

charCodeAt is not a function

is a bit strange.

 

* There are other characters one could indeed call "special" like e.g. <FEFF> that have no enumeration where contents.charCodeAt() will return a value without any error. For <FEFF> this is 65279.
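A guarded variant of the charCodeAt() call can cover both situations in the notes above. This is a minimal sketch under the assumption that a non-string contents value should simply yield null; `safeCharCode` is a hypothetical helper name, and the empty object stands in for an enumerator value.

```javascript
// Hypothetical helper: return the character code, or null when contents
// is not a non-empty string (for example an enumerator value).
function safeCharCode(contents) {
    if (typeof contents === 'string' && contents.length > 0) {
        return contents.charCodeAt(0);
    }
    return null;
}

var codes = [safeCharCode('\r'), safeCharCode('\uFEFF'), safeCharCode({})];
// codes is [13, 65279, null]
```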

 

Regards,
Uwe Laubender
( Adobe Community Expert )
