K.Daube
Community Expert
October 4, 2016
Answered

ES regex does not recognise umlauts etc...


Friends of regular expressions!

I want to check for valid (special) variable names and use

\w

for word characters. This does not match characters such as ä, or any other non-ASCII character.
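
A quick check illustrates the limitation. In ECMAScript regular expressions, `\w` covers only the ASCII set `[A-Za-z0-9_]`, so accented letters fall outside it. The snippet below is a minimal sketch in plain ES3-style JavaScript (the dialect ExtendScript is based on); the helper name `isWordChar` is made up for illustration and is not part of any API.

```javascript
// Hypothetical helper: true if the single character ch matches \w.
function isWordChar(ch) {
  return /^\w$/.test(ch);
}

var results = [
  isWordChar("a"),       // ASCII letter
  isWordChar("_"),       // underscore
  isWordChar("7"),       // digit
  isWordChar("\u00E4"),  // a-umlaut (ä)
  isWordChar("\u00E9")   // e-acute (é)
];
// results: true, true, true, false, false
```

The non-ASCII characters are written as `\u` escapes so that the result does not depend on the encoding of the script file.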

My test script is this:

var toTest = [], j, n;
    toTest.push("#Apple");                            // true
    toTest.push("#Apfel, #apple, #Gurkensalateska");  // true
    toTest.push("#t_27Apples");                       // true
    toTest.push("#côté");                             // false
    toTest.push("#Grüne_5Äpfel");                     // false
    toTest.push("#Äpfel");                            // false

n = toTest.length;
for (j=0; j < n; j++) {
  $.writeln(ContainsVariable(toTest[j]), " ", toTest[j]);  // test one string per iteration
}

function ContainsVariable (text) {
  var re_variable = /(^#[\w][\d_\w]*)(, +)?/,
      indexOfChar, character, lSkip, kText, sFound;  // sFound declared to avoid a global

  for(indexOfChar = 0; character = text[indexOfChar]; indexOfChar++) {
    kText = text.substring(indexOfChar); // rest of the statement
    sFound = kText.match(re_variable);
    if (sFound != null) {
      lSkip = sFound[0].length;
      indexOfChar = indexOfChar + lSkip -1;
      continue;
    }
    return false;
  }
  return true;
}

Any ideas how to get valid results for non-English variable names?

There seem to be other shortcomings also: indesign-scripting-forum-roundup-7#hd4sb​3.

I have not found any documentation about the regex flavour of ES. Isn't there any?

Klaus

Correct answer K.Daube

Well, friends, this page gave me a hint:

var re_variable= /(^#[A-Za-z\u00C0-\u017E][\d_A-Za-z\u00C0-\u017E]*)(, +)?/,

works as intended.
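
To see the effect, the letter class can be tried against the test strings from the question. The following is only a sketch under a few assumptions: `LETTER`, `NAME`, `re_list` and `isVariableList` are hypothetical names, and the character-by-character loop is replaced by one anchored pattern that is somewhat stricter (names must be separated by a comma and at least one space, with no trailing separator). One caveat: the range `\u00C0-\u017E` also contains the multiplication sign × (U+00D7) and the division sign ÷ (U+00F7), which are not letters. The variant below splits the range to leave them out; that tightening is optional and goes beyond the pattern above.

```javascript
// Letter class from the answer, with the range split so that
// U+00D7 (multiplication sign) and U+00F7 (division sign) are excluded.
var LETTER = "A-Za-z\\u00C0-\\u00D6\\u00D8-\\u00F6\\u00F8-\\u017E";

// One variable: "#", a letter, then letters, digits or underscores.
var NAME = "#[" + LETTER + "][\\d_" + LETTER + "]*";

// One or more variables separated by a comma and at least one space.
var re_list = new RegExp("^" + NAME + "(?:, +" + NAME + ")*$");

// Hypothetical helper: true if text is a well-formed list of variables.
function isVariableList(text) {
  return re_list.test(text);
}

var checks = [
  isVariableList("#Apple"),
  isVariableList("#Apfel, #apple, #Gurkensalateska"),
  isVariableList("#t_27Apples"),
  isVariableList("#c\u00F4t\u00E9"),           // "#côté"
  isVariableList("#Gr\u00FCne_5\u00C4pfel"),   // "#Grüne_5Äpfel"
  isVariableList("#\u00C4pfel"),               // "#Äpfel"
  isVariableList("#a\u00D7b")                  // "#a×b", rejected by the split range
];
// checks: true, true, true, true, true, true, false
```

The class still covers only Latin letters up to U+017E; other scripts such as Greek or Cyrillic would need further ranges added to `LETTER`.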

Thanks for listening

Klaus
