Loic.Aigon
Legend
June 8, 2016
Answered

"pure" JS regular expression for any unicode capital letter


Hi all,

Banging my head against the wall here. Is there any chance a pure JS regular expression could match any Unicode capital letter within a string?

I would like to avoid GREP Find/Change.

Thanks in advance for any hints,

Loic

Correct answer TᴀW

pixxxel schubser wrote:

But you should not work with very long strings.

Better to use Unicode ranges, like this:

… contents.replace(/[A-Z]/g, replaceWithWhatever); // is the same as

… contents.replace(/[\u0041-\u005A]/g, replaceWithWhatever);

… contents.replace(/[A-ZÀ-Ö]/g, replaceWithWhatever); // new range added; is the same as

… contents.replace(/[\u0041-\u005A\u00C0-\u00D6]/g, replaceWithWhatever);

and so on
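A minimal sketch to check that the literal and escaped ranges above behave the same. The sample text and the "#" replacement are made up for illustration; they are not from the thread.

```javascript
// Two spellings of the same character class: literal characters vs. \u escapes.
var literal = /[A-ZÀ-Ö]/g;
var escaped = /[\u0041-\u005A\u00C0-\u00D6]/g;

// Hypothetical sample text: "À la carte, Öl, Zebra, été"
var text = "\u00C0 la carte, \u00D6l, Zebra, \u00E9t\u00E9";

var a = text.replace(literal, "#");
var b = text.replace(escaped, "#");
// Both give "# la carte, #l, #ebra, été"; the lowercase letters are untouched.
```

The escaped form is longer to type, but it does not depend on the encoding the script file is saved in.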

Good point. So, here's a function that will convert any string to its shortened form for GREP:

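function shortenString(s){
  var r1, r2, a, i;
  if (!s) return "";
  // r1 and r2 are the first and last characters of the current run.
  r1 = r2 = s.charAt(0);
  a = [];
  // Loop one step past the end so the final run is flushed as well.
  for (i = 1; i <= s.length; i++){
    // Extend the run while the next character's code is exactly one higher.
    if (i < s.length && s.charCodeAt(i) == r2.charCodeAt(0) + 1){
      r2 = s.charAt(i);
      continue;
    }
    // Run ended: emit "X-Y" for a range, or "X" for a single character.
    if (r1 != r2){
      a.push(r1 + "-" + r2);
    }
    else {
      a.push(r1);
    }
    r1 = r2 = s.charAt(i);
  }
  return a.join("");
}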

Ariel
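For anyone wondering how a shortened string is meant to be used: it goes inside square brackets to form a character class. A rough sketch, assuming a short hand-written range string in place of the full list of capitals (the `shortened` value and the sample text are hypothetical):

```javascript
// Hypothetical shortened string: A-Z, À-Ö and Ø-Þ (skips the multiplication sign U+00D7).
var shortened = "A-Z\u00C0-\u00D6\u00D8-\u00DE";

// Wrap it in [] to build the regular expression.
var upperRe = new RegExp("[" + shortened + "]", "g");

// Hypothetical sample text: "Été À Paris, Ørsted"
var sample = "\u00C9t\u00E9 \u00C0 Paris, \u00D8rsted";
var found = sample.match(upperRe);
// found is ["É", "À", "P", "Ø"]
```

Note that characters with a special meaning inside a character class (such as `]`, `\`, `^` or `-`) would need escaping first; that should not matter for a string made up only of capital letters.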


I've updated the link to include the shortened version as well using the above function:

http://www.id-extras.com/uploads/AllUnicodeCapitals.html

Ariel

2 replies

pixxxelschubser
Community Expert
June 8, 2016

With GREP you can use

\p{uppercase_letter}

or the short form

\p{Lu}

But it doesn't seem to work with JavaScript.
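For reference, JavaScript engines that implement ES2018 or later do accept this property escape, provided the regular expression carries the `u` flag. This is only a sketch for such engines (current browsers, Node.js); it is an assumption that your target is one of them, and it does not apply to InDesign's ExtendScript, which is based on a much older version of the language. The sample text is made up.

```javascript
// \p{Lu} is only recognised together with the "u" (unicode) flag.
var upperRe = /\p{Lu}/gu;

// Hypothetical sample text: "Ärger in Łódź and Ωmega"
var sample = "\u00C4rger in \u0141\u00F3d\u017A and \u03A9mega";
var capitals = sample.match(upperRe);
// capitals is ["Ä", "Ł", "Ω"]
```

Without the `u` flag, `\p` is not treated as a property escape, so the pattern silently fails to match capitals.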

Loic.Aigon
Legend
June 8, 2016

Hi guys,

Thanks a lot @Ariel. I found another solution in the meantime, but I can't say whether your pattern is broader or narrower than this one from Stack Overflow:

javascript - JS : Test if string contains any unicode capital - Stack Overflow

TᴀW
Legend
June 8, 2016
I can't say either, but my string includes *everything* that is matched by the InDesign POSIX expression [[:upper:]].
Ariel
TᴀW
Legend
June 8, 2016

Perhaps:

http://www.id-extras.com/uploads/AllUnicodeCapitals.html

:-)

Ariel

PS The editor couldn't take the long string, hence the URL!