Participant
September 27, 2011
Answered

How to specify a unicode range in a RegExp?

  • 2 replies
  • 2114 views

The problem is: check whether a single word belongs to a given language's character set. I think I have to instantiate a RegExp with the expression [\u4E00-\u9FFF] (for Chinese) and test the word against it, but it does not work. Is this a bug with Unicode ranges?
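For reference, the check described in the question can be sketched as follows. This is written in TypeScript, not ActionScript 3, so it only illustrates what the range is expected to do in an ECMAScript-style regex engine; it says nothing about how Flash Player behaves. The names `CJK_UNIFIED` and `isChineseWord` are hypothetical, and the anchors `^...$` are an assumption about what "the word belongs to the character set" means (every character must be in range).

```typescript
// Hypothetical helper, not from the thread: true when every character
// of `word` lies in the range U+4E00..U+9FFF used in the question.
const CJK_UNIFIED: RegExp = /^[\u4E00-\u9FFF]+$/;

function isChineseWord(word: string): boolean {
  // The `+` quantifier means an empty string is rejected.
  return CJK_UNIFIED.test(word);
}
```

Without the anchors, the expression would accept any word that contains at least one character from the range.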

This topic has been closed for replies.

Participating Frequently
September 28, 2011

To use \ you must escape it; otherwise the \ is not understood: \\u9FF

Andrei1-bKoviI (Correct answer)
Inspiring
September 28, 2011

"to use \ you must escape it otherwise the \ is not understood.  \\u9FF"

According to the documentation, there is no need to escape it. Also, Unicode characters work fine without escaping. The problem is the range used in character classes [ ].

For instance, this will work:

/[\u4EAA|\u4EAB]/

and this

/[\u4EAA]/

but the range does not work the way it does for regular characters.
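Two side notes on the expressions above, sketched in TypeScript (an ECMAScript-family stand-in, not ActionScript 3). First, in ECMAScript-style regex syntax a `|` inside `[ ]` is a literal pipe, not alternation, so `/[\u4EAA|\u4EAB]/` also matches the character `|`. Second, if the range form is unreliable in a given runtime, the same test can be done without a regex by comparing code units against the range bounds. The function `inCodeUnitRange` is a hypothetical name; ActionScript 3's `String` has a `charCodeAt` method as well, but this sketch has not been run there.

```typescript
// Inside [...] the pipe is an ordinary character, not alternation.
const withPipe: RegExp = /[\u4EAA|\u4EAB]/;
const withoutPipe: RegExp = /[\u4EAA\u4EAB]/;

// Hypothetical regex-free fallback: true when every UTF-16 code unit
// of `word` falls between `low` and `high` inclusive.
function inCodeUnitRange(word: string, low: number, high: number): boolean {
  if (word.length === 0) {
    return false; // an empty word belongs to no character set
  }
  for (let i = 0; i < word.length; i++) {
    const unit = word.charCodeAt(i);
    if (unit < low || unit > high) {
      return false;
    }
  }
  return true;
}

// Usage with the range from the question:
const isChinese = inCodeUnitRange("\u4E2D\u6587", 0x4e00, 0x9fff);
```

Because the comparison works on UTF-16 code units, it is only suitable for ranges inside the Basic Multilingual Plane, which U+4E00..U+9FFF is.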

Participating Frequently
September 28, 2011

Then I stand corrected. Thanks for the update.

 

Ben Smith

ActionScript Technologist

@fezec | Member of Adobe's Community Professionals

Community Expert
September 27, 2011

I'd say yes, it is a bug. It has been logged in Adobe's bug system too...

--

Kenneth Kawamoto

http://www.materiaprima.co.uk/