Regular expression

New Here, Aug 14, 2008

Why do the two calls below return different results? I am searching for the string "JA" in any letter case ("JA", "Ja", "jA", etc.), but a match is found when the input string contains the letter "x". The first input string contains "x" and the second does not. It looks as if the letter "x" is being treated as a wildcard match.

<cfset IndexOfOccurrence=REFind("[J|j;|&##x4A;|&##x6A;][A|a;|&##x41;|&##x61;]", "mamjsdjzdahggjxac;")>
<cfset IndexOfOccurrence3=REFind("[J|j;|&##x4A;|&##x6A;][A|a;|&##x41;|&##x61;]", "mamjsdjzdahggjqac;")>

I expected the same output from both calls. What should I change to fix this?

Thanks


LEGEND, Aug 14, 2008
When you put chars within square brackets, the regex will search for ANY ONE
of the characters listed (ie: there's an implicit OR between each character).
So for one thing, you don't need to have the pipe characters in there, as the
OR is implied (inside brackets a pipe is just another literal character to
match). Also, regexes don't understand HTML codes, so yours doesn't see
&#x4A; as a single code, it sees a sequence of six possible characters to
match.

So seen in that light, your regex really isn't matching what you think it
is.

Your regex is matching one of any of these:

Jj;&#x46A

Followed by any one of these:

Aa;&#x416

This would be better written:
[Jj;&#x46A][Aa;&#x416]

However this is not what you want, clearly.

Your regex matches the "jx" towards the end of your first string, and
there's no match in the second string.

Hence the different results.
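
To check this against the original pattern, here is a minimal sketch that asks REFind for the matched substring as well as its position. The variable names are illustrative, and the expected result in the comment assumes REFind's default case-sensitive matching.

```cfml
<!--- "##" is CFML's escape for a literal "#" inside a string. --->
<cfset pattern = "[J|j;|&##x4A;|&##x6A;][A|a;|&##x41;|&##x61;]">
<cfset subject = "mamjsdjzdahggjxac;">

<!--- Passing true as the fourth argument returns a struct of pos and len arrays. --->
<cfset m = REFind(pattern, subject, 1, true)>

<cfif m.pos[1] GT 0>
    <!--- Expected here: "jx" at position 14, because "x" is in the second bracket set. --->
    <cfoutput>Matched "#Mid(subject, m.pos[1], m.len[1])#" at position #m.pos[1]#</cfoutput>
<cfelse>
    <cfoutput>No match</cfoutput>
</cfif>
```

Swapping the "x" in the subject for "q" should take the no-match branch, since "q" is in neither bracket set.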

--
Adam
New Here, Aug 14, 2008
Thanks, Adam, and thanks for catching the letter "x".

I want to treat &#x416; as a single character. Basically, I am converting VBScript to CFML. In VBScript we use RegExp for pattern matching, and I am having difficulty converting this pattern.
LEGEND, Aug 14, 2008
> &#x416;

That is an HTML entity which will render as some sort of glyph on the
screen. Regexes work in characters and strings, not "things on the
screen". You probably want to get the character code of the glyph, and use
*that* in your regex. Read the CF regex docs for starters:

http://livedocs.adobe.com/coldfusion/8/htmldocs/regexp_01.html

And especially the bit about expressing non-standard characters via numeric
codes:

http://livedocs.adobe.com/coldfusion/8/regexp_08.html
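
If the entities appear literally in the text being searched, another option is to keep them as text and group the alternatives with parentheses instead of square brackets, so that each entity is matched as one whole unit. The sketch below assumes the goal is "J" followed by "A", each written as an upper- or lower-case letter or as its hex entity; the pattern and variable names are illustrative and are not taken from the docs linked above.

```cfml
<!--- Assumption: match "J" then "A", each in either case or as a hex HTML entity. --->
<!--- Parentheses group whole alternatives, so "&#x4A;" counts as one six-character unit. --->
<!--- "##" is CFML's escape for a literal "#" inside a string. --->
<cfset pattern = "(J|j|&##x4A;|&##x6A;)(A|a|&##x41;|&##x61;)">

<!--- Neither original test string contains "ja" in any form, so both should return 0. --->
<cfset posWithX = REFind(pattern, "mamjsdjzdahggjxac;")>
<cfset posWithQ = REFind(pattern, "mamjsdjzdahggjqac;")>

<!--- A hypothetical string containing "ja" written as two entities; expected position is 7. --->
<cfset posEntity = REFind(pattern, "mamjsd&##x6A;&##x41;hgg")>

<cfoutput>#posWithX# #posWithQ# #posEntity#</cfoutput>
```

REFind is case-sensitive, so an entity written with lower-case hex digits (for example &#x6a;) would not match this pattern; REFindNoCase is one way to allow for that.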

--
Adam