
Regular expressions unpredictable and weird

Contributor, Apr 02, 2016

I am really struggling with running regular expressions in Adobe scripts. They don't seem to work as they should. I constantly have to break complex expressions down into simple ones to track down what appear to be bugs in the regex handling.

For example:

var pattern = /(A(.+)C)*/g;
var string = "ABBC";
$.writeln(pattern.exec(string));

I should get the result:

"ABBC",

"ABBC",

"BB"

but instead I get:

undefined,

undefined,

"BBC"

What is going on here?

TOPICS
Scripting

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines
Community Expert, Apr 02, 2016

It seems to be wrong syntax.

Try one of these instead:

var pattern = /(A(B+)C)*/g;  // or

var pattern = /(A([^C]+)C)*/g;   // or

var pattern = /(A(.+)C)*$/g;

The reason is:

In your own regex, (.+) always consumes the "C" as well, so the "C" is already part of that group.
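
For comparison, here is a minimal sketch of what a standards-conforming JavaScript engine (a browser or Node, not ExtendScript) returns for the original pattern and the three alternatives. The names `input`, `sources` and `results` are only illustrative, and the "g" flag is left out on purpose so that exec() always starts at index 0. ExtendScript may well return something different for the same input.

```javascript
// Reference behaviour in a standards-conforming engine (browser or Node).
// Assumption: this is NOT run in ExtendScript, which may differ.
var input = "ABBC";
var sources = ["(A(.+)C)*", "(A(B+)C)*", "(A([^C]+)C)*", "(A(.+)C)*$"];
var results = [];

for (var i = 0; i < sources.length; i++) {
    // Build a fresh RegExp without the "g" flag for each source string.
    var m = new RegExp(sources[i]).exec(input);
    // Join whole match, group 1 and group 2 into one comparable string.
    results.push(m[0] + "," + m[1] + "," + m[2]);
}
// In a conforming engine every entry of results is "ABBC,ABBC,BB":
// the greedy (.+) first grabs "BBC", then backtracks to "BB" so that C can match.
```

To see where ExtendScript diverges, the same loop could print each entry with $.writeln(results[i]).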

Have fun

LEGEND, Apr 03, 2016

Why not this?

var pattern = /(A(.+)C)/g;

Even so, I don't really understand what you want to do!

Guide, Apr 04, 2016

Hi McShaman,

Don't expect ExtendScript to behave like regular JavaScript when it comes to regular expressions. There are several bugs in that area, in particular in CS6 and later, with greedy quantifiers such as +, * or {m,n} used in conjunction with sub-patterns. More details here: Indiscripts :: InDesign Scripting Forum Roundup #7
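
One way to stay clear of that combination might be to drop the sub-pattern altogether and do the trimming with plain string methods. The following is only a sketch: `collectInner` is a hypothetical helper, the sample input is made up, and it assumes standard regex semantics. Whether it actually avoids the ExtendScript bugs is untested.

```javascript
// Hypothetical helper (not from this thread): collect the text between
// each "A" and the next "C" without any capturing group.
function collectInner(str) {
    var re = /A[^C]+C/g;   // global flag so exec() walks through the string
    var found = [];
    var m;
    while ((m = re.exec(str)) !== null) {
        // Strip the leading "A" and trailing "C" with string methods
        // instead of relying on a sub-pattern.
        found.push(m[0].substring(1, m[0].length - 1));
    }
    return found;
}

var inner = collectInner("ABBCAXYC");
// In a conforming engine, inner holds "BB" and "XY".
```

Each match is at least three characters long, so the loop cannot get stuck on an empty match.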

@+

Marc

Contributor, Apr 04, 2016

Thanks... Yeah, I thought so. I've been finding a lot of inconsistencies between ExtendScript regex and how my browser behaves.

When you say ExtendScript has problems, does that mean that if I run the script from within the application (e.g. InDesign) it should work?

Community Expert, Apr 04, 2016

McShaman,

Please read every post again, attentively.

And then try this:

var str = "ABBC";
// Matches the string form of a regex whose source ends in "*" or "*$".
var iOarr = /\*\$?\//;
var x = null;
var pattern = [ "(A(.*)C)*", "(A(B+)C)*", "(A([^C]+)C)*", "(A(.+)C)*$", "(A(.*)C)" ];

for (var i = 0; i < pattern.length; i++) {
    // Build a fresh RegExp from each source string.
    var pat = new RegExp(pattern[i]);

    if (pat.toString().match(iOarr) == null) {
        x = "without asterisk";
    } else {
        x = "with asterisk";
    }

    // Print the label, the pattern and the exec() result for comparison.
    $.writeln(x + ", " + pat + ", " + pat.exec(str));
}

Contributor, Apr 06, 2016

Yeah, I appreciate the alternatives. The example I provided is not one that I am using directly in my code. It is a simplified example for the forum question, which was about the apparent differences in regex results between ExtendScript and a browser or Node.

But thank you for your advice anyway. Some good alternatives there that I will keep in my pocket.

Community Expert, Apr 04, 2016

The star in your pattern is causing the problems here.

Remember, '*' means zero or more. GREP is usually greedy, so a '*' should do no harm when used this way. (Indeed, regex101.com does the correct thing.) But it seems something is broken in InDesign.

That said, I actually see no reason to have that * in your expression! Since it's applied to the entire group you are searching for, it essentially means zero or more of the entire group. That may be why the first items it finds are empty (wrongly, though).

When I remove the * at the end I get what you expect:

ABBC (entire match)
ABBC (group #1)
BB (group #2)
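
The "zero or more" reading can be checked in a standards-conforming engine with a small sketch like this one. The variable names are illustrative, and ExtendScript itself may behave differently, as this thread shows. With the star, the pattern is allowed to match nothing at all; without it, it either matches a whole A...C run or fails.

```javascript
// Assumption: run in a standards-conforming engine (browser or Node).
var starred = /(A(.+)C)*/;
var plain = /(A(.+)C)/;

// With the star, zero repetitions are allowed, so the pattern
// matches the empty string at index 0 and both groups stay undefined.
var emptyMatch = starred.exec("XYZ");

// Without the star, the same input gives no match at all (null).
var noMatch = plain.exec("XYZ");

// Without the star, "ABBC" gives the whole match plus both groups.
var fullMatch = plain.exec("ABBC");
// fullMatch[0] is "ABBC", fullMatch[1] is "ABBC", fullMatch[2] is "BB"
```

An empty whole match with undefined groups is therefore legitimate when the input contains no A...C run, but not for "ABBC", where a greedy engine should take one full repetition.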
