I am really struggling with running regular expressions in Adobe scripts. They don't seem to work as they should. I constantly have to break complex expressions down into simple ones to track down what appear to be bugs in the regex engine.
For example:
var pattern = /(A(.+)C)*/g;
var string = "ABBC";
$.writeln(pattern.exec(string));
I should get the result:
"ABBC",
"ABBC",
"BB"
but instead I get:
undefined,
undefined,
"BBC"
What is going on here?
It seems to be wrong syntax.
Try one of these instead:
var pattern = /(A(B+)C)*/g; // or
var pattern = /(A([^C]+)C)*/g; // or
var pattern = /(A(.+)C)*$/g;
The reason is:
In your own regex, (.+) is greedy, so it will always match the "C" as well; the "C" is already part of that group.
Have fun
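For comparison, here is a minimal sketch of how the original pattern and the three alternatives above behave in a standards-conforming engine (a browser or Node), where backtracking lets (.+) hand the final "C" back. The variable names are only illustrative, and the results noted in the comments apply to such an engine, not necessarily to ExtendScript.

```javascript
// Assumption: run in a standards-conforming engine (browser or Node).
// In ExtendScript you would print with $.writeln instead of console.log.
var str = "ABBC";
var patterns = [
    /(A(.+)C)*/,      // original pattern from the question
    /(A(B+)C)*/,      // alternative 1
    /(A([^C]+)C)*/,   // alternative 2
    /(A(.+)C)*$/      // alternative 3
];

var results = [];
for (var i = 0; i < patterns.length; i++) {
    var m = patterns[i].exec(str);
    // Join the full match and both groups into one comparable string.
    results.push(m ? m.join(",") : "null");
    console.log(patterns[i] + " -> " + results[i]);
}
// In a conforming engine every line reports: ABBC,ABBC,BB
```

If ExtendScript prints something different for any of these, that difference points at the engine rather than at the pattern.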
Why not?
var pattern = /(A(.+)C)/g
even if I don't really understand what you want to do!
Hi McShaman,
Don't expect ExtendScript to behave as regular JavaScript with regular expressions. There are several bugs in that field and, in particular, in CS6 and later when it comes to greedy quantifiers like +, * or {m,n} used in conjunction with sub-patterns. More details here: Indiscripts :: InDesign Scripting Forum Roundup #7
@+
Marc
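One way to cope with engine differences like this is to test for the behavior at runtime instead of assuming it. The sketch below is only an illustration: hasQuantifiedGroupBug is a hypothetical helper (not part of ExtendScript or any Adobe API), and it uses the exact pattern and expected result from the original question as its probe.

```javascript
// Hypothetical helper: returns true when the engine does NOT produce the
// standard result for a quantified group containing a greedy sub-pattern.
function hasQuantifiedGroupBug() {
    var m = /(A(.+)C)*/.exec("ABBC");
    return !(m && m[0] === "ABBC" && m[1] === "ABBC" && m[2] === "BB");
}

// Illustrative fallback only: dropping the outer "*" is NOT equivalent in
// general (it no longer allows repetition or an empty match), but it is
// enough for a single occurrence such as "ABBC".
var pattern = hasQuantifiedGroupBug() ? /(A(.+)C)/ : /(A(.+)C)*/;
var match = pattern.exec("ABBC");
```

Whether such a fallback is acceptable depends on what the real pattern needs to do, so treat it as a starting point rather than a fix.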
Thanks... Yeah, I thought so. I've been finding a lot of inconsistencies between ExtendScript regex and how my browser behaves.
When you say ExtendScript has problems... Does that mean that if I run the script from within the application (e.g. InDesign) it should work?
McShaman,
please read every post again - attentively.
And then try this:
var str = "ABBC";
// Detects a pattern whose source ends in "*" or "*$" (right before the closing "/").
var iOarr = /\*\$?\//;
var patterns = [ "(A(.*)C)*", "(A(B+)C)*", "(A([^C]+)C)*", "(A(.+)C)*$", "(A(.*)C)" ];
for (var i = 0; i < patterns.length; i++) {
    var pat = new RegExp(patterns[i]);
    var x = (pat.toString().match(iOarr) == null) ? "without asterisk" : "with asterisk";
    // Print the label, the pattern, and the exec() result (full match, then groups).
    $.writeln(x + ", " + pat + ", " + pat.exec(str));
}
Yeah, I appreciate the alternatives. The example I provided is not one that I am using directly in my code. It is a simplified example for the forum question, which was about the apparent differences in regex results between ExtendScript and a browser or Node.
But thank you for your advice anyway. There are some good alternatives there that I will keep in my pocket.
The star in your pattern is causing the problems here.
Remember, '*' means zero or more. Now usually GREP is greedy, and a '*' should do 'no harm' when used this way. (Indeed, regex101.com does the correct thing.) But it seems something is broken in InDesign.
That said... I actually see no reason to have that * in your expression! Since it's applied to the entire group that you are searching for, it essentially means zero or more of the entire group. That may be why (wrongly, though) the first items it finds are empty.
When I remove the * at the end I get what you expect:
ABBC (entire match)
ABBC (group #1)
BB (group #2)
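To illustrate the "zero or more of the entire group" point, here is a small sketch for a standards-conforming engine (browser or Node). The input "XABBC" is my own made-up example, chosen so that the group cannot match at the first position; ExtendScript may or may not give the same results.

```javascript
// Assumption: standards-conforming engine; "XABBC" is an illustrative input.
var withStar = /(A(.+)C)*/;
var withoutStar = /(A(.+)C)/;

// With the star, zero repetitions are allowed, so the match succeeds
// immediately at index 0 with an empty string and both groups unset.
var a = withStar.exec("XABBC");
console.log(a.index, JSON.stringify(a[0]), a[1], a[2]);
// 0 "" undefined undefined

// Without the star, the engine has to keep scanning until the group matches.
var b = withoutStar.exec("XABBC");
console.log(b.index, b[0], b[1], b[2]);
// 1 ABBC ABBC BB
```

So empty or undefined groups are legitimate when the group cannot match at the current position, but for the plain input "ABBC" a conforming engine still matches the group once, which is why the result in the original question looks like a bug.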