When multiple parentheses are nested, are the groups also nested?
I expected this regular expression to have only $1 and $2, but when I tried it, $3 exists too. Shouldn't $3 be empty, or is this an error?
Sample file, I want to search all the numbers
1. Main events
1.2 Time of occurrence
March 25th
1.2.3. Operational steps
Replace with: $3
I found that $3 actually exists?
Hi @dublove:
You have three groups. Your first group is the position locator: beginning of paragraph. Not sure why you are grouping it. InDesign will add a space for $1 if you leave it grouped. And with nested groups, the enclosing group is counted first and the nested one second.
I want to search all the numbers
To simply match the numbers at the beginning of a paragraph so you can replace them with something else, here is one example. I suspect there are more efficient queries, but I seem to be the only one here for the moment.
^(\d+)\.(\d+)?\.?(\d+)?\.?
~Barb
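For anyone who wants to see what each group in that pattern picks up, here is a minimal sketch using Python's `re` module as a stand-in. That is an assumption on my part: Python is not InDesign's GREP engine, so treat it as an approximation. The helper name `leading_number_groups` is made up for illustration, and the sample lines are the ones from the question.

```python
import re

# The pattern from the post above, unchanged.
pattern = re.compile(r"^(\d+)\.(\d+)?\.?(\d+)?\.?")

samples = [
    "1. Main events",
    "1.2 Time of occurrence",
    "March 25th",
    "1.2.3. Operational steps",
]

def leading_number_groups(line):
    """Return the three captured number levels, or None if the line has no leading number."""
    m = pattern.match(line)
    return m.groups() if m else None

for line in samples:
    print(repr(line), "->", leading_number_groups(line))
```

In Python, an optional group that did not take part in the match is reported as `None`.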
Is yours more efficient than mine?
I think mine is faster.
The (\d+\...) part is just a repetition.
There's no need to write so much.
What I'm asking is why there's a $3?
@dublove There is $3 because—as @Barb Binder said—your grep has three capture groups. Your string opens parentheses 3 times. You can "turn off" a capture group by starting it with (?: which creates a non-capture group.
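A quick way to see the effect of `(?:` is to count the groups before and after. This is a rough sketch in Python's `re` module, which I am assuming behaves like InDesign's GREP for this purpose; it uses the expression `(^)((\d+\.)+)` discussed in this thread, and the variable names are only illustrative.

```python
import re

# Three opening parentheses, so three capture groups.
capturing = re.compile(r"(^)((\d+\.)+)")

# Same expression with the first and the innermost group "turned off" via (?: ... ).
non_capturing = re.compile(r"(?:^)((?:\d+\.)+)")

print(capturing.groups)      # number of capture groups in the first pattern
print(non_capturing.groups)  # number of capture groups in the second pattern

line = "1.2.3. Operational steps"
only_group = non_capturing.match(line).groups()
print(only_group)            # the single remaining group holds the whole number
```

Both patterns match the same text; only the numbering of what can be reused in the replacement changes.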
This explanation makes more sense to me.
I. Capture Groups
Capture groups can be numbered by counting their opening brackets from left to right. For example, in the expression (A)(B(C)), there are four such groups:
Okay, good. That seems like the same explanation to me, but I'm glad you got it. For anyone wondering why there are 4 capture groups when there are only 3 parentheses opened, it is because there is always an "everything" capture group—$0.
P.S. A note on technical language: in everyday English we say "brackets" for ( ) , but in technical discussions we say parentheses for ( ) , brackets for [ ] , braces for { }.
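To make the (A)(B(C)) numbering concrete, here is a small sketch in Python's `re` module (an assumption: it stands in for InDesign's GREP, where the same groups would be $0 through $3). The test string "ABC" and the name `numbered` are just for illustration.

```python
import re

m = re.match(r"(A)(B(C))", "ABC")

# Group 0 is the whole match; groups 1..3 follow the opening parentheses left to right.
numbered = [m.group(n) for n in range(m.re.groups + 1)]

for n, text in enumerate(numbered):
    print(f"group {n}: {text!r}")
```

The enclosing group (B(C)) opens before the nested (C), so it gets the lower number.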
Can you explain what you are trying to do here?
Your expression can be simplified to ^(\d+\.)+ so I'm not clear what you are looking to change.
Just have questions and don't do anything.
Peter showed you a nice grep pattern to use. It will find any number of combinations of numerals+period starting at the beginning of the line. In your example it will match
1.
1.2
1.2.3.
Did you try it?
Actually, it will not match 1.2 (and neither does @dublove's original version), only the 1. portion.
I just removed the parentheses that were not required in the original query.
True, but honestly I assumed that was dublove's typo. I mean, why would one numbering system have the period and another not? I guess it could be for a manual search across messy numbered text.
Anyway, to match exactly what dublove asked for, the period should be optional:
^(\d+\.?)+
@dublove does this work for all your cases?
That's one of the reasons I asked about the intent of the search, along with why you would want to use $3 in the Change to field here.
Well, it's actually catching the 1. in the search too. The search is convoluted and takes more time than just finding what you need to find.
The find-and-replace GREP with $3 as the replacement does a bit of unnecessary matching and is presumably more time consuming.
My impression from what I think was @dublove's response to my initial question ("Just have questions and don't do anything.") is that the intent here is just to understand how to find any dotted number sequences at the start of a paragraph, not to do any sort of replacement of those numbers. It is an exploration of technique. One use case would be to apply a character style.
As noted, the expression, both original and simplified, finds only the portions of number sequences that end in the dot. To find those that have a final digit or digits but no terminating dot, along with those that do, the expression should be modified to ^(\d+\.?)+
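The difference between the two expressions can be checked side by side. This is a minimal sketch in Python's `re` module, assumed here to behave like InDesign's GREP for these simple patterns; `matched_text` is a hypothetical helper, and the sample lines come from the original question.

```python
import re

dot_required = re.compile(r"^(\d+\.)+")   # every number must be followed by a dot
dot_optional = re.compile(r"^(\d+\.?)+")  # the dot after each number is optional

def matched_text(pattern, line):
    """Return the text matched at the start of the line, or None if nothing matches."""
    m = pattern.match(line)
    return m.group(0) if m else None

for line in ["1. Main events", "1.2 Time of occurrence",
             "March 25th", "1.2.3. Operational steps"]:
    print(repr(line), matched_text(dot_required, line), matched_text(dot_optional, line))
```

One side effect to keep in mind: with the dot optional, a paragraph that merely starts with a number (say, "25 March") also matches.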
It's just a query and doesn't need to address the actual issue.
You should seriously consider getting a copy of @Peter Kahrel's guide to GREP in InDesign:
https://www.amazon.com/GREP-InDesign-InDesignSecrets-Peter-Kahrel/dp/0982508387
That's a great idea, thanks for the recommendation.
Because you have three sets of parentheses and therefore three groups.
And Peter's query is so much more efficient. There's a reason I normally sit out the GREP query questions. LOL
~Barb
You're right to question how many capturing groups are created. In regular expressions, every pair of plain parentheses ( ) creates a capturing group (unless it starts with (?: as mentioned earlier), and the groups are numbered by the order of their opening parentheses, regardless of nesting.
Let's break it down:
(^)((\d+\.)+)
(^) → Group 1: matches the start of the line (not really capturing anything useful)
((\d+\.)+) → Group 2: matches one or more digit-dot sequences, like 1.2.3. or 1.
Inside that:
(\d+\.) → Group 3: matches just one digit-dot, like 1. or 2.
So yes, $3 does exist, and it holds the last match of the inner (\d+\.) pattern.
That's why, when you do a find and replace using $3, the match is replaced with the final \d+\. part inside the repeated group. Even if the group repeats several times (as in 1.2.3.), $3 keeps only the last repetition, which here is the text 3. (digit plus dot).
So in your test case:
1.2.3. Operational steps
$1 = start of line (empty string)
$2 = 1.2.3.
$3 = 3.
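The same breakdown can be reproduced outside InDesign. Here is a minimal sketch in Python's `re` module, which I am assuming treats repeated groups the same way InDesign's GREP does (the last repetition wins); Python writes the backreference as `\3` where InDesign uses $3, and the variable names are illustrative only.

```python
import re

pattern = re.compile(r"(^)((\d+\.)+)")
line = "1.2.3. Operational steps"

m = pattern.match(line)
groups = m.groups()                  # contents of groups 1, 2 and 3
replaced = pattern.sub(r"\3", line)  # replace the whole match with group 3

print(groups)
print(replaced)
```

Group 1 comes back as an empty string because ^ matches a position, not a character.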
It's a bit messy and slow: messy because of the unnecessary matches, and slow because each match then has to be replaced with the contents of $3.
We can use what we were talking about recently - when you were unsure about the difference between
\> and \b
Here's a perfect case to find these things - and you don't need a Change To
I went one step further - but if it's not what you need we can cut it back but it's just an example and proof of concept.
^(\d\.)+(\d\.)\b
To highlight what's being found, I put the matches in pink for show and tell, with nothing in Change To, and hit Change All.
Again - it's just a demonstration.
It's too deep.
Regular expressions are a quagmire; it's good to be able to use them, but not to go too deep into them.
Understanding GREP lets you build better, more responsive GREP queries.
Your GREP has things in it that are not required:
(^)((\d+\.)+)
change to
$3
Could be simplified to
^((\d+\.)+)
$2
for example
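As a rough check that the two versions give the same result, here is a sketch in Python's `re` module (an assumption: it stands in for InDesign's GREP, with `\3` and `\2` playing the role of $3 and $2). Dropping the parentheses around ^ removes one group, so the inner group moves from number 3 to number 2.

```python
import re

line = "1.2.3. Operational steps"

# Original query: three groups, replacement uses the third.
original = re.sub(r"(^)((\d+\.)+)", r"\3", line)

# Simplified query: two groups, so the same inner group is now the second.
simplified = re.sub(r"^((\d+\.)+)", r"\2", line)

print(original)
print(simplified)
```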