(^)((\d+\.)+) Isn't there one bracket for one group, and does $3 really exist?

Informe · May 24, 2025

When multiple parentheses are nested, are the groups also nested?
I expected this regular expression to have only $1 and $2, but when I tried it, $3 exists too. Is $3 empty, or is it an error?

Sample file, I want to search all the numbers

1. Main events
1.2 Time of occurrence
March 25th
1.2.3. Operational steps

Replace with :$3
Found that $3 even exists?

Informe · May 24, 2025

Hi @dublove:

You have three groups. Your first group is the position locator: beginning of paragraph. Not sure why you are grouping it. InDesign will add a space for $1 if you leave it grouped. And with nested groups, the encompassing group is counted first and the nested one second.

I want to search all the numbers

To simply match the numbers at the beginning of a paragraph and replace them with something else, here is one example. I suspect there are more efficient queries, but I seem to be the only one here for the moment.

^(\d+)\.(\d+)?\.?(\d+)?\.?

~Barb

Informe · May 24, 2025

Is yours more efficient than mine?
I think mine is faster.
It's just that (\d+\...) It's just a repetition.
There's no need to write so much.

What I'm asking is why there's a $3?

Informe · May 24, 2025

@dublove There is $3 because—as @Barb Binder said—your grep has three capture groups. Your string opens parentheses 3 times. You can "turn off" a capture group by starting it with (?: which creates a non-capture group.
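A quick way to check the group count outside InDesign is sketched below. It uses Python's re module as a stand-in for InDesign's GREP engine, which is an assumption: the two engines are not identical, but for a pattern this simple the counting rule (one group per capturing opening parenthesis) should be the same. The variable names are only for illustration.

```python
import re

# Assumption: Python's re behaves like InDesign GREP for these simple patterns.
original = re.compile(r"(^)((\d+\.)+)")           # three capturing groups
non_capturing = re.compile(r"(?:^)((?:\d+\.)+)")  # only the outer group captures

print(original.groups)       # 3
print(non_capturing.groups)  # 1

# With the (?: ... ) groups, the whole number run is now group 1.
m = non_capturing.match("1.2.3. Operational steps")
print(m.group(1))            # 1.2.3.
```

Note that Python writes backreferences in a replacement as \1, \2 and so on, where InDesign's Change to field uses $1, $2.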

Informe · May 24, 2025

This explanation makes more sense to me.

I. Capture Groups
Capture groups can be numbered by counting their opening brackets from left to right. For example, in the expression (A)(B(C)), there are four such groups:

Informe · May 24, 2025

Okay, good. That seems like the same explanation to me, but I'm glad you got it. For anyone wondering why there are 4 capture groups when there are only 3 parentheses opened, it is because there is always an "everything" capture group—$0.
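For anyone who wants to see the numbering spelled out, here is a small sketch in Python. It assumes Python's re module numbers groups the same way InDesign does (by opening parenthesis, left to right, with 0 for the whole match); the test string "ABC" is just an illustration.

```python
import re

# Assumption: group numbering in Python's re mirrors InDesign GREP here.
m = re.match(r"(A)(B(C))", "ABC")

# Group 0 is the whole match; 1..3 follow the opening parentheses left to right.
for n in range(m.re.groups + 1):
    print(f"${n} = {m.group(n)!r}")

# $0 = 'ABC'
# $1 = 'A'
# $2 = 'BC'
# $3 = 'C'
```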

P.S. A note on technical language: in everyday English we say "brackets" for ( ), but in technical discussions we say parentheses for ( ), brackets for [ ], and braces for { }.

Informe · May 24, 2025

Can you explain what you are trying to do here?

Your expression can be simplified to ^(\d+\.)+ so I'm not clear what you are looking to change.

Informe · May 24, 2025

Just have questions and don't do anything.

Informe · May 24, 2025

Peter showed you a nice grep pattern to use. It will find any number of combinations of numerals+period starting at the beginning of the line. In your example it will match

1.
1.2

1.2.3.

Did you try it?

Informe · May 25, 2025

Actually, it will not match 1.2 (and neither does @dublove 's original version), only the 1. portion.
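This can be checked against the sample lines from the first post. The sketch below uses Python's re module in place of InDesign's GREP engine (an assumption; the engines differ in places, though not for these constructs as far as I know), and matched_text is a hypothetical helper written only for this check.

```python
import re

# Sample paragraphs from the original question.
sample = [
    "1. Main events",
    "1.2 Time of occurrence",
    "March 25th",
    "1.2.3. Operational steps",
]

original = re.compile(r"(^)((\d+\.)+)")
simplified = re.compile(r"^(\d+\.)+")

def matched_text(pattern, line):
    """Return the text matched at the start of the line, or None."""
    m = pattern.match(line)
    return m.group(0) if m else None

for line in sample:
    # Both patterns match the same text; only the group numbering differs.
    print(repr(line), matched_text(original, line), matched_text(simplified, line))

# '1. Main events' 1. 1.
# '1.2 Time of occurrence' 1. 1.
# 'March 25th' None None
# '1.2.3. Operational steps' 1.2.3. 1.2.3.
```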

I just removed the parentheses that were not required in the original query.

Informe · May 25, 2025

True, but honestly I assumed that was a typo by dublove. I mean, why would one numbering system have the period and another not? I guess to do a manual search across messy numbered text.

Anyway, to match exactly what dublove asked for, the period should be optional:

^(\d+\.?)+

@dublove does this work for all your cases?

Informe · May 25, 2025

That's one of the reasons I asked about the intent of the search, along with why you would want to use $3 in the Change to field here.

Informe · May 25, 2025

Well it's actually catching the 1. in the search too - the search is convoluted and takes more time than finding what you need to find.

The find-and-replace GREP with $3 in the Change to field does a bit of unnecessary matching and is presumably more time-consuming.

Informe · May 25, 2025

My impression from what I think was @dublove 's response to my initial question, "Just have questions and don't do anything." is that the intent here is just to understand how to find any dotted number sequences at the start of a paragraph, and not to do any sort of replacement of those numbers -- just an exploration of technique. One use case would be to apply a character style.

As noted, the expression, both original and simplified, finds only the portions of number sequences that end in the dot. To find those that have a final digit or digits but no terminating dot along with those that do the expression should be modified to ^(\d+\.?)+
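A rough check of the optional-dot version against the sample lines, again using Python's re module as a stand-in for InDesign GREP (an assumption). The helper name and the extra line "25 March" are hypothetical and not from the sample file; the extra line is there to show a side effect of making the dot optional.

```python
import re

pattern = re.compile(r"^(\d+\.?)+")

def matched_text(line):
    """Return the text matched at the start of the line, or None."""
    m = pattern.match(line)
    return m.group(0) if m else None

print(matched_text("1. Main events"))            # 1.
print(matched_text("1.2 Time of occurrence"))    # 1.2
print(matched_text("March 25th"))                # None
print(matched_text("1.2.3. Operational steps"))  # 1.2.3.

# Side effect: any paragraph that merely starts with digits now matches too.
print(matched_text("25 March"))                  # 25
```

So the optional dot catches 1.2, but it no longer distinguishes a section number from any other leading number, which may or may not matter for a given document.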

Informe · May 25, 2025

It's just a query and doesn't need to address the actual issue.

Informe · May 25, 2025

You should seriously consider getting a copy of @Peter Kahrel 's guide to GREP in InDesign

https://www.amazon.com/GREP-InDesign-InDesignSecrets-Peter-Kahrel/dp/0982508387

Informe · May 25, 2025

That's a great idea, thanks for the recommendation.

Informe · May 24, 2025

Because you have three sets of parentheses and therefore three groups.

And Peter's query is so much more efficient. There's a reason I normally sit out the GREP query questions. LOL

~Barb

Informe · May 24, 2025

You're right to question how many capturing groups are created. In regular expressions, every pair of parentheses () creates a capturing group, and they're numbered by the order of their opening parenthesis, not by nesting.

Let's break it down

(^)((\d+\.)+)

(^) Group 1: matches the start of the line (not really capturing anything useful)

((\d+\.)+) Group 2: matches one or more digit-dot sequences like 1.2.3. or 1.

Inside that:

(\d+\.) → Group 3: matches just one digit-dot, like 1. or 2.

So yes, $3 does exist, and it represents the last match of the inner (\d+\.) pattern.

That’s why, when you do a find and replace using $3, the match is replaced with the final \d+\. part inside the repeated group. Even if the group repeats several times (as in 1.2.3.), $3 holds only the last repetition, in this case the 3. at the end.

So in your test case:
1.2.3. Operational steps

$1 = start of line (empty string)

$2 = 1.2.3.

$3 = 3.
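The same breakdown can be reproduced in Python as a sanity check. This assumes Python's re module treats a repeated capture group the way InDesign does, keeping only the last repetition; the findall line is an extra illustration of how to see every repetition, not something from the original query.

```python
import re

m = re.match(r"(^)((\d+\.)+)", "1.2.3. Operational steps")

# groups() lists $1, $2, $3 in order.
print(m.groups())  # ('', '1.2.3.', '3.')

# A repeated group only remembers its last repetition. To see every
# digit-dot piece, search inside the text captured by group 2 instead.
pieces = re.findall(r"\d+\.", m.group(2))
print(pieces)      # ['1.', '2.', '3.']
```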

It's a bit messy and slow: the pattern makes matches that aren't needed, and it then has to replace what it found with $3.

We can use what we were talking about recently - when you were unsure about the difference between

\> and \b

Here's a perfect case to find these things - and you don't need a Change To

I went one step further - but if it's not what you need we can cut it back but it's just an example and proof of concept.

^(\d\.)+(\d\.)\b

To highlight what's being found - I put in pink for show and tell

With nothing in Change To and hit change all

Again - it's just a demonstration.

Informe · May 24, 2025

It's too deep.

Regular expressions are a quagmire, it's good to be able to use them, not to go deep into them.

Informe · May 24, 2025

Understanding GREPs lets you build better, more responsive GREPs.

Your GREP has things in it that are not required

(^)((\d+\.)+)

change to
$3

Could be simplified to

^((\d+\.)+)
$2

for example
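As a rough check that the two versions do the same thing, here is a sketch in Python. It assumes Python's re module behaves like InDesign GREP for these patterns, and it writes the replacement as \3 and \2 because that is Python's syntax for what InDesign writes as $3 and $2.

```python
import re

line = "1.2.3. Operational steps"

# Original: three groups, replace with the third.
with_extra_group = re.sub(r"(^)((\d+\.)+)", r"\3", line)

# Simplified: the (^) group is gone, so the inner group is now number 2.
simplified = re.sub(r"^((\d+\.)+)", r"\2", line)

print(with_extra_group)  # 3. Operational steps
print(simplified)        # 3. Operational steps
```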


3 correct answers