Bedazzled532
Inspiring
October 16, 2022
Answered

InDesign GREP code

  • 3 replies
  • 884 views

This is the information I have in hand:

==== Sample Data

Heading: This is heading one and contains info about blah blah. Heading 2: This is heading number two which talks about blah blah. Heading 3: Yet another heading. Heading 4: One more heading for so and so. Heading: Again one heading. Heading infinity: The is never ending heading.

==== Sample Data

 

This is the GREP code I'm using:

\w+(?:\s\w+)*\s?:

 

The code works fine; however, I don't understand it. I put it together with help from someone. Can anyone help explain it? It selects all the headings (whether a single word or multiple words), including the ":". What's confusing me is the non-capturing group with \w+ within the parentheses.

Any alternative code would also be appreciated.

Thanks

3 replies

Peter Kahrel
Community Expert
Correct answer
October 16, 2022

\w+ means match one or more word characters. Word characters are the digits 0-9, the underscore character, and letters.

(?:\s\w+)* matches zero or more instances of a whitespace character followed by one or more word characters. The parentheses group \s\w+ so that the * operator applies to the whole sequence. The ?: isn't strictly necessary, but it makes the GREP expression more efficient. Plain parentheses force InDesign to create a reference to the matched text (so that you can refer to it later), which takes some effort. With ?: you prevent the creation of that reference.

\s?: matches zero or one whitespace character, followed by a colon.

 

Peter
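To see the three parts working together outside InDesign, here is a minimal sketch in Python. Python's re module is a different regex engine from the one InDesign uses, so treat this as an illustration of the pattern's logic rather than of InDesign's exact behaviour. The names SAMPLE and HEADING are made up for this example; the text is the sample data from the question.

```python
import re

# Sample text copied from the question.
SAMPLE = (
    "Heading: This is heading one and contains info about blah blah. "
    "Heading 2: This is heading number two which talks about blah blah. "
    "Heading 3: Yet another heading. "
    "Heading 4: One more heading for so and so. "
    "Heading: Again one heading. "
    "Heading infinity: The is never ending heading."
)

# The pattern from the question, split into the three parts explained above.
HEADING = re.compile(
    r"\w+"         # one or more word characters
    r"(?:\s\w+)*"  # zero or more repeats of: whitespace, then a word
    r"\s?:"        # an optional whitespace character, then a colon
)

# With no capturing groups, findall returns the full text of each match.
headings = HEADING.findall(SAMPLE)
print(headings)
```

The body sentences are skipped because each one ends in a full stop, which is neither a word character nor whitespace, so a match that starts inside a sentence can never reach a colon.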

Bedazzled532
Inspiring
October 17, 2022

Thank you so much Peter.

Your explanation made it crystal clear. Actually, the ?: inside the group was what confused me.

Thanks a lot for your help.

pixxxelschubser
Community Expert
October 16, 2022

@Bedazzled532 

There are usually many different ways to get where you want to go with GREP. Are you only interested in understanding this GREP expression? Or does it miss some occurrences or find too many?


One possibility (if there are no other colons in the headings) would be:

[^:]*:

 

Finds everything up to the colon and additionally the colon.
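How well this shorter pattern works depends on where the headings sit. The sketch below uses Python's re module (not InDesign's engine) and short made-up strings to compare the two cases: headings that run on within one paragraph, and headings that each start their own paragraph. The variable names and texts are assumptions for illustration only.

```python
import re

ORIGINAL = r"\w+(?:\s\w+)*\s?:"
ALTERNATIVE = r"[^:]*:"

# Case 1: two run-in headings inside a single paragraph.
run_in = "Heading: Body text one. Heading 2: Body text two."

original_hits = re.findall(ORIGINAL, run_in)
alternative_hits = re.findall(ALTERNATIVE, run_in)
print(original_hits)     # the two headings only
print(alternative_hits)  # second hit also swallows the body text before it

# Case 2: every paragraph starts with its heading, and the pattern is
# applied once from the start of each paragraph.
paragraphs = ["Heading: Body text one.", "Heading 2: Body text two."]
per_paragraph = [re.match(ALTERNATIVE, p).group() for p in paragraphs]
print(per_paragraph)
```

In the first case [^:]*: also picks up the body text that precedes the second heading, because it accepts any character that is not a colon. In the second case it returns just the headings, which is the situation the nested-style suggestion below also assumes.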

 

Perhaps the nested formats in the paragraph style would also help you. But for that you would have to explain in more detail what exactly you want to achieve.

Bedazzled532
Inspiring
October 17, 2022

Thanks for your help, pixxxelschubser.

Community Expert
October 16, 2022

The non-capturing group seems to help in selecting headings that are split over multiple lines by a hard or soft return.

-Manan
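A small sketch of the line-break case, using Python's re module as a stand-in (InDesign's return characters are assumed to behave like the "\n" used here, which is not verified). It suggests that the \s inside the group is what lets a match continue across the break; a variant with a literal space in place of \s, written only for comparison, stops at the break.

```python
import re

# A heading broken across two lines; "\n" stands in for a return.
split_heading = "Heading\ninfinity: The never ending heading."

# Pattern from the question: \s accepts any whitespace, including "\n".
with_class = re.search(r"\w+(?:\s\w+)*\s?:", split_heading).group()

# Comparison variant (hypothetical): a literal space instead of \s.
with_space = re.search(r"\w+(?: \w+)* ?:", split_heading).group()

print(repr(with_class))
print(repr(with_space))
```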

Bedazzled532
Inspiring
October 16, 2022

Hi Manan

Thanks for the reply.

Is only the \s after ?: non-capturing, or is the whole \s\w+ non-capturing? And why is there an * outside the parentheses?

Community Expert
October 17, 2022

You have already had some great explanations. I would like to emphasise a few points that might cause confusion, which arises only from the content we are searching:

  • ?: inside () makes the group non-capturing. As Peter said, this is an optimisation, and the expression would work with a capturing group as well.
  • A ? makes whatever precedes it optional, i.e. it may or may not be present. So \s? matches whether or not a whitespace character is present.
  • The last ?: is not to be confused with the ?: inside the group. In this instance the ? belongs to the \s preceding it and not to the : that follows.
  • The *, +, ? etc. are quantifiers in regex and apply to whatever immediately precedes them. For more details on how quantifiers work, see the following article:

http://www.rexegg.com/regex-quantifiers.html
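The points above can be checked with a short sketch in Python's re module, which is not InDesign's engine but treats these basics the same way. The test strings are made up for this example.

```python
import re

text = "Heading infinity : body"

# Capturing and non-capturing groups match the same text overall.
capturing = re.search(r"\w+(\s\w+)*\s?:", text)
non_capturing = re.search(r"\w+(?:\s\w+)*\s?:", text)
print(capturing.group(), "|", non_capturing.group())

# Only the capturing version stores the text of the group
# (for a repeated group, the last repetition is kept).
print(capturing.groups())
print(non_capturing.groups())

# \s? is optional whitespace: both spellings match in full.
no_space = re.fullmatch(r"\w+\s?:", "Heading:")
one_space = re.fullmatch(r"\w+\s?:", "Heading :")

# A quantifier applies to what immediately precedes it. Outside the
# parentheses, * repeats the whole group; without them, only the "b".
grouped = re.fullmatch(r"(?:ab)*", "abab")
ungrouped = re.fullmatch(r"ab*", "abab")
print(grouped is not None, ungrouped is not None)
```

This is also why the * sits outside the parentheses in the original expression: it repeats the whole "whitespace plus word" sequence rather than a single character.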

I hope this clears up any confusion you might still have.

P.S.: Do note that some things you find on the internet might not work in InDesign. It all depends on the regex engine that InDesign implements. However, the basics remain the same and behave more or less consistently.

-Manan
