InDesign GREP code

Contributor, Oct 15, 2022

This is the information I have in hand:

==== Sample Data

Heading: This is heading one and contains info about blah blah. Heading 2: This is heading number two which talks about blah blah. Heading 3: Yet another heading. Heading 4: One more heading for so and so. Heading: Again one heading. Heading infinity: The is never ending heading.

==== Sample Data

 

This is the GREP code I'm using:

\w+(?:\s\w+)*\s?:

 

The code works fine, but I don't understand it. I put it together with help from someone else. Can anyone help explain it? It selects all the headings (whether a single word or multiple words), including the ":". What confuses me is the non-capturing group with \w+ inside the parentheses.

Any alternative code would also be appreciated.

Thanks

TOPICS
How to

Community Expert, Oct 16, 2022

The non-capturing group seems to help in selecting headings that are split over multiple lines by a hard or soft return.
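
A minimal sketch of that behaviour, assuming Python's re module as a stand-in for InDesign's GREP engine (the two are not the same engine, so results should be checked in Find/Change). It suggests that the part doing the work is the \s inside the group, which matches any whitespace including line breaks. The test strings and variable names are made up for illustration.

```python
import re

# The expression from the question, and a variant where \s is replaced
# by a literal space (hypothetical, only for comparison).
WITH_WHITESPACE = r"\w+(?:\s\w+)*\s?:"
WITH_SPACE_ONLY = r"\w+(?: \w+)* ?:"

# A heading broken by a line feed and one broken by a carriage return.
split_by_lf = "Heading\ninfinity: The never ending heading."
split_by_cr = "Heading\rinfinity: The never ending heading."

lf_match = re.search(WITH_WHITESPACE, split_by_lf).group(0)
cr_match = re.search(WITH_WHITESPACE, split_by_cr).group(0)
space_only_match = re.search(WITH_SPACE_ONLY, split_by_lf).group(0)

print(repr(lf_match))          # 'Heading\ninfinity:'
print(repr(cr_match))          # 'Heading\rinfinity:'
print(repr(space_only_match))  # 'infinity:'
```

With the literal space, the match can no longer cross the break, so only the second half of the heading is picked up.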

-Manan

Contributor, Oct 16, 2022

Hi Manan

Thanks for the reply.

Is only the \s after ?: non-capturing, or is the whole \s\w+ non-capturing? And why is there an * outside the parentheses?

Community Expert, Oct 16, 2022

You have already had some great explanations. I would like to emphasise a few points that might cause confusion, purely because of the content we are searching:

  • ?: at the start of a () makes the group non-capturing. As Peter said, this is an optimisation; the expression would work with a capturing group as well.
  • A ? after an item makes that item optional, i.e. it may or may not be present. So \s? matches whether or not a whitespace character is present.
  • The ?: at the end of the expression is not to be confused with the ?: inside the group. At the end, the ? applies to the \s preceding it, and the : is simply a literal colon.
  • *, +, ? and so on are quantifiers in regex, and each applies to the item preceding it. For more details on how quantifiers work, see the following article:

http://www.rexegg.com/regex-quantifiers.html
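
The points above can be tried out with a small sketch. It assumes Python's re module rather than InDesign's GREP engine, so treat it as an illustration and not as a statement about InDesign internals. The sample text and variable names are made up for the example.

```python
import re

text = "Heading infinity : body text"

# Same expression, once with a non-capturing and once with a capturing group.
non_capturing = re.search(r"\w+(?:\s\w+)*\s?:", text)
capturing = re.search(r"\w+(\s\w+)*\s?:", text)

# Both versions select exactly the same text ...
print(non_capturing.group(0))  # Heading infinity :
print(capturing.group(0))      # Heading infinity :

# ... but only the capturing version stores a group for later reference.
print(non_capturing.groups())  # ()
print(capturing.groups())      # (' infinity',)

# The trailing \s? makes a single whitespace character before the colon optional.
optional_space = re.compile(r"\w+\s?:")
no_space = optional_space.fullmatch("Heading:") is not None
one_space = optional_space.fullmatch("Heading :") is not None
two_spaces = optional_space.fullmatch("Heading  :") is not None
print(no_space, one_space, two_spaces)  # True True False
```

The * outside the parentheses repeats the whole group, which is why the pattern picks up any number of additional words.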

I hope this clears up any confusion you might still have.

P.S. Note that some things you find on the internet might not work in InDesign. It all depends on the regex engine InDesign implements. However, the basics remain the same and behave more or less consistently.

-Manan

Contributor, Oct 17, 2022

Hi Manan 

Thank you so much for the explanation.

Peter's explanation made it clear. 

Thanks once again.

Community Expert, Oct 16, 2022

@shahidr100 

There are usually many different ways to get where you want to go with GREP. Are you only interested in understanding this expression? Or does it miss some occurrences, or find too many?

One possibility (provided the headings contain no other colons) would be:

[^:]*:

 

This finds everything up to the colon, plus the colon itself.
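
A rough comparison of the two expressions, sketched with Python's re module (an assumption for illustration; InDesign's engine may differ in details). The two test strings are shortened, made-up variants of the sample data: one where the heading starts the text, and one where headings run in after body text, as in the original sample.

```python
import re

ORIGINAL = r"\w+(?:\s\w+)*\s?:"
ALTERNATIVE = r"[^:]*:"

# Heading at the very start of the text.
own_paragraph = "Heading infinity: The never ending heading."
# Headings following body text within the same run of text.
run_in = "Heading: First body. Heading 2: Second body."

first = re.match(ALTERNATIVE, own_paragraph).group(0)
alt_hits = re.findall(ALTERNATIVE, run_in)
orig_hits = re.findall(ORIGINAL, run_in)

print(first)      # Heading infinity:
print(alt_hits)   # ['Heading:', ' First body. Heading 2:']
print(orig_hits)  # ['Heading:', 'Heading 2:']
```

So the shorter expression is a good fit when each heading begins its own paragraph. With run-in headings it also takes in the body text between one colon and the next.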

 

Nested styles in the paragraph style might also help you. But for that you would have to explain in more detail exactly what you want to achieve.

Contributor, Oct 17, 2022

Thanks for your help, pixxxelschubser.

Community Expert, Oct 16, 2022 (marked as the correct answer)

\w+ matches one or more word characters. Word characters are letters, the digits 0-9, and the underscore.

(?:\s\w+)* matches zero or more instances of a whitespace character followed by one or more word characters. The parentheses group \s\w+ so that the * operator applies to the whole sequence. The ?: isn't strictly necessary, but it makes the GREP expression more efficient: plain parentheses force InDesign to create a referent (so that you can refer to the match later), which costs some effort, and ?: prevents the creation of that referent.

\s?: matches zero or one whitespace character, followed by a colon.
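
As a quick check of this breakdown, here is a minimal sketch that runs the same expression over the sample data from the question. It uses Python's re module, which is an assumption for illustration and not InDesign's own GREP engine, so the result should still be verified in InDesign's Find/Change dialog. The variable names are only illustrative.

```python
import re

# Sample data copied from the question.
SAMPLE = (
    "Heading: This is heading one and contains info about blah blah. "
    "Heading 2: This is heading number two which talks about blah blah. "
    "Heading 3: Yet another heading. "
    "Heading 4: One more heading for so and so. "
    "Heading: Again one heading. "
    "Heading infinity: The is never ending heading."
)

# \w+         one or more word characters (the first word)
# (?:\s\w+)*  zero or more further words, each preceded by whitespace
# \s?:        an optional whitespace character, then a literal colon
PATTERN = r"\w+(?:\s\w+)*\s?:"

matches = re.findall(PATTERN, SAMPLE)
for heading in matches:
    print(heading)
```

Each heading is found on its own because the sentence before it ends in a period, which is neither a word character nor whitespace, so the chain of words cannot reach back past it.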

 

Peter

Contributor, Oct 17, 2022

Thank you so much Peter.

Your explanation made it crystal clear. It was actually the ?: inside the group that was confusing me.

Thanks a lot for your help.
