GREP to find words before or after > symbol

Report · Jan 16, 2017

Hi guys,

I wonder if you can give me a little push in the right direction with this GREP code.

I want to find words which come before/after a space and a greater than symbol.

(?<=\s>)?\w+(?=\s>)

This is what I have but it doesn't find words which come after the symbol.

Menu > Menu > Menu

Would only find the first 2 Menus.

Thanks everyone

Report · Jan 16, 2017

Hi Jakec,

Basically a GREP like

\w([^\r>]+\w)*

would do a very good job (including when the menu item contains inner spaces), since it captures every string that starts and ends with a word character while not containing any >.

Now, what you want is to make sure that the pattern above is either preceded by >\s or followed by \s>, in order to exclude lines that do not contain the X > Y > … > Z structure at all. So in fact you have an alternation of two distinct regexes, the first based on a lookbehind, the second based on a lookahead:

The lookbehind form is:

(?<=>\s)\w([^\r>]+\w)*

The lookahead form is:

\w([^\r>]+\w)*(?=\s>)

So your final GREP is:

((?<=>\s)\w([^\r>]+\w)*)|(\w([^\r>]+\w)*(?=\s>))
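For readers who want to try the pattern outside InDesign, here is a minimal sketch in JavaScript. It assumes an engine with lookbehind support (for example a recent Node.js) and that \s and \w behave closely enough to InDesign's for plain ASCII menu text. The helper name findMenuItems and the sample strings are made up for illustration.

```javascript
// The final GREP above, written as a JavaScript regex literal.
const MENU_ITEM = /((?<=>\s)\w([^\r>]+\w)*)|(\w([^\r>]+\w)*(?=\s>))/g;

// Hypothetical helper: returns every menu item found in one line of text.
function findMenuItems(line) {
  return line.match(MENU_ITEM) || [];
}

console.log(findMenuItems("Menu > Menu > Menu"));
console.log(findMenuItems("File > Save As > PDF Export"));
console.log(findMenuItems("A paragraph with no separator"));
```

The first two calls should return all three items (inner spaces included), and the last one an empty array, since that line contains no " > " separator.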

@+

Marc

Report · Jan 16, 2017

Hi Marc,

Thanks for your reply, however your GREP seems to miss out the first word in the string.

Menu > Menu > Menu

Report · Jan 16, 2017

(?<=\s>\s)?[^\s>]\w+(?=\s>\s)?

This finds all of them, but because I use the ?, it finds all other paragraphs too.
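A short sketch of why this happens: a lookaround followed by ? may be skipped entirely, so it no longer restricts anything and the expression behaves like plain [^\s>]\w+. JavaScript refuses a quantifier on a lookbehind, so the example below simply drops both optional assertions, on the assumption that this is what they amount to. The helper name matchesIn and the sample text are hypothetical.

```javascript
// The expression above with both optional lookarounds removed.
const WITHOUT_LOOKAROUNDS = /[^\s>]\w+/g;

// Hypothetical helper: lists everything the reduced pattern matches.
function matchesIn(text) {
  return text.match(WITHOUT_LOOKAROUNDS) || [];
}

console.log(matchesIn("Menu > Menu > Menu"));
console.log(matchesIn("Just a plain paragraph"));
```

The second call still returns words, although the paragraph contains no > at all.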

Report · Jan 16, 2017

Hi,

Do you want to find Soft Menus and Submenus in text, presented as … > … > … > …?

… as:

(^/)

Report · Jan 16, 2017

Hi Obi,

I'm trying to select only the text between the space and > symbols.

I want to make the text bold.

Report · Jan 16, 2017

Could you make a screenshot of real texts?

(^/)

Report · Jan 16, 2017

So I want to change text like this to bold (not including the >)

Report · Jan 16, 2017

Here, specifically:

\u\l+(?=\h>)|(?<=>\h)\u\l+

(^/)

Report · Jan 16, 2017

> however your GREP seems to miss out the first word in the string (…)

Works for me though, even as GREP-style:

Make sure you have properly encoded the regex. (Keep in mind that backslashes are needed in JS strings for escaping purposes.)
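To illustrate the escaping point, a small sketch in plain JavaScript (the variable names are hypothetical): inside a string literal each backslash of the GREP has to be doubled, otherwise the string that reaches the regex engine is a different pattern.

```javascript
// Correct: every backslash of the GREP is doubled inside the string literal.
const escaped = "(?<=>\\s)\\w([^\\r>]+\\w)*";

// Wrong: single backslashes are consumed by the string parser.
// "\s" and "\w" collapse to "s" and "w"; "\r" becomes a carriage return.
const unescaped = "(?<=>\s)\w([^\r>]+\w)*";

console.log(escaped); // prints (?<=>\s)\w([^\r>]+\w)*
console.log(new RegExp(escaped).test("> Menu"));   // true
console.log(new RegExp(unescaped).test("> Menu")); // false
```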

@+

Marc

Report · Jan 16, 2017

Hi Marc,
I also tested your GREP with a GREP Style.

No problem with InDesign CC 2017.

Regards,
Uwe

Report · Jan 16, 2017

Hi Marc,

Very sorry, I must have missed something when I copied your code.

It does work. That's great, thanks for your help.

Report · Jan 16, 2017

Marc, I'm writing an article on the use of GREP in InDesign, do you mind if I use this expression as an example?

I'm happy to reference your website or this forum

Report · Jan 16, 2017

You're welcome, Jakec. Feel free to share 😉

@+

Marc

Report · Jan 16, 2017

And now a short snippet to test my GREP as well:

// FindChange Settings
// ---
var FIND_WHAT = "((?<=>\\s)\\w([^\\r>]+\\w)*)|(\\w([^\\r>]+\\w)*(?=\\s>))";
var CHANGE_FONT = "Arial\tBold";
app.findGrepPreferences = app.changeGrepPreferences = null;
app.findGrepPreferences.findWhat = FIND_WHAT;
app.changeGrepPreferences.appliedFont = CHANGE_FONT;
// Assuming a TextFrame is selected.
// ---
var target = app.selection[0].parentStory.paragraphs.everyItem();
target.changeGrep();

@+

Marc

Report · Jan 16, 2017

Hi Marc,

Just for comment!

Doable in the first-line case, but not in the other lines!

I'd use that code to get this:

Grep code:

\u[\l\d\h-]+(?=\h>)|(?<=>\h)\u[\l\d\h-]+[…~j]?

or

(\u[\l\d\h-]+)(?=\h>)|(?<=>\h)(?1)[…~j]?

(^/)

Report · Jan 16, 2017

Obi,

I made no assumption about what characters are supposed to form a Menu string since it wasn't specified in the original question. My solution is just about splitting a string along a " > " separator, as originally requested. Now of course many refinements could be proposed, depending on what the user actually wants to do, in what context, what Unicode range, and so on.

Marc

Report · Jan 16, 2017

Seems to be more complete:

(\u[^\u…~j]+[…~j]?)(?=\h>)|(?<=>\h)(?1)

Works when taking into account only one initial capital!

(^/)

Report · Mar 23, 2017

Hi, I'm looking for something similar. I need a grep code that will bold text (of the same character style) after the second space. Another way to do it is to bold any text that comes after the Article #.

Article 2 District Standards

Article 3 Building Standards

Thanks!

Report · Mar 23, 2017

^Article \d+\K.+

(\K stands for 'match but do not capture'; in effect, it works like a variable-length lookbehind.)
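Engines without \K can get the same split with a capture group. The following is a minimal JavaScript sketch under that assumption, not InDesign code. The helper textAfterArticleNumber is hypothetical, and the first sample line is taken from the question.

```javascript
// Group 1 plays the role of everything before \K, group 2 of everything after it.
const ARTICLE = /^(Article \d+)(.+)/;

// Hypothetical helper: returns the part that would be bolded, or null if no match.
function textAfterArticleNumber(line) {
  const m = ARTICLE.exec(line);
  return m ? m[2] : null;
}

console.log(textAfterArticleNumber("Article 2 District Standards")); // " District Standards"
console.log(textAfterArticleNumber("Chapter 1 Introduction"));       // null
```

Note that the result keeps the leading space, just as .+ after \K does in the GREP.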

Peter

Report · Mar 23, 2017

Thank you so much!!

Report · Mar 24, 2017

Follow-up question. How would I adjust the GREP to allow for the condition below? Thanks!

Article 3 Building Standards // Section 3.A General Building Standards

Report · Mar 24, 2017

If you mean 'Article 3' OR 'Article 3.A' (where 3.A could be 3.B, 3.C, etc.), then try this:

^Article \d\d?(\.[A-Z])?\K.+

that is, Paragraph-initial 'Article'

followed by a space and one or two digits

followed optionally by a dot and a capital

P.

Report · Mar 27, 2017

So, this didn't seem to work. The following is what I'm trying to accomplish. The first code from above (^Article \d+\K.+) works on the text before the "//". But the other code (^Article \d\d?(\.[A-Z])?\K.+) isn't producing the desired mix of regular and bold text. What would be the compound formula for this entire series of text? Thanks so much.

Article 2 Character District Standards // Section 2.A General District Standards

Article 3 Building Standards // Section 3.A General Building Standards

Article 3 Building Standards // Section 3.B Primary Building Types

Report · Mar 27, 2017

Knowing you use double spaces before/after "//":

(Article|Section)\h\d+(\.[A-Z])?\h\K[^/\r]+(?=\h\h//|$)
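The same idea can be checked outside InDesign with a rough JavaScript translation. This is a sketch under assumptions: JavaScript has neither \h nor \K, so [ \t] stands in for \h (a narrower class than InDesign's horizontal space) and a capture group stands in for \K. The helper name boldParts is hypothetical, and the sample line uses two spaces on each side of the //.

```javascript
// Capture group 1 holds the text that the GREP above would match after \K.
const HEADING = /(?:Article|Section)[ \t]\d+(?:\.[A-Z])?[ \t]([^\/\r\n]+)(?=[ \t]{2}\/\/|$)/gm;

// Hypothetical helper: returns the parts of a line that would be bolded.
function boldParts(line) {
  return Array.from(line.matchAll(HEADING), (m) => m[1]);
}

console.log(boldParts("Article 3 Building Standards  //  Section 3.B Primary Building Types"));
```

With the double spaces in place this should return "Building Standards" and "Primary Building Types". If only a single space surrounds the //, the Article part is skipped, which is why the double spaces matter.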

(^/)
