Using GREP to mark everything that isn't a specific word

Explorer, Aug 08, 2019

I'd like to use a GREP style to mark everything, except certain words. So for example:

This is ignore an example ignore text

should be marked like so (brackets show the marked text):

[This is] ignore [an example] ignore [text]

Normally, I'd just have two rules and mark the words I don't want included, but the style I want to apply hides the selected words by setting their size to 0.1 - which can't be reset by another style (the new style would force its size upon it, instead of resetting it).

This is as far as I have come, and this already took me a day to figure out:

.*(?<=[^ignore]).*

It works if the word to ignore is not surrounded by anything. So if the text is just "ignore" it will stay, everything else will be marked. Any help would be appreciated!

Views: 3.5K

Community guidelines: Be kind and respectful, give credit to the original source of content, and search for duplicates before posting.

Correct answer: Community Expert, Aug 10, 2019 (the full reply appears later in this thread).

Community Expert, Aug 09, 2019

You need to do this in two steps: first mark everything, then unmark the words to ignore. The character style to mark everything is applied to ^.+ and marks whole paragraphs. Then the character style to undo the marking is applied to \b(ignore|this|and|that)\b

P.

Explorer, Aug 09, 2019

Thanks for your input, Peter!

Unfortunately, as I tried to explain in my post, I can't do two steps. I can't create a style to undo the marking, since the marking style changes too many properties that are text specific.

Guide, Aug 09, 2019

How about applying a condition?

P.

Community Expert, Aug 09, 2019

> How about applying a condition?

You can't set a condition in a character style.

Community Expert, Aug 09, 2019

Sorry, I completely missed what you said about not being able to use a two-stage approach -- it's very clearly there! The problem is that with grep you can look for characters that are not a particular letter, but you can't look for strings that are not a particular word. Your [^ignore] doesn't skip the word 'ignore'; it matches any single character that is not 'i', not 'g', not 'n', etc. That's how character classes ([. . .]) work. Your grep simply matches whole paragraphs. In plain English it says 'match zero or more characters up to a position preceded by a character that isn't i, g, n, etc., then match zero or more characters'.
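A small sketch may make the character-class point concrete. It uses JavaScript regular expressions as a stand-in for InDesign's GREP engine, on the assumption that both treat the negated class and the lookbehind the same way in this case. The matched helper and the sample words are made up for illustration.

```javascript
// The attempt from the question, written as a JavaScript regex (an assumption:
// InDesign GREP is taken to behave the same for these constructs).
const attempt = /.*(?<=[^ignore]).*/;

// Hypothetical helper: returns the matched text, or null when nothing matches.
function matched(re, text) {
  const m = text.match(re);
  return m ? m[0] : null;
}

const sample = "This is ignore an example ignore text";

// The whole sentence is matched, including both occurrences of "ignore".
console.log(matched(attempt, sample));

// Nothing is matched when every letter of the text is one of i, g, n, o, r, e.
console.log(matched(attempt, "ignore"));
console.log(matched(attempt, "region"));
```

The word "region" is left alone just like "ignore", because it is built only from the letters i, g, n, o, r and e. The class tests letters, not words.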

You'll have to use a two-step approach, and if you can't use grep styles it'll have to be a script, something like this:

var ignore = '\\b(ignore|this|and|that)\\b';

// Step 1: underline the words to ignore (underline is only a temporary marker).
app.findGrepPreferences = app.changeGrepPreferences = app.findChangeGrepOptions = null;
app.findGrepPreferences.findWhat = ignore;
app.changeGrepPreferences.underline = true;
app.activeDocument.changeGrep();

// Step 2: apply the character style to everything that is not underlined.
app.findGrepPreferences = app.changeGrepPreferences = null;
app.findGrepPreferences.underline = false;
app.changeGrepPreferences.appliedCharacterStyle = app.activeDocument.characterStyles.item('mark');
app.activeDocument.changeGrep();

// Step 3: remove the temporary underline from the ignored words.
app.findGrepPreferences = app.changeGrepPreferences = null;
app.findGrepPreferences.findWhat = ignore;
app.changeGrepPreferences.underline = false;
app.activeDocument.changeGrep();

In the line that sets appliedCharacterStyle, replace 'mark' with the name of your character style.

The script uses underline as a temporary marker, but if you use underline somewhere, use a different temporary marker, such as strikethrough, a colour, anything that's not used in the text.

P.
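The three passes of the script (flag the words to ignore, mark everything that is not flagged, remove the flag) can be sketched outside InDesign as well. The following is only a model in plain JavaScript: the markExcept function, the per-character objects and the underscore output are all hypothetical and have nothing to do with the InDesign scripting API.

```javascript
// Hypothetical model of the three-pass idea; characters are plain objects here.
// ignoreWords is assumed to contain plain words without regex metacharacters.
function markExcept(text, ignoreWords) {
  const ignore = new RegExp("\\b(?:" + ignoreWords.join("|") + ")\\b", "g");
  const chars = text.split("").map(function (ch) {
    return { ch: ch, temp: false, marked: false };
  });

  // Pass 1: put a temporary flag on every word to ignore (the script uses underline).
  for (const m of text.matchAll(ignore)) {
    for (let i = m.index; i < m.index + m[0].length; i++) chars[i].temp = true;
  }
  // Pass 2: mark every character that does not carry the temporary flag.
  for (const c of chars) if (!c.temp) c.marked = true;
  // Pass 3: remove the temporary flag again.
  for (const c of chars) c.temp = false;

  // Show marked characters as underscores so the result is easy to inspect.
  return chars.map(function (c) { return c.marked ? "_" : c.ch; }).join("");
}

console.log(markExcept("This is ignore an example ignore text", ["ignore", "this", "and", "that"]));
```

The temporary flag matters because pass 2 needs something to test for; in the script that role is played by underline, which is why it must not be used elsewhere in the text.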

Explorer, Aug 09, 2019

Thanks again! Also for clarifying that my solution basically worked by pure chance (none of the words I tested begin with a character from the ignored word).

I have no experience with scripts. I'm using the GREP style for a merge. Can I set it up in a way that it will work automatically in merge previews and the final merge? Or do I have to trigger the script manually every time?

Community Expert, Aug 09, 2019

You'd have to run the script every time.

P.

Explorer, Aug 10, 2019

Hm. Running the script manually every time defeats the point.

Explorer, Aug 09, 2019

Can’t you just omit the word you’re trying to (ignore) from the first style? Seems backwards to mark an entire paragraph with a character style. Just update the paragraph style. I guess I don’t understand what your end goal is.

Explorer, Aug 10, 2019

I have different texts / paragraphs with different font sizes, styles, etc. I want to hide everything in them, except for specific words (one specific one, for starters). I hide stuff with a style that makes the text transparent and the font size 0.1. But there doesn't seem to be a way to revert that with a character style, without overriding the font size it should be.

So I'd have to create a paragraph style and a character style for every place I want to use this that has a different look. I'd like to avoid that to keep things manageable. Only thing I can think of is a GREP that hides everything but the specific word(s).

Community Expert, Aug 10, 2019

You want to at least mark everything that is "not a word":

\W+

(where the '+' is in the hope that this is more efficient than a single \W, which would act upon each not-a-word character one at a time).

You also want to mark every entire word ...

\W+|\b\w+

– and now everything should be marked; all not-a-word characters OR all word characters. But I'm not done yet.

"After" the "\b" (word break) you are always at the start of a word. At that point you can insert a negative lookahead to exclude the words you are looking for:

\W+|\b(?!(ignore|me)\b)\w+

This leaves every word and every separator marked, except the words "ignore" and "me".

You can list all of the words that should not be marked inside the inner parentheses, separated with an OR bar |.
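To try the pattern outside InDesign, here is a minimal sketch in JavaScript. It assumes that JavaScript regular expressions treat \W, \b, \w and the negative lookahead the same way InDesign's GREP does. The unmarkedText helper is hypothetical: it deletes everything the pattern would mark, so what is left is the text that stays untouched.

```javascript
// The pattern from the answer; the inner group is made non-capturing, which
// does not change what is matched.
const markPattern = /\W+|\b(?!(?:ignore|me)\b)\w+/g;

// Hypothetical helper: removes whatever the pattern marks and returns the rest.
function unmarkedText(text) {
  return text.replace(markPattern, "");
}

// Only the two occurrences of "ignore" survive; every other word and every
// space is matched by one of the two alternatives.
console.log(unmarkedText("This is ignore an example ignore text"));

// "ignored" is not excluded, because the \b inside the lookahead requires the
// listed word to end right there.
console.log(unmarkedText("ignored me"));
```

The second call shows why the \b inside the lookahead matters: without it, any word that merely starts with "ignore" would be excluded as well.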

Mentor, Aug 10, 2019

A touch of the Master.

Sometimes you may find it useful to add the case-insensitive modifier (?i) to the excellent regex above, say, to pick up a word at the beginning of a sentence.
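The effect of the modifier can be sketched in JavaScript. One assumption to flag: JavaScript regular expressions have traditionally not accepted the inline (?i) form, so the sketch uses the i flag as a stand-in for it. The unmarkedText helper and the sample sentence are made up for illustration.

```javascript
// Hypothetical helper: removes whatever the pattern marks and returns the rest.
function unmarkedText(text, pattern) {
  return text.replace(pattern, "");
}

// Same pattern twice; the second one carries the i flag as a stand-in for (?i).
const caseSensitive = /\W+|\b(?!(?:ignore|me)\b)\w+/g;
const caseInsensitive = /\W+|\b(?!(?:ignore|me)\b)\w+/gi;

const sentence = "Ignore the rest, but not me";

// Case-sensitive: the capitalised "Ignore" is not recognised and gets marked.
console.log(JSON.stringify(unmarkedText(sentence, caseSensitive)));

// Case-insensitive: "Ignore" is excluded from marking as well.
console.log(JSON.stringify(unmarkedText(sentence, caseInsensitive)));
```

In InDesign itself the modifier would go at the start of the expression, as the post above suggests.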

Explorer, Aug 10, 2019

Pure magic! Thanks so much, not only for the solution but also the clear explanation.

I was afraid it's just not possible. Understanding your solution will open up a lot of possibilities. Excellent!

Community Expert, Aug 10, 2019

Nice one, Theunis!

Explorer, Aug 12, 2019

Okay, I do have a follow-up on your solution. I played around with your GREP code a bit and realized that my mental model does not match what is going on at all.

How would you tackle the following situation?

This is ignore an example ignore me text

should be marked like so (brackets show the marked text):

[This is] ignore [an example] ignore me [text]

"ignore me" is now one term where the space in between should not be marked... just writing (ignore|ignore\sme) obviously doesn't do the trick...

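One way to see why adding ignore\sme to the list has no effect is to run the attempt on the sample sentence. The sketch below uses JavaScript regular expressions as a stand-in for InDesign's GREP, assuming both behave alike for these constructs, and it only illustrates the problem; it is not a fix.

```javascript
// The attempted pattern: "ignore\sme" added as a second alternative in the lookahead.
const attempt = /\W+|\b(?!(?:ignore|ignore\sme)\b)\w+/g;

const text = "This is ignore an example ignore me text";

// Remove everything the pattern marks to see what stays unmarked.
const unmarked = text.replace(attempt, "");
console.log(unmarked);
```

Under these assumptions the result is "ignoreignore": the space and "me" are still marked. Each match is either a run of non-word characters or a single word, and the lookahead is only tested at the start of a word. The space after "ignore" is taken by \W+, and at the start of "me" neither "ignore" nor "ignore me" follows, so "me" is marked like any other word.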