Hi everyone
I needed a |GREP expression| to find any text between |symbols| and apply bold to it.
For example, something to change the paragraph above into:
I needed a GREP expression to find any text between symbols and apply bold to it.
I found
\|(.+)\|
And it works well if the text between | symbols is in a single paragraph, but doesn't work if the text to be bolded contains more paragraphs.
For example, in the text:
Paragraph1.
|Paragraph2.
Paragraph3.|
Paragraph4.
The expression doesn't find
|Paragraph2.
Paragraph3.|
Then I found a different expression that does what I need:
\|([^?]+)\|
This expression finds text between | no matter how long and how many paragraphs it is.
Great! Problem solved...
But I just found it doesn't work if there's a question mark in the text.
Going back to the example above, it works on:
Paragraph1.
|Paragraph2.
Paragraph3.|
Paragraph4.
But it doesn't work on:
Paragraph1.
|Paragraph2?
Paragraph3.|
Paragraph4.
^? is supposed to find any character. I tested almost every possible symbol and punctuation mark, and only the ? stops the expression from working.
How can I solve this?
^\|(\r|.)+\|$
It won’t work as a GREP style, though. Find/Change only.
Thank you
It could work as a workaround, but it finds text only if the | symbols are at the beginning and at the end of a paragraph.
I was trying to make an expression that works no matter where the | symbols are.
Because the text to find could be an entire paragraph, a group of paragraphs or just a single word in the middle of a paragraph.
\|([^?]+)\|
does that job; it only fails if a question mark is in the text.
I think I have to add something after ^? but I don't know what.
berardino antond90257327 wrote
it only fails if a question mark is in the text.
… because you're excluding the question mark from what can be found. Why? Try this:
\|([^|]+)\|
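To see the difference outside InDesign, here is a minimal sketch in Python. It is only an approximation: Python's `re` module is not InDesign's GREP engine, "\n" stands in for the paragraph return, and the variable names are made up for illustration.

```python
import re

# Stand-in for the sample text; "\n" plays the role of the paragraph return
# (an assumption of this sketch, not how InDesign stores text).
text = "Paragraph1.\n|Paragraph2?\nParagraph3.|\nParagraph4."

# Negating "?" stops at the question mark, so the closing pipe is never reached.
question_class = re.compile(r"\|([^?]+)\|")
# Negating "|" stops only at the next pipe, whatever else sits in between.
pipe_class = re.compile(r"\|([^|]+)\|")

question_match = question_class.search(text)
pipe_match = pipe_class.search(text)

print(question_match)       # no match object
print(pipe_match.group(1))  # the two paragraphs between the pipes
```

Negating the delimiter itself also keeps each match from running past the next pipe.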
Possibly you came across this trick that accidentally worked for others, and so it did for you.
The thing is: sure, it works. But for other reasons than you might think! The purpose of the question mark is just to have *something* in the character class to negate. It works as well (or, possibly even better!) with another random character that should not appear anywhere in your text. Try it with alpha α, sha ш, or san ϻ and you'll see (unless you happen to use lots of text in Greek, Cyrillic, or ancient Greek).
The Question Mark trick "works" because it (ab)uses negation: the sequence [^?] matches any character except the "?" in the list -- and this, in turn, works because you need to match not only regular characters but also a Paragraph Return. By default, the period cannot match the Return character -- it is the exception to the rule that ". matches anything" -- but a negated character class circumvents that restriction.
And it worked until you needed to match a "?".
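The same effect can be checked one character at a time. This is a rough sketch in Python, assuming "\n" as a stand-in for the Paragraph Return (Python's period skips "\n", much like InDesign's period skips the Return); the names are hypothetical.

```python
import re

# "\n" stands in for InDesign's Paragraph Return (an assumption of this sketch).
paragraph_break = "\n"

# The period refuses the paragraph break by default.
dot_matches_break = re.fullmatch(r".", paragraph_break) is not None

# A negated class accepts anything not listed in it, including the break.
negated_matches_break = re.fullmatch(r"[^?]", paragraph_break) is not None

# The one character it rejects is the one that was negated.
negated_matches_question = re.fullmatch(r"[^?]", "?") is not None

# Negating some other unused character (alpha here) lets "?" through again.
alpha_class_matches_question = re.fullmatch(r"[^α]", "?") is not None

print(dot_matches_break, negated_matches_break)
print(negated_matches_question, alpha_class_matches_question)
```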
A better option is to use the toggle "(?s)" at the start. This allows the catch-all period to also match a return, and so it will find a match that spans multiple paragraphs:
(?s)\|.+?\|
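As an illustration only, Python's `re` module accepts the same "(?s)" toggle, so the behaviour can be sketched there. The assumptions: "\n" stands in for the paragraph return, and Python's engine is not the one InDesign uses.

```python
import re

# "\n" stands in for the paragraph return (an assumption of this sketch).
text = "Paragraph1.\n|Paragraph2?\nParagraph3.|\nParagraph4."

# Without the toggle, the period cannot cross the paragraph break.
without_flag = re.search(r"\|.+?\|", text)

# With "(?s)" at the start, the period also matches the break.
with_flag = re.search(r"(?s)\|.+?\|", text)

print(without_flag)        # no match object
print(with_flag.group(0))  # everything from the first pipe to the second
```

The question mark in the text is no longer a problem, because nothing in the pattern excludes it.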
By the way #1: see how I use .+? instead of a regular .+? (Um. That is "period plus question" instead of "period plus". Plus a question mark at the end because it was stated as a question, but that's not part of the GREP.) A regular period-plus matches as much as possible, so in "|what can|possibly|go wrong|" it runs from the leftmost "|" all the way to the rightmost one and returns a single match. I'm certain you'd want it to stop at the in-between pipes instead, so that "|what can|" and "|go wrong|" are found separately. Adding a question mark after the "+" makes it match the shortest possible text instead of the longest.
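A small sketch of longest versus shortest matching, again in Python as a stand-in for InDesign's GREP (the variable names are hypothetical):

```python
import re

sample = "|what can|possibly|go wrong|"

# Period-plus takes as much as it can: one match, first pipe to last pipe.
longest = re.findall(r"\|.+\|", sample)

# Period-plus-question stops at the first closing pipe it meets.
shortest = re.findall(r"\|.+?\|", sample)

print(longest)   # ['|what can|possibly|go wrong|']
print(shortest)  # ['|what can|', '|go wrong|']
```

Note that "possibly" is not matched on its own in the second case: the pipes on either side of it were already used up by the neighbouring matches.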
By the way #2: it is very likely that the random use of the question mark in your own GREP originated from this correct use, but only through half-understanding its purpose, after which it was shuffled around until suddenly the expression "magically" worked.
Until you needed it to match a literal question mark, of course.
Nice explanation
It reminds me of a warning I saw in one of the Linux forums many years ago: "Before asking you should already know 3/4 of the answer, otherwise you may have a hard time understanding the answer you get".
You are totally right.
The problem was that I totally misunderstood the meaning of ^?
In text search it means "Any Character", not "except ?".
In fact my first expression used . for "Any Character", but since it didn't work across multiple paragraphs I tried [^?], which I thought was equivalent.
It's obvious why question marks were a problem, now that I know the real function of ^?.
Thank you very much to everyone.