Participating Frequently
July 30, 2009
Answered

grep search for any # of words within quotes


It's my understanding (from what I've read) that the following will find a single word within quotation marks:

(?<=")\w+(?=")

Can anyone please explain how to modify this grep search in a way that allows me to find any number of words within a pair of quotation marks (whether straight or "curly")?

(Yes, I'm extremely new to this, but I've searched about a dozen sites, and can't seem to locate the answer.)

Many thanks in advance,

Ron Herrmann

Peter Kahrel
Community Expert
Correct answer
July 30, 2009

(?<=").+?(?=")
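For anyone who wants to try the two patterns outside the Find/Change dialog, here is a minimal sketch using Python's `re` module. This is an assumption for illustration only: the thread is about GREP searches, and Python's engine may differ in details (for instance, it treats a straight quote literally). The sample text is made up.

```python
import re

# Hypothetical sample text with one quoted phrase of several words.
text = 'He said "hello there world" twice.'

# Pattern from the question: \w+ matches word characters only,
# so it cannot cross the spaces between words.
single_word = re.findall(r'(?<=")\w+(?=")', text)

# Pattern from the answer: . matches any character (except a line break),
# and +? repeats it as few times as needed to reach the next quote.
any_words = re.findall(r'(?<=").+?(?=")', text)

print(single_word)  # []
print(any_words)    # ['hello there world']
```

The only change between the two patterns is `\w+` becoming `.+?`, which is what lets the match run across spaces and punctuation.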

Participating Frequently
July 31, 2009

A thousand thank-yous to you!

BTW, I did read much of the scripting guides, and many of the posts on this site.

What I seem to have a tough time with is finding the little details that change the grep search criteria in minor ways to get a specific result.

Is this simply a "school of hard knocks" learning situation? Or can you recommend a reference source that more simply and accurately defines the various grep search symbols in layman's terms?

I believe I saw a post elsewhere that discusses the means of selecting the quotation marks also - will revisit it, and will have the final "recipe."

Again, thank you for the quick, pointed reply.

Ron Herrmann

Jongware
Community Expert
July 31, 2009

Grabbing the curly bits as well is just a bit easier:

".+?"

Peter assumed you didn't need them, and therefore used the somewhat more involved "lookbehind" (?<=x) and "lookahead" (?=x) constructs. The x marks the spot where you insert what you want to find but not mark in the selection (for example, "(?<=not )mark(?= in)" has only one match in this sentence, and it will select just the word "mark").
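The difference between the two forms can be sketched in Python's `re` module (again an assumption for illustration; the thread itself is about GREP searches, and the sample strings are made up). The lookbehind and lookahead only check what surrounds the match, so the surrounding text is not part of the selection.

```python
import re

# Hypothetical sentence containing "not mark in" exactly once.
sentence = 'insert what you want to find but not mark in the selection'

# Lookbehind and lookahead test the neighbouring text without selecting it.
m = re.search(r'(?<=not )mark(?= in)', sentence)
print(m.group())  # mark

# Same idea with quotes: with lookarounds the quotes stay outside the match,
# without them the quotes become part of the match.
quoted = 'a "short phrase" here'
inside = re.search(r'(?<=").+?(?=")', quoted).group()
with_quotes = re.search(r'".+?"', quoted).group()

print(inside)       # short phrase
print(with_quotes)  # "short phrase"
```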

When given straight quotes, GREP will automatically include all possible quotation marks, curly or straight. The stuff in the middle is . (period) -- Match Any Character -- and + -- one or more. On its own, .+ will match anything and everything in the paragraph up to the last quote, and thus it would also include any quotes in between! That is why this form of search is called "greedy".

Therefore, Peter added yet another qualifier, the ? -- sorry, no short description possible, because it has lots of other uses. But in combination with ".+", it means "the shortest possible match", which in this case is anything up to the next quote (this is called the "non-greedy" form).
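To see greedy and non-greedy side by side, here is a small sketch in Python's `re` module. Two assumptions to flag: Python is used only for illustration, and unlike the behaviour described above, Python treats a straight quote literally, so the curly forms are listed explicitly in a character class (the `QUOTE` name is made up for this example).

```python
import re

# Assumption: Python's re does not expand a straight quote to curly quotes,
# so straight, opening curly, and closing curly are listed explicitly.
QUOTE = '["\u201c\u201d]'

# Hypothetical text with one curly-quoted and one straight-quoted phrase.
text = 'She said \u201cyes\u201d and then "no" again.'

# Greedy: .+ runs to the last quote in the text, swallowing quotes in between.
greedy = re.findall(QUOTE + '.+' + QUOTE, text)

# Non-greedy: .+? stops at the next quote, so each phrase is matched separately.
lazy = re.findall(QUOTE + '.+?' + QUOTE, text)

print(greedy)  # one match, from the first quote to the last
print(lazy)    # one match per quoted phrase
```

With this text the greedy form returns a single match spanning both phrases, while the non-greedy form returns two matches, one for each quoted phrase.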