I have the usual problem. I am trying to remove the carriage return at the end of each line, but not if the carriage return is preceded by a period, exclamation point or question mark.
Just doing a find and replace on \r in GREP removes all the carriage returns, so it melds the paragraphs together along with removing the intra-paragraph carriage returns. I would like to keep the paragraphs, separated by a single carriage return.
How do I adjust the grep? Thanks.
I think Positive Lookahead would be the answer:
https://carijansen.com/positive-lookahead-grep-for-designers/
but you need to wait for experts to confirm.
Actually, I think negative lookbehind, but I'm not one of the "experts".
(?<!\.|!|\?)\r
This will find a return only if not preceded by a period, exclamation mark, or question mark.
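For anyone who wants to try the idea outside InDesign, here is a minimal Python sketch of the same negative lookbehind. The function name join_soft_breaks and the sample text are made up for illustration, and replacing the match with a space (rather than with nothing) is an assumption, so that the joined words don't run together. Python's re module is not the engine InDesign uses, so treat this as a way to check the logic only.

```python
import re

# Negative lookbehind: match \r only when the character just before it
# is NOT a period, exclamation mark, or question mark.
# The character class [.!?] is equivalent to the alternation \.|!|\?
SOFT_BREAK = re.compile(r"(?<![.!?])\r")


def join_soft_breaks(text: str) -> str:
    """Replace mid-sentence returns with a space (hypothetical helper)."""
    return SOFT_BREAK.sub(" ", text)


sample = "This line was broken\rin the middle.\rNext paragraph starts\rhere!\r"
result = join_soft_breaks(sample)
print(repr(result))
```

The returns after "middle." and "here!" are left alone, so each paragraph still ends with a single carriage return.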
Well, I can't really call myself an expert either, but maybe I'm "intermediate"? (My syntax is going to be ugly.)
I've processed a whole lot of text extracted from PDFs in the last few decades. The answer to your question, if taken literally, is quite simple:
(?<!\.|\?|\!)\r
It's what Peter suggested; a negative lookbehind. Lots of ways to do it, of course, but this is what immediately occurred to me as an answer to your question. This assumes, of course, that nowhere in the text you're cleaning up is there an instance of sentence-ending punctuation followed by a space, or a close quote, or a close parenthesis, or anything else.
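To make that caveat concrete, here is a rough Python sketch. The names STRICT and TOLERANT and the sample text are hypothetical, and the second pattern is just one possible way to also protect returns that follow punctuation plus one trailing space, close quote, or close parenthesis. It is written for Python's re module, which needs fixed-width lookbehinds, and I have not tested it in InDesign.

```python
import re

# The literal answer: only the single character before \r is checked.
STRICT = re.compile(r"(?<![.!?])\r")

# Assumed extension: additionally refuse to match when the two characters
# before \r are sentence punctuation followed by one space, close quote,
# or close parenthesis.
TOLERANT = re.compile(r'(?<![.!?])(?<![.!?][ "”’)])\r')

sample = 'He said "stop."\rNew para\rcontinues. \rThird'

# STRICT joins everything, because no \r here directly follows . ! or ?
strict_result = STRICT.sub(" ", sample)

# TOLERANT keeps the returns after ." and after the period-plus-space.
tolerant_result = TOLERANT.sub(" ", sample)

print(repr(strict_result))
print(repr(tolerant_result))
```

With the strict pattern the paragraph breaks after the close quote and after the trailing space are lost; the tolerant one keeps both and only joins the break inside "New para continues."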
Find: (?<![.!?])(\h*\r)+
Replace by: a normal space
(^/) The Jedi
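A quick Python sketch of what that Find/Replace pair does, in case it helps to see it run. Python's re has no \h, so [ \t] stands in for "horizontal whitespace" here (an approximation, InDesign's \h covers more characters), and the function name clean_breaks is made up.

```python
import re

# (?<![.!?])   not directly preceded by sentence-ending punctuation
# (?:[ \t]*\r)+ one or more returns, each with optional spaces/tabs
#               before it ([ \t] approximates InDesign's \h)
BREAK_RUN = re.compile(r"(?<![.!?])(?:[ \t]*\r)+")


def clean_breaks(text: str) -> str:
    """Collapse each matched run into one normal space (hypothetical helper)."""
    return BREAK_RUN.sub(" ", text)


sample = "broken line  \r\rcontinues here.\rNext paragraph"
result = clean_breaks(sample)
print(repr(result))
```

The trailing spaces and the doubled return after "broken line" become a single space, while the return after "here." is kept. One thing I noticed in this Python version: a return preceded by punctuation plus a trailing space (". " then the return) is still matched, because the lookbehind then sees the space rather than the period.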