
How to remove hard returns from the end of each line only if not preceded by a period, etc.

New Here, Sep 26, 2023

I have the usual problem. I am trying to remove carriage returns at the end of each line, but not if the carriage return is preceded by a period, exclamation point, or question mark.

Just doing a find and replace with \r in GREP removes all carriage returns, so it melds all the paragraphs together along with the intra-paragraph carriage returns. I would like to keep the paragraphs, but separate them with one carriage return.

How do I adjust the grep? Thanks.

TOPICS
EPUB , Publish online

Community Expert, Sep 26, 2023

I think positive lookahead would be the answer:

https://carijansen.com/positive-lookahead-grep-for-designers/

but you need to wait for experts to confirm.

Community Expert, Sep 26, 2023

Actually, I think negative lookbehind, but I'm not one of the "experts".

Community Expert, Sep 26, 2023

(?<!\.|!|\?)\r

This will find a return only if not preceded by a period, exclamation mark, or question mark.
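
For anyone who wants to try the expression outside InDesign, here is a minimal Python sketch of the same negative lookbehind. The sample text is made up, and it assumes the hard return is the \r character (as in InDesign's GREP); a plain-text file may use \n instead.

```python
import re

# Made-up sample: \r stands for the hard return at the end of each line.
text = (
    "This line was broken\r"
    "in the middle.\r"
    "New paragraph starts here\r"
    "and ends here!\r"
)

# Negative lookbehind: match \r only when the character just before it
# is NOT a period, exclamation mark, or question mark.
pattern = re.compile(r"(?<!\.|!|\?)\r")

# Replace the unwanted returns with a space so the words do not run together.
joined = pattern.sub(" ", text)
print(repr(joined))
```

The returns after "middle." and "here!" survive, so the two paragraphs stay separate while the mid-sentence breaks are joined.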

 

Community Expert, Sep 26, 2023

Well, I can't really call myself an expert either, but maybe I'm "intermediate"? (My syntax is going to be ugly.)

I've processed a whole lot of text extracted from PDFs in the last few decades. The answer to your question, if taken literally, is quite simple:

(?<!\.|\?|\!)\r

It's what Peter suggested; a negative lookbehind. Lots of ways to do it, of course, but this is what immediately occurred to me as an answer to your question. This assumes, of course, that nowhere in the text you're cleaning up is there an instance of sentence-ending punctuation followed by a space, or a close quote, or a close parenthesis, or anything else. 
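
To make that caveat concrete, here is a rough Python sketch (the sample strings are invented for illustration) that runs the same expression over a few paragraph endings. Whenever something other than the punctuation mark itself sits directly before the return, the lookbehind no longer protects the paragraph break.

```python
import re

# Same negative lookbehind as above, tried in Python's re module.
pattern = re.compile(r"(?<!\.|\?|\!)\r")

# Invented samples: each one is a paragraph ending followed by a new paragraph.
samples = {
    "clean ending": "End of paragraph.\rNext paragraph.",
    "trailing space": "End of paragraph. \rNext paragraph.",
    "close quote": 'She asked, "Why?"\rNext paragraph.',
    "close parenthesis": "(See the appendix.)\rNext paragraph.",
}

results = {name: pattern.sub(" ", s) for name, s in samples.items()}

for name, cleaned in results.items():
    status = "kept" if "\r" in cleaned else "lost"
    print(f"{name}: paragraph break {status}")
```

Only the "clean ending" sample keeps its paragraph break; in the other three the return is preceded by a space, a quote, or a parenthesis, so it is matched and replaced.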

 

 

 

Guide, Sep 27, 2023

Find:  (?<![.!?])(\h*\r)+

Replace by: a normal space
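
A small Python sketch of this find/replace, for testing outside InDesign. One substitution to be aware of: Python's re module has no \h, so [ \t] is used here as an approximation of "horizontal whitespace"; the sample text is also made up.

```python
import re

# Made-up sample with trailing spaces, a doubled return, and a trailing tab.
text = (
    "The quick brown  \r"
    "fox jumps over\r\r"
    "the lazy dog.\r"
    "Second paragraph\t\r"
    "continues here?\r"
)

# (?<![.!?])   the match must not start right after . ! or ?
# ([ \t]*\r)+  one or more returns, each with optional spaces/tabs before it
#              ([ \t] stands in for InDesign's \h, which Python lacks)
pattern = re.compile(r"(?<![.!?])([ \t]*\r)+")

# Each whole run of whitespace-plus-returns collapses to one normal space.
cleaned = pattern.sub(" ", text)
print(repr(cleaned))
```

Compared with the simpler expression, this version also swallows stray spaces or tabs before a mid-sentence return and collapses consecutive returns, so the joined text does not end up with double spaces.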

 

(^/)  The Jedi
