I'm cleaning up some text where the space after punctuation was often omitted. Like this:
There should be a space after a comma,or a semicolon,for example;it's often missing (but not after opening parentheses or apostrophes).
The problem is that a GREP search using the POSIX class, [[:punct:]](?=\w), finds all punctuation. Is there a way to exclude ( and ' from the [[:punct:]] class?
Hi @defaultu0e43kqloi6n, well here's a straightforward (warning: probably naive!) approach:
([!%&*+,-.:;?)}\]])(\S)
Edit: I removed some unnecessary escaping inside the [ ], thanks to @FRIdNGE's help.
These are the characters that seem to me to be the problem in ordinary Latin text. Because I excluded the apostrophe, it won't pick up trailing single quotes with no space, e.g. this won't be caught: ‘The quick brown fox’The next sentence.
Be careful when copying and pasting the GREP string above: InDesign or your OS can sometimes remove escape characters.
Change to:
$1 $2
This adds a space between the punctuation and the non-space character.
If you want to change the list of punctuation, just add or remove from between the first [ and the last ].
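For anyone who wants to try the find/change outside InDesign, here is a minimal sketch in Python. The function name add_missing_spaces is made up for illustration, and Python's re module is not InDesign's GREP engine, so treat it as an approximation. It also shows why the approach is naive: any non-space character after the punctuation triggers a match, including another punctuation mark or a digit.

```python
import re

# Character class copied from the post; (\S) is any non-space character.
PATTERN = re.compile(r"([!%&*+,-.:;?)}\]])(\S)")


def add_missing_spaces(text: str) -> str:
    """Insert a space between listed punctuation and a following non-space."""
    # Python writes the replacement as \1 \2 where InDesign uses $1 $2.
    return PATTERN.sub(r"\1 \2", text)


print(add_missing_spaces("a comma,or a semicolon,for example;it's"))
# a comma, or a semicolon, for example; it's

# Side effects of matching any non-space character:
print(add_missing_spaces("(see below)."))  # (see below) .
print(add_missing_spaces("3.14"))          # 3. 14
```

If those side effects matter, replacing (\S) with (\w) narrows the match to word characters, though it would still split decimals such as 3.14.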
- Mark
This is enough:
[!%&*+,-.:;?)}\]]
(^/) The Jedi
Ah, we don't have to escape characters inside [ ] except ] of course! Thanks!
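A quick way to check that the shorter class matches the same characters, sketched in Python rather than InDesign (the fully escaped version below is a hypothetical example, not the string from my original post):

```python
import re

# Hypothetical fully escaped class, written only for comparison.
escaped = re.compile(r"[\!\%\&\*\+\,\-\.\:\;\?\)\}\]]")
# The class as posted: only ] is escaped.
# Note that ,-. is read as a range from comma to period, which happens
# to cover exactly comma, hyphen and period (ASCII 44, 45, 46).
unescaped = re.compile(r"[!%&*+,-.:;?)}\]]")

printable = [chr(c) for c in range(32, 127)]
matched_escaped = [ch for ch in printable if escaped.fullmatch(ch)]
matched_unescaped = [ch for ch in printable if unescaped.fullmatch(ch)]

print(matched_escaped == matched_unescaped)  # True
print("".join(matched_unescaped))            # !%&)*+,-.:;?]}
```

The hyphen only works unescaped here because of where it sits in the list; moving it between two other characters would create a different range.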
Thanks, I've been using something similar; I was just fishing for whether there was something out there I didn't know about customizing the POSIX classes.
Yeah I had a look but I couldn't find anything. The POSIX character classes such as [:punct:] don't seem to be editable.
Use a negative lookahead:
(?![\x{0027}\x{0028}])[[:punct:]]
A negative lookahead succeeds if its pattern cannot match at the current position, so here it succeeds only if the next character is not an apostrophe or a left parenthesis. The [[:punct:]] class then matches that same character.
To find punctuation other than ' or ( that is followed by a word character:
((?![\x{0027}\x{0028}])[[:punct:]])(\w)
Change to:
$1 $2
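The same lookahead idea can be tried in Python as a rough sketch. Two assumptions to keep in mind: Python's re module has no [[:punct:]], so ASCII punctuation from string.punctuation stands in for it (InDesign's class likely covers more, such as curly quotes), and Python writes the code points as \x27 and \x28 instead of \x{0027} and \x{0028}. The name add_space_after is made up for illustration; it is one way to apply the lookahead to the original problem of a missing space after punctuation.

```python
import re
import string

# Stand-in for [[:punct:]]: ASCII punctuation only (an assumption).
PUNCT = "[" + re.escape(string.punctuation) + "]"

# The lookahead rejects ' and ( before the punctuation class is tried.
PUNCT_EXCEPT = re.compile(r"(?![\x27\x28])" + PUNCT)

print(PUNCT_EXCEPT.findall("a,b;c(d'e)f"))  # [',', ';', ')']


def add_space_after(text: str) -> str:
    """Add a space after punctuation (except ' and () when a word character follows."""
    pattern = r"((?![\x27\x28])" + PUNCT + r")(?=\w)"
    return re.sub(pattern, r"\1 ", text)


print(add_space_after("a comma,or a semicolon;it's (but not this)"))
# a comma, or a semicolon; it's (but not this)
```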
Cool! Thanks.