Participant
February 18, 2023
Question

how to replace all spaces after numbers with non-breaking spaces


I have a document with many numbers, some followed by a unit abbreviation (15 kWh), and some not ("the year 2022 had..."). I want to replace any space between a numeral and a unit with a non-breaking space, without changing the numeral or the abbreviation. How can I do this quickly and easily with the Find/Change dialog? I suspect there is a way to do this using GREP, but I am totally new to GREP and cannot find or figure out the code to put in the "Find what:" and "Change to:" fields in the Find/Change panel. Thank you.

-Seth


Community Expert
February 18, 2023

I generally have to do something very similar.

Here's an almost complete list of units in GREP.

 

The problem is that some people write 2 M or 2 kM or 2 Km or 2 km or anything like that,

so I added a case-insensitive flag too.

 

\d\K (?=(?i)\b(m|km|cm|mm|in|ft|yd|mi|m²|km²|cm²|mm²|in²|ft²|yd²|ac|L|mL|m³|ft³|in³|gal|qt|oz|fl oz|kg|g|mg|lb|°[CF]|K|s|min|h|day|week|month|year|m/s|km/h|mph|Pa|kPa|atm|psi|J|kJ|cal|kcal|Wh|kWh|W|kW|hp|A|V|ohm|Hz|kHz|MHz|GHz|bit|byte|KB|MB|GB|TB)\b)

 

as pointed out
\d is a digit

\K discards everything matched before it (here, the digit), so it works much like a positive lookbehind

followed by a space

 

(?= is the opening of the lookahead (it's closed after the \b at the end)

(?i) will ignore case

\b is a word boundary

( is the start of a group

 

then we add the units separated with | as this means OR
m|km|cm|mm|in ... etc.

we close the set with a )

\b is the closing word boundary

) to finish the look ahead
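
For anyone who wants to try the same idea outside InDesign, here is a minimal Python sketch. It is not the pattern above verbatim: Python's re module has no \K, so a one-digit lookbehind (?<=\d) stands in for \d\K, and (?i:...) scopes the case-insensitivity to the unit group. The shortened UNITS list and the add_nbsp name are assumptions made up for illustration.

```python
import re

NBSP = "\u00a0"  # Unicode non-breaking space

# Hypothetical short unit list for illustration; the full list is in the GREP above.
UNITS = ["km/h", "kWh", "km", "cm", "kg", "mL", "GB", "L", "g", "m"]

# (?<=\d)   the space must follow a digit (stands in for \d\K)
# (?=...)   the space must be followed by one of the units
# (?i:...)  ignore case inside the unit group only
# \b        the unit must end at a word boundary
PATTERN = re.compile(
    r"(?<=\d) (?=(?i:" + "|".join(re.escape(u) for u in UNITS) + r")\b)"
)

def add_nbsp(text: str) -> str:
    """Replace the space between a digit and a known unit with a non-breaking space."""
    return PATTERN.sub(NBSP, text)

print(repr(add_nbsp("15 kWh in the year 2022 had")))
```

Only the space itself is matched, so the replacement is just the non-breaking space. In InDesign the equivalent would be putting the non-breaking space metacharacter (~S, as far as I know) in the Change to field.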

 

To alter the GREP, remove the units you don't want. You might want to remove year, month, week and day, although technically these are units of time.

You can add or remove units as you need them.

 

This is the sample of text I used

 

The box is 30 cm x 20 cm x 15 cm and weighs 2.5 kg. It contains 500 mL of water and 250 g of sugar. The temperature is 25°C and the pressure is 1 atm. The speed of the car is 80 km/h and the distance traveled is 120 km. The power output of the engine is 150 hp and it consumes 12 L of gasoline per 100 km. The memory card has a capacity of 32 GB and a transfer rate of 100 MB/s. The frequency of the processor is 3.5 GHz and the resistance of the circuit is 100 ohm. The duration of the movie is 2h 30min and its file size is 1.5 GB. The company has 100 employees and its revenue was 10 million dollars last year. The book has 500 pages and its ISBN is 978-3-16-148410-0. The building is 50 meters tall and has 20 floors. The recipe calls for 2 cups of flour and 1 tablespoon of salt.

 

Tablespoon is not in the GREP, so you'd need to add it.
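
Adding or removing units comes down to editing the alternation. A rough Python sketch of that, assuming a hypothetical build_pattern helper and a tiny unit list (neither is from the thread), and again using a lookbehind because Python's re module lacks \K:

```python
import re

NBSP = "\u00a0"  # Unicode non-breaking space

def build_pattern(units):
    """Build a digit-space-unit pattern from a list of unit strings (hypothetical helper)."""
    # Longest units first, so a longer unit is tried before a shorter one it starts with.
    ordered = sorted(units, key=len, reverse=True)
    alternatives = "|".join(re.escape(u) for u in ordered)
    return re.compile(r"(?<=\d) (?=(?i:" + alternatives + r")\b)")

base_units = ["g", "kg", "mL", "L"]
extended_units = base_units + ["tablespoon", "cups"]

sample = "250 g of sugar, 2 cups of flour and 1 tablespoon of salt"

# With the base list only "250 g" changes; with the extended list all three do.
before = build_pattern(base_units).sub(NBSP, sample)
after = build_pattern(extended_units).sub(NBSP, sample)

print(repr(before))
print(repr(after))
```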

m1b
Community Expert
February 18, 2023

Hey thanks for sharing your list @Eugene Tyson, I had a quick look for something similar online but didn't find it. Sample paragraph is great, too. 🙂

 

One *very* minor point of clarification in case anyone is confused: the \K symbol in grep tells the engine to discard the matched text up until that point. In effect it works the same as a lookbehind, so, as I say, a very minor point. However, I just realised that, in InDesign at least, \K works better than a lookbehind here because it handles variable-length text, which lookbehind does not. You cannot do (?<=\d+)\s, for example, because a variable number of digits is matched.

- Mark
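
The fixed-width limit on lookbehinds is not unique to InDesign. As a small illustration (Python's re module, not InDesign's engine, so treat it as an analogy only), the variable-length lookbehind is rejected when the pattern is compiled, while a one-digit lookbehind is accepted and is enough for this job because only the digit directly before the space matters:

```python
import re

NBSP = "\u00a0"  # Unicode non-breaking space

# A variable-length lookbehind is refused by Python's re at compile time.
try:
    re.compile(r"(?<=\d+)\s")
    variable_lookbehind_ok = True
except re.error:
    variable_lookbehind_ok = False

# A fixed-width lookbehind (exactly one digit) compiles and still does the job.
result = re.sub(r"(?<=\d) (?=kWh\b)", NBSP, "used 1500 kWh")

print(variable_lookbehind_ok)
print(repr(result))
```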

Community Expert
February 18, 2023

Funny enough I only recently found out how to use the \K in grep after seeing it for years.

 

One of those things ha ha

Barb Binder
Community Expert
February 18, 2023

Is it always kWh? If yes,

 

 

~Barb

~Barb at Rocky Mountain Training
m1b
Community Expert
February 18, 2023

As per @Barb Binder's answer, you can add other units to check for by putting them in a capture group of options separated by the | bar character.

(\b\d+) (?=(Wh|kWh|MW|kV|MV)\b)

I've also added \b which matches to a word boundary. Not sure if the first one is useful but the second \b means that it won't match, say, the 150 MV in 150 MVP.

- Mark
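
A quick way to check what the two \b anchors do is to run the same pattern in Python. This is only a sketch: the fix_units name and the test strings are made up, and Python's regex engine is not InDesign's, though \b and groups behave the same way for this case.

```python
import re

NBSP = "\u00a0"  # Unicode non-breaking space

# Same pattern as above: the digits are captured, the unit is only looked ahead at.
PATTERN = re.compile(r"(\b\d+) (?=(Wh|kWh|MW|kV|MV)\b)")

def fix_units(text: str) -> str:
    # The digits are part of the match, so they are put back before the new space.
    return PATTERN.sub(lambda m: m.group(1) + NBSP, text)

print(repr(fix_units("150 MV line")))     # space replaced
print(repr(fix_units("150 MVP awards")))  # unchanged: no word boundary after "MV"
print(repr(fix_units("A150 kWh")))        # unchanged: no word boundary before "150"
```

Because the digits are captured here rather than dropped with \K, the replacement has to include them. In InDesign that would mean $1 followed by the non-breaking space in the Change to field.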

Community Expert
February 18, 2023

Edit: I'm wrong here. Please ignore. See Eugene's response below for details.

 

By the way, I think [\l\u] is the same as \w. Let me know if I'm not right though; I've got a lot to learn about grep.

- Mark


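\l and \u are lower- and upper-case letters, so I'd prefer [\l\u]. (No | is needed inside a character class; there it would match a literal pipe.)

It behaves like \w, but \w also matches digits and the underscore.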
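
The difference is easy to see in a few lines of Python. One caveat: Python has no \l or \u shorthand, so [^\W\d_] is used here as a stand-in for "letters only". That substitution is my assumption and not InDesign syntax.

```python
import re

text = "kWh_2 a|b"

# \w matches letters, digits and the underscore.
word_chars = re.findall(r"\w", text)

# Letters only: word characters that are neither digits nor the underscore.
letters_only = re.findall(r"[^\W\d_]", text)

# Inside a character class, | is a literal pipe and not an OR.
with_pipe = re.findall(r"[a|b]", text)

print(word_chars)
print(letters_only)
print(with_pipe)
```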