How to troubleshoot Double Hyphenation in Finnish Compound Words Using GREP in InDesign?

Report · May 16, 2023

Finnish is a language with many hyphenated compound words. I am looking for a way to prevent double hyphenation in those by using GREP styles in paragraph styles. The solution I came up with is creating a “no break” character style and applying it to both sides of the hyphen:

[\l\u]+(?=-)

and

(?<=-)[\l\u]+

But for some reason InDesign will then not hyphenate the word at the correct location either. Even adding a discretionary hyphen does not help. What am I doing wrong?

Here’s a picture to clarify what the problem is (with magenta color added to the "no break" character style):

Report · May 16, 2023

Instead of applying No Break, applying no language should keep InDesign from arbitrarily hyphenating, but it will also keep it from spell checking those words.

It may also work to add a discretionary hyphen to the beginning of the word (without changing the language), which would preserve the spell checking, but you can't do that with a GREP style, only with Find/Change or by editing the dictionary.

Report · May 16, 2023

Applying no language works. Thank you so much, Peter!

Report · May 17, 2023

While Peter's solution will work just fine, I’m not a big fan of using something ›in the wrong way‹ to get a result. Not having set the correct language could cause other issues later on.

To my understanding this is why your original solution doesn’t work:
The hyphenation occurs between two characters, and each of those characters must have breaking activated.

If the character after the hyphen is allowed to break it works:

To get this, you could alter your GREP query like this:

[^ -]+(?=.-)|(?<=-.)[^ -]+

First, I put everything into one single query.
The first alternative, [^ -]+(?=.-) (everything before the |), matches the first word without its last character; the second, (?<=-.)[^ -]+, matches the second word without its first character.

The trick is to put a dot (= any character) in the lookahead and the lookbehind, which excludes that one character from the match. The dot in the first alternative is not strictly necessary; it makes sure that words with more than one hyphen are also formatted correctly:
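To see which characters each expression would style, here is a minimal Python sketch rather than anything run inside InDesign. It is only an approximation under stated assumptions: Python's re module has no \l or \u classes, so [^\W\d_] stands in for "any letter", the two original expressions are joined with | into one pattern, and the sample word "linja-auto" and the helper name mask are chosen purely for illustration. It says nothing about how InDesign's hyphenation engine reacts to the styled ranges.

```python
import re

# Stand-in for the two original expressions, [\l\u]+(?=-) and (?<=-)[\l\u]+.
# Assumption: [^\W\d_] ("any letter") replaces \l and \u, which Python lacks.
ORIGINAL = re.compile(r"[^\W\d_]+(?=-)|(?<=-)[^\W\d_]+")

# The revised expression from this post, unchanged.
REVISED = re.compile(r"[^ -]+(?=.-)|(?<=-.)[^ -]+")


def mask(pattern, text):
    """Replace every matched character with '#'.

    Matched characters are the ones that would receive the
    "no break" character style; the rest stay visible.
    """
    chars = list(text)
    for match in pattern.finditer(text):
        for index in range(match.start(), match.end()):
            chars[index] = "#"
    return "".join(chars)


word = "linja-auto"  # illustrative sample compound
print(mask(ORIGINAL, word))  # #####-####
print(mask(REVISED, word))   # ####a-a###
```

In the second line of output, the letter on each side of the hyphen stays outside the match, so those two characters keep their normal breaking behaviour.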

Report · May 18, 2023

This is probably some weird quirk of the Finnish hyphenation algorithm, but that doesn’t help either:

Screenshot 2023-05-18 at 12.43.24.png

Report · May 18, 2023

Your revised expression doesn't seem to work for me in English, either, and I don't know why.

While I also don't like using things "in the wrong way", I'm a pragmatist when it comes to typesetting. The use of [No Language] to prevent hyphenation is fairly well established as a workaround, and the lack of spell-checking seems a small risk -- spell check is not a substitute for proofreading.

Report · May 19, 2023

This is indeed strange. My expression does work in all hyphenation versions of German (and many other languages), but not in Finnish. It also works only in one single version of English: »English: USA Legal«.

So I agree, [No Language] is the best solution here.

My remark about using things ›in the wrong way‹ was more of a general concern. I work in a small company where people often find solutions that work ›in the wrong way‹ and might or might not break later on (often they do).

In this case I didn’t even think of spell-checking. What came to my mind was accessibility: I’m not even sure whether language information makes it into a finished PDF (or maybe ePub). But if it does, a screen reader will use it for correct pronunciation. If some words have no language assigned, this is not possible.

That said, even if your final product is indeed a PDF, I would go with the [No Language] solution.
