Accurately sorting Polytonic Greek paragraphs
For paragraph sorting, I generally use the language-aware paragraph sorter that @Peter Kahrel made many years ago, which mostly works well.
For Polytonic Greek, however, it fails. There is no built-in sort order for Polytonic Greek, so I’ve added my own:
0123456789 Α[ΆἈἉἊἋἌἍἎἏᾈᾉᾊᾋᾌᾍᾎᾏ]ΒΓΔΕ[ΈἘἙἚἛἜἝ]ΖΗ[ΉἨἩἪἫἬἭἮἯᾘᾙᾚᾛᾜᾝᾞᾟῊΉῌ]ΘΙ[ΊΪἸἹἺἻἼἽἾἿῘῙῚΊ]ΚΛΜΝΞΟ[ΌὈὉὊὋὌὍῸΌ]ΠΡΣΤ[ΥὙὛὝὟῨῩῪΎ]ΎΦΧΨΩ[ΏὨὩὪὫὬὭὮὯᾨᾩᾪᾫᾬᾭᾮᾯῺΏῼ]ABCDEFGHIJKLMNOPQRSTUVWXYZÆØÅ
Note the plethora of accented vowel variants in square brackets after each base vowel. As far as I understand the instructions on the page linked to above, this is how I should indicate that diacritics should be ignored.
Unfortunately, it doesn’t work: all accented vowels just end up at the end of the list, in seemingly random order. If I remove the brackets, it partly works, in that accented vowels are then at least ordered with the base vowel instead of at the end of the alphabet. But each accented vowel is still treated as a separate letter, so a sequence that should sort as ἑβή - ἔξα - εὐνοῖα - ἐώρων ends up as εὐνοῖα - ἐώρων - ἑβή - ἔξα.
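To make the expected behaviour concrete, here is a minimal Python sketch of the ordering I'm after. It runs outside InDesign and has nothing to do with how the script works internally, so it is only an illustration, not a fix. It assumes that ignoring diacritics means decomposing each character (Unicode NFD) and dropping the combining marks before comparing; the function name greek_sort_key is just something I made up for this example.

```python
import unicodedata


def greek_sort_key(text):
    """Build a sort key that ignores accents, breathings and iota subscripts.

    Hypothetical helper for illustration only; not part of any InDesign script.
    """
    # NFD splits e.g. an accented epsilon into the base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Combining marks have Unicode category "Mn"; dropping them leaves base letters.
    base = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    # casefold() ignores case and maps final sigma to ordinary sigma.
    # The original text is kept as a tiebreaker so the result is deterministic.
    return (base.casefold(), text)


words = ["εὐνοῖα", "ἐώρων", "ἑβή", "ἔξα"]
print(sorted(words, key=greek_sort_key))
```

With the four words from my example, this gives ἑβή - ἔξα - εὐνοῖα - ἐώρων, which is the order I want. Note that it compares the stripped letters by code point, so Latin text would sort before Greek, unlike in my custom alphabet above; a proper solution would presumably need real collation rules.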
Am I doing something wrong here, or does the script just not work properly for Polytonic Greek? And if it doesn’t, is there a better way to sort Polytonic Greek paragraphs while maintaining formatting?
