June 7, 2025
Answered

Question about Unicode input behavior in Find/Change dialog

  • June 7, 2025
  • 2 replies
  • 261 views

I'm reading Peter Kahrel's "GREP in InDesign 3rd Edition" and have a question about Unicode character input behavior.

 

According to the book (page 7), when you enter <014B> in the Find What field, it should convert to the actual character, but if your screen font doesn't support it, it displays as an empty square □.

 

However, when I test this with codes like <20000> or <2F800> on Windows, they remain as code format (e.g., <020000>) rather than converting to characters or showing empty squares.

 

Has anyone experienced the "empty square" situation described in the book? I'm wondering if this behavior has changed in recent InDesign versions, or if I'm missing something.

 

Any insights would be appreciated! Thanks.

Correct answer Joel Cherney

"However, when I test this with codes like <20000> or <2F800> on Windows, they remain as code format (e.g., <020000>) rather than converting to characters or showing empty squares."

 

That's... not what I experience when I try it! It's weird enough that I screencapped a GIF of it.

 

1) Pasting <020000> into the Text tab of the Find/Change (F&R) dialog

2) Switching to the GREP tab and pasting <020000> in

3) Going back to the Text tab

4) Going back to the GREP tab

... what?

 

When I read your message, before I started experimenting, I expected the problem to be that you were entering the code point directly (UTF-32 style) while InDesign would only accept UTF-16 code units. So I looked up the character at U+20000, worked out its UTF-16 encoding (the surrogate pair D840 DC00), and pasted that in.
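For anyone who wants to do the same lookup without a chart, here is a minimal Python sketch of the standard UTF-16 surrogate-pair arithmetic. The function names are made up for illustration, and the <XXXX> output just mimics the code notation used in this thread; it says nothing about what InDesign does internally.

```python
def to_utf16_units(code_point):
    """Return the UTF-16 code units (one or two ints) for a code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogates are not characters on their own")
    if code_point <= 0xFFFF:
        # Basic Multilingual Plane: one 16-bit unit, same value.
        return [code_point]
    # Supplementary planes: split the 20-bit offset into two 10-bit halves.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # high (lead) surrogate
    low = 0xDC00 + (offset & 0x3FF)   # low (trail) surrogate
    return [high, low]


def as_angle_codes(code_point):
    """Format the UTF-16 units in the <XXXX> notation used in this thread."""
    return "".join("<{:04X}>".format(unit) for unit in to_utf16_units(code_point))


if __name__ == "__main__":
    for cp in (0x014B, 0x20000, 0x2F800):
        print("U+{:04X} -> {}".format(cp, as_angle_codes(cp)))
```

U+014B sits in the Basic Multilingual Plane, so it stays a single unit; U+20000 and U+2F800 are above U+FFFF, so each becomes a pair.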

 

1) Pasting <D840><DC00> into the Text tab of the Find/Change dialog

2) Switching to the GREP tab and pasting <D840><DC00> in

    2b) being bewildered

3) Going back to the Text tab

4) Going back to the GREP tab

    4b) being bewildered again

 

 

Special bonus weirdness: I typed my bewildered complaint, "noooo I have <D840><DC00> on the clipboard", switched away to the Text tab, then switched back to the GREP tab, where everything after the <D840><DC00> had been removed from the Change To field.

 

I submit to you a possible explanation: this feature is half-baked. I suspect that very little testing has been done around characters above U+FFFF, the ones that need UTF-32 or a UTF-16 surrogate pair.

 

Only very rarely have I seen a square you've-dropped-a-glyph box ("tofu") in InDesign's interface, and even then only when I'm working in a writing system where the font designer has crowded the Private Use Area with custom glyphs. (Lots of font developers working with SE Asian languages have used this hack over the years.)
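If you want to check whether a piece of text contains Private Use Area characters (the ones most likely to show up as tofu in a UI font), a rough Python sketch follows. The three ranges are the PUA blocks defined by Unicode; the helper names are hypothetical, and this is only a way of spotting candidates, not a test of what any particular font can render.

```python
# Private Use ranges defined by Unicode: the BMP block plus planes 15 and 16.
PUA_RANGES = [
    (0xE000, 0xF8FF),
    (0xF0000, 0xFFFFD),
    (0x100000, 0x10FFFD),
]


def is_private_use(code_point):
    """True if the code point falls in one of the Private Use ranges."""
    return any(start <= code_point <= end for start, end in PUA_RANGES)


def private_use_chars(text):
    """List the Private Use code points in text, in U+XXXX form."""
    return ["U+{:04X}".format(ord(ch)) for ch in text if is_private_use(ord(ch))]


if __name__ == "__main__":
    sample = "a\uE001b\U000F0000"
    print(private_use_chars(sample))
```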

2 replies

Peter Kahrel
Community Expert
June 7, 2025

I see the same weird behaviour that Joel described (in ID 2024), and agree that it looks half-baked.

 

But it does work, in that you can use it: when you enter <020000> in the Find What field, it does find the character in the document.
