I'm an experienced InDesign & GREP user, and experienced in Bengali typesetting, but I've encountered an issue I can't solve. I have some Unicode Bengali text (converted from ANSI encoding) containing invisible characters that prevent the characters from forming correctly. When I cut & paste the problem word into Find/Change, the invisible character is rendered as ~I in GREP (or ^I in 'Text'). But when I search for either of these codes, or even for the text I just copied, nothing is found. When I copy the offending text into another application and then cut & paste it back, the offending characters are removed. My questions are:
1. What are these invisible problematic ~I/^I characters?
2. How can I remove them with InDesign's Find/Change?
I've included a screenshot here and attached a sample file.
The character is a strange beast. It's used for the index marker, text anchor, and a few other things. When you select it and open the Info panel you'll see that its Unicode value is FEFF. In the Text tab you can find it by searching for <FEFF>. Unfortunately it's not possible to find it in the GREP tab. \x{FEFF} won't find it, for example.
In the Story Editor you'll notice that it looks like a text anchor, but the symbol you see there could be used for other purposes as well; I forget which.
Thanks, Peter, that was just the reply I was hoping for! Very helpful.
To expand on Peter's answer - outside of InDesign, in the broader world of Unicode, that FEFF is a zero width no-break space. I think it is an artifact of the conversion process. It seems to me that it should be a different character, one that is actually used in Bengali Unicode all the time: 200C, the zero width non-joiner. The 200C ZWNJ does what it says on the tin - it keeps two glyphs that would ordinarily connect via ligature from either connecting or breaking at that point.
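If you want to see exactly which invisible characters are in the text before it ever reaches InDesign, something like the following Python sketch could help. It assumes you have the converted text available as a plain string (for example, read from the converter's output file); the function name `list_format_chars` is just my own, not part of any tool.

```python
import unicodedata


def list_format_chars(text):
    """Return (index, code point, Unicode name) for each invisible format character.

    Unicode general category "Cf" (format) covers U+FEFF, U+200C, U+200D
    and similar zero-width control characters.
    """
    found = []
    for i, ch in enumerate(text):
        if unicodedata.category(ch) == "Cf":
            name = unicodedata.name(ch, "<unnamed>")
            found.append((i, f"U+{ord(ch):04X}", name))
    return found


if __name__ == "__main__":
    # Hypothetical sample: Bengali KA, a stray U+FEFF, Bengali KHA, a ZWNJ.
    sample = "\u0995\ufeff\u0996\u200c"
    for index, codepoint, name in list_format_chars(sample):
        print(index, codepoint, name)
```

Running it on a suspect word shows at a glance whether the converter emitted FEFF where you would expect 200C.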
Sometimes I myself have to use converters to get old complex-script text into Unicode, but honestly I don't trust them unless I write them myself. You never know what is going to creep in. In this case, what crept in is also the code point that InDesign uses internally for a bunch of stuff. If possible, in your shoes I would re-convert and intervene to replace the (IMHO wrong) zero width no-break space with the zero width non-joiner. Alternatively, I do know that someone has at some point written a script to find it, although I don't have a bookmark to it immediately to hand.
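As a rough illustration of the "intervene" step, here is a minimal Python sketch that swaps FEFF for 200C in the converted text before it is placed in InDesign. This is not the InDesign script mentioned above, just a preprocessing idea. It assumes the text is available as a string, and it assumes that a FEFF at the very start of the text is a byte order mark that should be dropped rather than converted; `replace_zwnbsp` is a name I made up.

```python
ZWNBSP = "\ufeff"  # zero width no-break space (also used as the byte order mark)
ZWNJ = "\u200c"    # zero width non-joiner


def replace_zwnbsp(text):
    """Replace every U+FEFF inside the text with U+200C.

    Returns the cleaned text and the number of replacements made.
    Assumption: a single leading U+FEFF is a byte order mark, so it is removed.
    """
    if text.startswith(ZWNBSP):
        text = text[1:]
    count = text.count(ZWNBSP)
    return text.replace(ZWNBSP, ZWNJ), count


if __name__ == "__main__":
    # Hypothetical sample: BOM, Bengali KA, stray U+FEFF, Bengali KHA.
    cleaned, replaced = replace_zwnbsp("\ufeff\u0995\ufeff\u0996")
    print(replaced, [f"U+{ord(c):04X}" for c in cleaned])
```

Whether every FEFF really should become a ZWNJ depends on what the converter meant by it, so it is worth spot-checking a few words after the replacement.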