Unicode bug?

Report · Nov 23, 2016

Usually I have found FrameMaker's Unicode support to be pretty good, but today I wanted to use a character in the "Mathematical Alphanumeric Symbols" range (effectively just an italic serif capital I, to indicate current in an engineering expression) and it didn't appear to work...

I found the character using the "BabelMap" careware utility, Paste-specialled it into Frame, and applied Microsoft's Cambria Math font to it... and Frame still stubbornly displayed it as a "?".

This Unicode range was introduced in 2001! I'd have thought Frame would support it by now... /sighs/

Report · Nov 23, 2016

Probably not FM's fault - the font is the culprit I suspect ;>)

Report · Nov 23, 2016

Jeff, why do you say that? I have no rigorous way to test this other than to say "well, it works absolutely fine in MS Word and BabelMap"... but it doesn't work in Frame.

Report · Nov 23, 2016

Hi,

Can you give us the exact code point of the glyph? Is it "𝐼"? Is it U+1D43C ("MATHEMATICAL ITALIC CAPITAL I")?

Report · Nov 23, 2016

Stefan, it is from this Unicode block here: http://www.unicode.org/charts/PDF/U1D400.pdf

And the particular character that I was trying to use was, yes, U+1D43C "MATHEMATICAL ITALIC CAPITAL I"

I appreciate I can work around this by typing an ordinary letter I and applying a character style to make sure it uses a serif italic font. But it ought to work properly to begin with.

Report · Nov 23, 2016

Does the font in use populate that code point (U+1D43C)?

There are some wider questions here that might be worth clarifying:

Does FM support the SMP (Supplementary Multilingual Plane, not to mention SIP) or just the BMP? I'm not running a Unicode-capable version of FM, so can't test it.

In any planes supported, does FM implement fallback when a font doesn't populate a glyph at the code point desired?

I imagine that FM doesn't include a fallback font, such as SIL's. Does Windows? Does it populate 1D43C? And if not?
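
For anyone unsure about the plane question above, here is a minimal Python sketch of how a code point maps to a plane. The helper name plane_of is made up for illustration; it only shows that U+1D43C sits in plane 1 (the SMP) and says nothing about what FM itself does.

```python
import unicodedata


def plane_of(code_point: int) -> int:
    """Return the Unicode plane number: 0 = BMP, 1 = SMP, 2 = SIP, and so on."""
    # Each plane holds 0x10000 code points, so the plane is everything above the low 16 bits.
    return code_point >> 16


cp = 0x1D43C
print(unicodedata.name(chr(cp)))  # official character name from the Unicode database
print(plane_of(cp))               # plane of the math italic capital I
print(plane_of(ord("I")))         # plane of the ordinary Latin capital I
```

Anything with a plane number above 0 needs more than one UTF-16 code unit, which is where older "Unicode-capable" software tends to stumble.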

Report · Nov 23, 2016

Bob, some of those terms go a bit over my head - google found me Adobe's spec page for Frame 8's unicode capabilities, if that helps:

Adobe FrameMaker 8

Similarly exhaustive pages for more recent versions didn't come up, although maybe they're out there in Adobe's "help cloud" somewhere...

Report · Nov 23, 2016

re: some of those terms go a bit over my head

Yeah, they were mostly directed at users who have already poked around on this a bit, or Stefan, who should be able to find out easily.

re: google found me Adobe's spec page for Frame 8's unicode capabilities, if that helps:

It confirms that FM doesn't inherently exclude characters above the BMP (i.e. characters that take more than 16 bits or 2 bytes to represent).

It does vaguely suggest a possible issue with search, as FM uses UTF-16 for that, but doesn't state whether search is restricted to characters that fit in a single UTF-16 code unit (16 bits) or whether surrogate pairs (two code units, 32 bits) are acceptable. You can get a sense of the complexities here from the Wikipedia pages on Unicode and UTF-16.
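
To make the code unit point concrete, here is a small Python sketch (the function name utf16_code_units is just illustrative) that splits a character into its UTF-16 code units. It shows the encoding only, not how FM's search actually behaves.

```python
def utf16_code_units(ch: str) -> list:
    """Return the UTF-16 code units (as integers) used to encode the given text."""
    data = ch.encode("utf-16-be")  # big-endian, no byte order mark
    # Every UTF-16 code unit is exactly two bytes.
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


print([f"{u:04X}" for u in utf16_code_units("\U0001D43C")])  # ['D835', 'DC3C']
print([f"{u:04X}" for u in utf16_code_units("I")])           # ['0049']
```

So the character of interest is a surrogate pair in UTF-16, whereas the ordinary I is a single code unit. Software that assumes one code unit per character will mishandle the former.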

Frankly, because the character of interest here can be mimicked with a common italicized I from a common serif font, I'd be tempted to implement any instances of it as a Variable.

Character Format:
Tag: MathItalic
Def: all blank or As-Is except for
Family: Times New Roman
Angle: Italic

Variable:

Name: U+1D43C MATHEMATICAL ITALIC CAPITAL I
Def: <MathItalic>I

That gives future document stewards an idea of what was intended, against the day when it can reliably be done as native Unicode.

Report · Nov 23, 2016

To be honest, I'm happy enough just applying the font style using a character format... as that is still going to be necessary even if the Unicode *did* work (because that code block is only available in a few specialised fonts like MS Cambria Math).

To me, the main advantage of using the Unicode character at all (other than sheer pedantry) is that the documents in question get translated into a variety of languages, including Chinese, and having special characters like these in their correct Unicode encodings helps them stay looking correct when translated, as it emphasises to the translation house that they have a special semantic significance.

Report · Nov 23, 2016

re: I'm happy enough just applying the font style using a character format...

I'm never happy when I have to use a C Tag. They're a nuisance for stewardship.

re: ...(because that code block is only available in a few specialised fonts like MS Cambria Math).

Yeah, just double checking on the SIL fallback font, even that is BMP only. I wonder if anyone today provides a fallback font for the whole of Unicode.

re: To me, the main advantage of using the Unicode character at all (other than sheer pedantry) is that the documents in question get translated into a variety of languages, including Chinese, and having special characters like these in their correct Unicode encodings helps them stay looking correct when translated, as it emphasises to the translation house that they have a special semantic significance.

The variable approach I outlined above does that too.

Report · Nov 24, 2016

The variable approach is a good cue to the translators, yes, but having the correct Unicode character in the resulting PDF (probably) makes the document more accessible, as screen readers and the like can make better choices about how to render the character. Well, in principle anyway (in practice they may not support that Unicode character either, or may not do anything intelligent with it!).
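
As a rough illustration of what "something intelligent" could look like, the Unicode database records U+1D43C as a font variant of the plain letter I, so software can fall back to it via compatibility normalization. This Python sketch only shows that the data is there; whether any given screen reader or PDF tool uses it is an assumption I can't verify.

```python
import unicodedata

math_i = "\U0001D43C"  # MATHEMATICAL ITALIC CAPITAL I

# The compatibility decomposition is tagged <font>: same letter, different styling.
print(unicodedata.decomposition(math_i))  # '<font> 0049'

# NFKC normalization folds the styled letter back to the ordinary Latin I.
plain = unicodedata.normalize("NFKC", math_i)
print(plain == "I")  # True
```

The flip side is that normalizing discards the "this is a math symbol" distinction, which is exactly the semantic cue the dedicated code point was meant to carry.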