Participant
June 8, 2011
Question

UTF-8 text and Type1 Fonts

In my application I receive UTF-8 encoded text strings.  The text includes Western European, Eastern European, Cyrillic, and Greek characters.  CJK support is not needed.  I then need to display it using a Type 1 font that contains all of the needed glyphs (580+ glyphs).

I would like to be able to use the show, stringwidth, ... operators with the UTF-8 encoded text.

Is there a way to build a UTF-8 compatible font using a Type 1 font as its base font? With composefont, or by some other method?

I would greatly appreciate it if you would point me to some concrete examples of how to do this.  I am pretty new to UTF-8 and the CIDFont/CMap/composefont stuff.

Thanks for your help,

  Lee

    MiguelSousa
    Community Manager
    June 8, 2011

    Hi Lee,

    You can only encode 256 glyphs at a time in a Type 1 font (the font can contain more glyphs, but AFAIK they won't be encoded), so your options are mostly limited to using a TrueType font or an OpenType font.

    You can use the Adobe Font Development Kit for OpenType (a.k.a. FDK) to make OpenType fonts using Type 1 fonts as source data. There's a link to an example font at the bottom of the page.

    LDHills63Author
    Participant
    June 8, 2011

    Hi Miguel,

    I will back up a little more in my development process...

    A font vendor provided us with custom OpenType PS fonts containing 580+ characters, which cover every language we need.  I then opened the fonts in FontLab 5 and saved them as PFA (Type 1 ASCII format) for embedding on our device.

    We can reference these fonts on our device using Adobe Illustrator EPS files, including all of the Western European, Eastern European, Cyrillic, and Greek glyphs.  There are no fonts embedded in the EPS files.  It appears that the EPS files do a lot of re-encoding of the font.

    Yes, a Type 1 font is limited to 256 entries in its /Encoding vector.

    I am wondering if there is a way to build a CIDFont using some sort of UTF-8 CMap and our existing embedded fonts so that our UTF-8 text strings (from sources other than EPS files) can be displayed.  It would be great to form a font XXXX-UTF8 that could be used for (my utf-8 string) show/stringwidth.

    Are there other options?  What about putting the font on the printer device in %hostfonts% in OpenType font format?  Could that be used by the EPS files and somehow by the UTF-8 strings?

    Thanks,

      Lee

    June 8, 2011

    Lee,

    Is there a specific reason why you cannot embed the original OpenType fonts on your device? That would be the best route, because everything that you need should be packaged into the font file.

    UTF-8 is an ideal way to encode Unicode text for the Web and some other uses, because ASCII is a pure subset. But there are no OpenType 'cmap' table formats that support it. Instead, there are OpenType 'cmap' formats for BMP-only UTF-16 (Format 4) and UTF-32 (Format 12), and most OpenType fonts include one or both of these 'cmap' table formats. To access the glyphs, you simply transcode UTF-8 to UTF-16 or UTF-32, and use the appropriate OpenType 'cmap' subtable. The conversion is purely algorithmic, and thus fast.
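
    To illustrate how mechanical that transcoding is, here is a minimal sketch in Python (chosen only for readability; it is not what would run on the device). The function names utf8_to_code_points and code_points_to_utf16 are made up for this example. The first returns the code points (the UTF-32 values), and the second turns them into UTF-16 code units.

```python
def utf8_to_code_points(data):
    """Decode UTF-8 bytes into a list of Unicode code points (UTF-32 values)."""
    code_points = []
    i = 0
    while i < len(data):
        lead = data[i]
        # The lead byte tells us how many continuation bytes follow.
        if lead < 0x80:                    # 0xxxxxxx: ASCII, one byte
            cp, extra, minimum = lead, 0, 0x0
        elif (lead & 0xE0) == 0xC0:        # 110xxxxx: two-byte sequence
            cp, extra, minimum = lead & 0x1F, 1, 0x80
        elif (lead & 0xF0) == 0xE0:        # 1110xxxx: three-byte sequence
            cp, extra, minimum = lead & 0x0F, 2, 0x800
        elif (lead & 0xF8) == 0xF0:        # 11110xxx: four-byte sequence
            cp, extra, minimum = lead & 0x07, 3, 0x10000
        else:
            raise ValueError("invalid lead byte at offset %d" % i)
        if i + extra >= len(data):
            raise ValueError("truncated sequence at offset %d" % i)
        # Each continuation byte (10xxxxxx) contributes six more bits.
        for b in data[i + 1:i + 1 + extra]:
            if (b & 0xC0) != 0x80:
                raise ValueError("invalid continuation byte after offset %d" % i)
            cp = (cp << 6) | (b & 0x3F)
        # Reject overlong forms, surrogates, and values beyond U+10FFFF.
        if cp < minimum or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            raise ValueError("invalid code point at offset %d" % i)
        code_points.append(cp)
        i += 1 + extra
    return code_points


def code_points_to_utf16(code_points):
    """Convert code points to UTF-16 code units, using surrogate pairs above the BMP."""
    units = []
    for cp in code_points:
        if cp < 0x10000:
            units.append(cp)               # BMP: one 16-bit unit
        else:
            cp -= 0x10000
            units.append(0xD800 | (cp >> 10))    # high surrogate
            units.append(0xDC00 | (cp & 0x3FF))  # low surrogate
    return units
```

    For Western European, Eastern European, Cyrillic, and Greek text every character is in the BMP, so each code point is also a single UTF-16 unit and either 'cmap' format would do for the lookup.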

    Building a CID-keyed font and using a UTF-8 CMap resource is not the right solution.