UTF-8 encoding and "show"

Guest
Apr 13, 2014

Hi,

How can I show UTF-8 encoded characters? I'd like to show Polish characters.

I know there is something called CID fonts, but I don't know what they are about.

Please help...

TOPICS
Programming

LEGEND
Apr 13, 2014

There is no way to use UTF-8.

There is no escape from getting a full understanding of font encoding, generating a font with a suitable encoding, and generating a show string suitably encoded.

Certain Latin2 symbols are found in the default fonts; otherwise you need to obtain and embed a suitable font.

Guest
Apr 13, 2014

Test Screen Name wrote:

There is no escape from getting a full understanding of font encoding, generating a font with a suitable encoding, and generating a show string suitably encoded.

So... How may one do that? Where may one find suitable documentation?

Aren't (La)TeX files converted to PS before printing? If so, it should be possible to print UTF-8 characters.

LEGEND
Apr 14, 2014

Where may one find suitable documentation?

PostScript Language Reference Manual, a constant friend to those writing PostScript.

Aren't (La)TeX files converted to PS before printing? If so, it should be possible to print UTF-8 characters.

Your point is perfectly true, but based on a misunderstanding.

PostScript was written with an understanding of the needs of typesetting in the many languages of the world, certainly including Polish. But PostScript was written long before Unicode was invented. So, while you can use the symbols needed to typeset the Polish language you cannot simply use the Unicode representation of the Polish language. Entirely different.
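
To illustrate the point (this sketch is not from the thread): show treats each byte of a string as a separate code that indexes the current font's Encoding array. The UTF-8 form of Ą (U+0104) is the two bytes 0xC4 0x84, so show would look up two unrelated glyph names instead of one Ą. The name utf8Aogonek is made up for this example, and the names printed by the loop depend on the encoding of whatever font the interpreter returns for Helvetica.

```postscript
%!PS
% UTF-8 for U+0104 (Aogonek) is the two bytes 0xC4 0x84, written here in octal.
/utf8Aogonek (\304\204) def

% A PostScript string is just a sequence of bytes.
utf8Aogonek length =          % prints 2

% show would look up each byte separately in the font's Encoding array.
/Helvetica findfont /Encoding get
utf8Aogonek {
    1 index exch get =        % glyph name selected by this single byte
} forall
pop                           % discard the Encoding array
```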

It is a fact that you can learn how to typeset Latin1 in an afternoon of looking at samples. It is also a fact that typesetting other character sets often requires deep and long study. Long and frustrating, but rewarding and interesting. If you don't want to go through all that study, I recommend you use LaTeX. We can help, but don't expect to find the solution in a few lines of code.

LEGEND
Apr 14, 2014

(Unless your needed characters are all in the Adobe standard font character set, of course, in which case it's just a case of re-encoding).

LEGEND
Apr 14, 2014

I think you are in luck. In Appendix E of the 3rd edition, you find all the necessary characters for Polish, Ą Ć Ę Ł Ń Ó Ś Ź Ż

These are available by their Encoding name in many fonts, and with luck in the fonts included with a level 3 interpreter. However... many printers still in use are level 2. It is probably safest to embed a font as I suggested if you want general support. Or, to be simpler, use the Ł character, which is available in StandardEncoding, and composite your own character with accents. Ogonek, acute and dot accent are all standard.

Guest
Apr 14, 2014

Test Screen Name wrote:

I think you are in luck. In Appendix E of the 3rd edition, you find all the necessary characters for Polish, Ą Ć Ę Ł Ń Ó Ś Ź Ż

I've found those characters (in Appendix E of the 3rd edition). There's a footnote attached to them, however:

     These characters are present in the extended (315-character) Latin character set, but not in the original (229-character) set.

So... Do you know how I can use these characters? Could you give me a simple example?

LEGEND
Apr 14, 2014

As I have already said, you re-encode the font to use these names; this is fundamental. Really, since the encoding of a font is not known in advance, every use of a font should re-encode it. Section 5.9.1 shows an example of this with ISOLatin1Encoding. Just replace this built-in name with your own derivation, a 256-element array of names. See Section 5.3 also.
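
As a rough sketch of what "your own derivation" could look like (not taken from the manual): copy the built-in ISOLatin1Encoding into a fresh 256-element array, then overwrite a few slots with the Polish glyph names. The name PolishEncoding and the choice of codes 8#201 to 8#204 are assumptions made for this example; any codes you do not otherwise need would do.

```postscript
%!PS
% Start from a writable copy of the built-in (read-only) ISOLatin1Encoding.
/PolishEncoding ISOLatin1Encoding 256 array copy def

% Put Polish glyph names at codes chosen for this example (assumed layout).
PolishEncoding 8#201 /Aogonek put
PolishEncoding 8#202 /Cacute  put
PolishEncoding 8#203 /Eogonek put
PolishEncoding 8#204 /Lslash  put

PolishEncoding 8#201 get =      % prints Aogonek
PolishEncoding 8#101 get =      % prints A (untouched Latin1 slot)
```

The resulting array only names the glyphs. It still has to be installed in a copy of a font that actually contains those glyphs.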

LEGEND
Apr 14, 2014

Also see the PostScript Green Book, which is a good primer for PostScript programmers, as it describes good practice (like ALWAYS re-encoding fonts) rather than just syntax. Happily, it is available online http://www-cdf.fnal.gov/offline/PostScript/GREENBK.PDF

Guest
Apr 14, 2014

I tried this:

true setglobal
/f /Helvetica findfont 100 scalefont def
/enc f /Encoding get def
enc 0 /Aogonek put          % error here
f /Encoding enc put
f setfont
0 setgray
/txt 2 string def
10 600 moveto
/A glyphshow
/Aogonek glyphshow
txt 0 0 put          % put Aogonek code
txt show          % try to show Aogonek
showpage

but the error is:

Error: /invalidaccess in --put--
Operand stack:
   --nostringval--   0   Aogonek
Execution stack:
   %interp_exit   .runexec2   --nostringval--   --nostringval--   --nostringval--   2   %stopped_push   --nostringval--   --nostringval--   --nostringval--   false   1   %stopped_push   1950   1   3   %oparray_pop   1949   1   3   %oparray_pop   1933   1   3   %oparray_pop   1819   1   3   %oparray_pop   --nostringval--   %errorexec_pop   .runexec2   --nostringval--   --nostringval--   --nostringval--   2   %stopped_push   --nostringval--
Dictionary stack:
   --dict:1182/1684(ro)(G)--   --dict:0/20(G)--   --dict:81/200(L)--
Current allocation mode is global
Last OS error: No such file or directory
Current file position is 443
GPL Ghostscript 9.14: Unrecoverable error, exit code 1

LEGEND
Apr 15, 2014

Your code is going in the right direction, but it fails because you are trying to write to read-only objects. Both the Helvetica font dictionary and its Encoding array belong to the system (in a printer they exist in ROM) and cannot be modified. You have to duplicate the objects and redefine the font. It's also vital to delete FID from the duplicated font dictionary. Refer to the Green Book for a possible technique.
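
A minimal sketch of that idea, not the Green Book's own procedure: copy every entry of the font dictionary except FID into a new dictionary, give the copy its own Encoding array, modify that array, and register the result under a new name. The name Helvetica-PL and the use of code 0 are assumptions for illustration. Whether anything visible is drawn still depends on the font actually containing an Aogonek glyph.

```postscript
%!PS
/Helvetica findfont                 % the read-only original
dup length dict begin               % new writable dictionary on the dict stack
    % Copy every key/value pair except FID into the new dictionary.
    { 1 index /FID ne { def } { pop pop } ifelse } forall
    % Replace the shared read-only Encoding with a private writable copy.
    /Encoding Encoding 256 array copy def
    Encoding 0 /Aogonek put         % allowed now: this array is our copy
    currentdict
end
/Helvetica-PL exch definefont pop   % register under a new (assumed) name

/Helvetica-PL findfont 100 scalefont setfont
10 600 moveto
(\000) show                         % code 0 now selects the name Aogonek
showpage
```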

Guest
Apr 15, 2014

Thanks. I have now:

/F { %def
   findfont exch scalefont setfont
} bind def

/RE { %def
   findfont begin
   currentdict dup length dict begin
   { %forall
      1 index /FID ne {def} {pop pop} ifelse
   } forall
   /FontName exch def dup length 0 ne { %if
   /Encoding Encoding 256 array copy def
   0 exch { %forall
      dup type /nametype eq { %ifelse
         Encoding 2 index 2 index put
         pop 1 add
      }{ %else
         exch pop
      } ifelse
   } forall
   } if pop
   currentdict dup end end
   /FontName get exch definefont pop
} bind def

/myencoding [ 0 /Lslash 1 /Aogonek 2 /Cacute 3 /Eogonek 4 /ogonek 5 /dagger ] def
myencoding /myfont /Helvetica RE
100 /myfont F
10 620 moveto (\000\001\002\003\004\005) show
showpage

But only Lslash, ogonek and dagger are shown on the page. This is probably related to the footnote I posted earlier:

      These characters are present in the extended (315-character) Latin character set, but not in the original (229-character) set.

How can I use the extended Latin character set?

LEGEND
Apr 15, 2014

You cannot guarantee that the built-in fonts contain these characters. Unless you are lucky with your printer and know you will always use that printer, you must therefore:

- obtain a suitable font (preferably in type 1 format)

- convert to PFA

- include it in your PostScript

- find, reencode, use that font
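
Before going through those steps, one way to see what a given interpreter's font offers is to look in the font's CharStrings dictionary. This is only a sketch under some assumptions: the font must be of a type that has a readable CharStrings dictionary (Type 1 and Type 42 fonts normally do, Type 3 fonts do not), and findfont may silently substitute another font for Helvetica. The procedure name hasglyph is made up for this example.

```postscript
%!PS
% hasglyph: fontname glyphname -> bool
% Reports whether the font's CharStrings dictionary has an entry for the glyph.
/hasglyph {
    exch findfont /CharStrings get exch known
} bind def

[ /Lslash /Aogonek /Cacute /Eogonek /ogonek /dagger ] {
    dup 32 string cvs print ( : ) print          % print the glyph name
    /Helvetica exch hasglyph { (present) } { (missing) } ifelse =
} forall
```

A result of "missing" on the target device means re-encoding alone cannot help there, and the font has to be embedded as described above.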

Guest
Apr 15, 2014

Hmm... It all seems very complicated to me. I don't know where to find a suitable font (it must be free), etc.

Couldn't it be easier? Maybe create some composite characters?

LEGEND
Apr 15, 2014

Yes, it's certainly complicated to embed fonts.

All the accents you need are available in standard fonts, so you can combine them. Making a composite character isn't really viable, but you can set two characters, with the necessary adjustment to cause overlaying. Unfortunately, if you make a PDF, your text will not be properly extractable, but that may not be a concern.
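
A rough sketch of the overlay approach, with assumptions stated up front: overlay is a hypothetical helper, glyphshow needs a Level 2 interpreter, and the offsets in the usage lines are guesses for Helvetica at 100 points that would have to be tuned by eye for each letter, font and size.

```postscript
%!PS
/Helvetica findfont 100 scalefont setfont

% overlay: basename accentname dx dy -> -
% Shows the base glyph, draws the accent offset by (dx, dy) from the base
% glyph's origin, then restores the current point to just after the base.
/overlay {
    4 dict begin
    /dy exch def /dx exch def /accent exch def /base exch def
    currentpoint                % x0 y0: origin of the base glyph
    base glyphshow
    currentpoint 4 2 roll       % x1 y1 x0 y0
    moveto dx dy rmoveto        % back to the origin, then apply the offset
    accent glyphshow
    moveto                      % continue from x1 y1, after the base glyph
    end
} bind def

10 600 moveto
/A /ogonek 40 0 overlay         % offsets are illustrative guesses
/C /acute 19 19 overlay         % raise the accent toward cap height
showpage
```

Because the accent is a separate glyph placed by hand, the output is two characters as far as any later text extraction is concerned, which is the limitation mentioned above.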
