Inspiring
June 20, 2018
Solved

em/en dash characters

  • June 20, 2018
  • 2 replies
  • 2496 views

A vendor of ours creates documents from Word inputs, and emits PostScript.

By default, they embed their fonts into the PostScript.  But they offer the option not to do so, relying on printer-resident fonts, for better performance.

When we try this, two characters are printed as empty square boxes:  the em dash and the en dash.

The PDF we get from Distiller for that PostScript input looks the same: boxes for dashes.

Looking into the PostScript, I see that these two characters are delivered as octal escapes, \226 and \227.

They are right in the middle of text.  That is, "THIS - THAT" might appear as (THIS \226 THAT).

Is this wrong?  Is there an encoding that Distiller would interpret rightly?
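
For reference, here is a small Python sketch of what those two escapes amount to as byte values. It assumes the vendor's text is in Windows code page 1252, which is a guess based on the Word origin and not something the PostScript itself confirms. Under that assumption the two bytes are the en dash and the em dash. The helper name `octal_escape_to_byte` is made up for this illustration.

```python
# A PostScript string escape such as \226 is a byte value written in octal.
EN_ESCAPE = "226"
EM_ESCAPE = "227"


def octal_escape_to_byte(digits: str) -> int:
    """Convert the digits of a PostScript \\ddd escape to a byte value."""
    return int(digits, 8)


en_byte = octal_escape_to_byte(EN_ESCAPE)  # 150 decimal, 0x96
em_byte = octal_escape_to_byte(EM_ESCAPE)  # 151 decimal, 0x97

# Assumption: the producing application wrote Windows-1252 text.
en_char = bytes([en_byte]).decode("cp1252")
em_char = bytes([em_byte]).decode("cp1252")

print(hex(en_byte), ascii(en_char))
print(hex(em_byte), ascii(em_char))
```

So the bytes are plausible for a Windows-encoded job. Whether they print as dashes depends on the encoding vector of the font that ends up being used on the printer or in Distiller.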

This topic has been closed for replies.
Best answer by arthurg2248747

The printer-resident Arial font among the 136 base resident fonts of Adobe PostScript 3 printers doesn't even begin to match the Arial font on modern Windows systems, either in character set or in encoding. The printer-resident font dates from approximately 1997 and has 256 glyphs or fewer. The current Arial has over 4600 glyphs and Unicode encoding.

          - Dov


I know what you are thinking: will this guy never say "answered" and go away?  But bear with me for a moment.

I would embrace the "old printer" answer, but for two things.  First, we have another application which prepares PostScript input for the same printer, and gets the en/em dashes printed just fine.  Granted, this application uses a different font - Helvetica.  And I cannot test that in my Word-based application, because we don't have Helvetica on our laptops.  Something to do with royalties.

I've looked at the PostScript produced by the other app, and it is quite different from ours.  But it includes <96>, which is the hexadecimal version of \226.  Even more frustrating, Distiller gives us a PDF that also displays the right dashes!  How is that an "old printer" issue?

In case you are counting, objection 1 = same printer works with different app; objection 2 = if it's printer related, why does Distiller track that printer so closely?

Your last remarks focus on Arial and Times - perhaps Helvetica has a different history?

2 replies

Legend
June 25, 2018

What impact do you see? Does it affect the PDF, the distilling, or the PostScript generation?

Inspiring
June 25, 2018

Thanks for your interest!

If we continue telling the package to embed fonts, we get grotesque increases in print time.  This is a workflow problem, not easily addressed.  The way we use the package results in many individual PS files that are then concatenated through scripts.  So the printer sees "load fonts", "print a few pages", "load the same fonts" ... we could look at writing a post-processing program, but we really try to stay away from that sort of thing.

If we say "do not embed", then we don't see en dashes or em dashes, either on paper, or in a distilled PDF.  Instead there is an empty box, no doubt signifying that the character can't be printed.

Dov Isaacs
Legend
June 20, 2018

The concept of using “printer-resident fonts”, and the idea that using such fonts improves performance, dates back to the late 1980s and early 1990s, when communication with a PostScript printer was primarily over low-speed serial ports. Font data took too much communications bandwidth. But with the advent of high-speed IEEE 1284 parallel ports, USB 2 ports, and Ethernet support on such printers, there is really absolutely nothing to be gained in terms of printing performance by relying on printer-resident fonts.

You should also keep in mind that the printer-resident fonts, at least for Adobe PostScript printers, are for the most part very old Type 1 versions of Helvetica, Times, etc. with very limited character sets that may be encoded very differently from the fonts resident on your host computer systems. This really becomes a problem when trying to use printer-resident Helvetica and Times as substitutes for Arial and Times New Roman used with documents on a host computer. Those differences in character sets and encodings are what are causing the problems.

You are best off to configure your drivers and applications to always use the host-based fonts.

And of course, when producing PDF from Office applications, you are best off using Acrobat's PDFMakers that generate PDF directly without using any intermediate PostScript.

          - Dov

- Dov Isaacs, former Adobe Principal Scientist (April 30, 1990 - May 30, 2021)
Inspiring
June 20, 2018

Dov, I really appreciate your taking the time to reply.

Oddly enough, transmitting the files is not the issue, but actual rip time (as reported by the printer).  The overall file size may be as much an issue as the fonts specifically, but we were seeing many times more ppm when we suppressed the embedding of fonts into the stream.

The device is a Xerox 6155.

We are seeking performance relief in other ways.  One promising avenue is to divide the stream up into smaller chunks (on document boundaries, of course).  Three batches of 4000 pp. seems to do quite a lot better than one of 12000 pp.

Still, those two characters are pretty common, and I'm surprised that Distiller doesn't render them.  Indeed, I dare anyone to do any volume of Word authoring, without producing one or both of these characters.
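
The idea of dividing the stream into smaller batches on document boundaries can be sketched in Python as follows. This is only an illustration under stated assumptions: each document is represented by a hypothetical `(document_id, page_count)` pair, and the function name `batch_documents` and the 4000-page limit in the usage lines are made up for the example. It does not parse or touch PostScript.

```python
from typing import Iterable, List, Tuple

# Hypothetical representation: (document id, number of pages).
Document = Tuple[str, int]


def batch_documents(docs: Iterable[Document], max_pages: int) -> List[List[Document]]:
    """Group documents into batches without ever splitting a document.

    A new batch starts when adding the next document would push the
    current batch over max_pages. A single document larger than
    max_pages still gets a batch of its own.
    """
    batches: List[List[Document]] = []
    current: List[Document] = []
    current_pages = 0
    for doc in docs:
        _, pages = doc
        if current and current_pages + pages > max_pages:
            batches.append(current)
            current, current_pages = [], 0
        current.append(doc)
        current_pages += pages
    if current:
        batches.append(current)
    return batches


# Usage with made-up page counts.
example = [("a", 3000), ("b", 1500), ("c", 2500), ("d", 5000)]
result = batch_documents(example, max_pages=4000)
print(result)
```

Keeping whole documents together means a batch can end slightly under the limit, which seems preferable to splitting a document across two print jobs.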

Dov Isaacs
Legend
June 20, 2018

In terms of performance, are you dealing with many exceptionally short PostScript jobs of a few pages each, or with jobs of very many pages? In the former situation, I could see that perhaps your performance might suffer if every job downloaded the same font over and over. In the latter case of very long jobs in which the same font is used over and over, it should make no difference whatsoever whether the font was resident or downloaded, due to caching in the RIP. When one is talking about a batch of three 4000 page jobs versus one 12000 page job, I suspect something other than font rendering is at work here. The Xerox DocuTech 6155 is a fairly old printer. I don't know what type of engineering support Xerox can give you, but the symptoms you are describing don't fit those of Adobe PostScript printers in general. The performance problems sound more like a page pipeline issue than anything to do with PostScript interpretation itself.

There is no question that en dash and em dash characters are commonly used in any type of document. The fact is, though, that PostScript relies on the input stream's font and encoding matching those of the resident font, if a resident font is what you are actually using. Thus if, let's say, a Word document actually uses Adobe's Type 1 Helvetica font and submits a print job to an Adobe PostScript device that has that as a resident font, there should be no problem. On the other hand, if Arial was used in the original document and the job expects the printer-resident Helvetica to fully match its character set and all encodings, all bets might be off.

          - Dov

- Dov Isaacs, former Adobe Principal Scientist (April 30, 1990 - May 30, 2021)
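
The encoding mismatch described in the replies above can be illustrated with a small Python sketch. The table entries for Adobe StandardEncoding (en dash at octal 261, em dash at octal 320, nothing assigned at octal 226 or 227) follow Adobe's published encoding table. Treating the Word-based job as using Windows code page 1252 positions, and treating the resident font as still carrying StandardEncoding, are both assumptions, since the thread does not show which encoding vector either job sets up. The names `STANDARD_ENCODING`, `WIN_1252` and `glyph_for` are invented for the example, and the tables cover only the four codes of interest.

```python
# Partial code -> glyph-name tables, limited to the codes discussed here.
# Adobe StandardEncoding leaves the range 0o200-0o240 unassigned.
STANDARD_ENCODING = {
    0o226: ".notdef",
    0o227: ".notdef",
    0o261: "endash",
    0o320: "emdash",
}

# Assumption: the Word-based job uses Windows code page 1252 positions.
WIN_1252 = {
    0o226: "endash",
    0o227: "emdash",
}


def glyph_for(code: int, encoding: dict) -> str:
    """Look up the glyph name a byte selects under a given encoding.

    Raises KeyError for codes outside these deliberately partial tables.
    """
    return encoding[code]


for code in (0o226, 0o227):
    intended = glyph_for(code, WIN_1252)
    selected = glyph_for(code, STANDARD_ENCODING)
    print(f"\\{code:o}: intended {intended}, StandardEncoding gives {selected}")
```

Under those assumptions the bytes \226 and \227 select no glyph at all in the resident font, which would be consistent with the empty boxes. A job that re-encodes the font before use, or that sends the codes the resident font actually expects, would not hit this.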