em/en dash characters

Community Beginner, Jun 20, 2018

A vendor of ours creates documents from Word inputs and emits PostScript.

By default, they embed their fonts in the PostScript. But they offer the option not to do so, relying on printer-resident fonts instead, for better performance.

When we try this, two characters are printed as empty square boxes: the em dash and the en dash.

The PDF we get from Distiller for that PostScript input looks the same: boxes for dashes.

Looking into the PostScript, I see that these two characters are delivered as octal escapes, \226 and \227.

They are right in the middle of text, i.e., "THIS - THAT" might be (THIS \226 THAT).

Is this wrong?  Is there an encoding that Distiller would interpret rightly?
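
For readers checking what those two bytes mean, here is a minimal Python sketch. It assumes the strings are in Windows ANSI (code page 1252), which is a guess about how the Word-based tool encodes text and is not confirmed anywhere in this thread. The helper name `describe` is made up for illustration.

```python
import unicodedata

# The two octal escapes seen in the PostScript string, as raw byte values.
codes = {"\\226": 0o226, "\\227": 0o227}

def describe(byte_value: int) -> str:
    """Interpret one byte as Windows-1252 (an assumption) and name the character."""
    char = bytes([byte_value]).decode("cp1252")
    return unicodedata.name(char)

for escape, value in codes.items():
    print(f"{escape} = 0x{value:02X} -> {describe(value)}")
```

Under that assumption, \226 is the en dash and \227 is the em dash. Whether a particular printer or Distiller font actually has glyphs at those codes is a separate question.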

TOPICS
Programming

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
community guidelines

Correct answer: Community Beginner, Jun 21, 2018

Jun 20, 2018

The idea of using “printer-resident fonts” to improve performance dates back to the late 1980s and early 1990s, when communication with a PostScript printer was primarily over low-speed serial ports and font data took too much bandwidth. With the advent of high-speed IEEE 1284 parallel ports, USB 2 ports, and Ethernet support on such printers, there is really nothing to be gained in printing performance by relying on printer-resident fonts.

You should also keep in mind that the printer-resident fonts, at least for Adobe PostScript printers, are for the most part very old Type 1 versions of Helvetica, Times, etc., with very limited character sets that may be encoded very differently from the fonts resident on your host computer systems. This really becomes a problem when trying to use printer-resident Helvetica and Times as substitutes for the Arial and Times New Roman used in documents on a host computer. Those differences in character sets and encodings are what is causing the problems.

You are best off to configure your drivers and applications to always use the host-based fonts.

And of course, when producing PDF from Office applications, you are best off using Acrobat's PDFMakers that generate PDF directly without using any intermediate PostScript.

          - Dov

- Dov Isaacs, former Adobe Principal Scientist (April 30, 1990 - May 30, 2021)

Community Beginner, Jun 20, 2018

Dov, I really appreciate your taking the time to reply.

Oddly enough, transmitting the files is not the issue, but actual rip time (as reported by the printer).  The overall file size may be as much an issue as the fonts specifically, but we were seeing many times more ppm when we suppressed the embedding of fonts into the stream.

The device is a Xerox 6155.

We are seeking performance relief in other ways.  One promising avenue is to divide the stream up into smaller chunks (on document boundaries, of course).  Three batches of 4000 pp. seems to do quite a lot better than one of 12000 pp.
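
The batching idea can be sketched roughly as follows. This is only an illustration in Python: the function name `batch_documents`, the per-document page counts, and the page limit are all hypothetical, and it assumes the page count of each document is known before the stream is assembled.

```python
def batch_documents(page_counts, max_pages):
    """Greedily group consecutive documents into batches of at most max_pages.

    Splits happen only on document boundaries. A single document longer
    than max_pages is placed in a batch of its own rather than being cut.
    Returns a list of batches, each a list of document indexes.
    """
    batches = []
    current, current_pages = [], 0
    for index, pages in enumerate(page_counts):
        # Close the current batch if adding this document would overflow it.
        if current and current_pages + pages > max_pages:
            batches.append(current)
            current, current_pages = [], 0
        current.append(index)
        current_pages += pages
    if current:
        batches.append(current)
    return batches
```

For example, `batch_documents([3, 4, 2, 5], 7)` groups the first two documents together and the last two together, since each pair totals exactly 7 pages.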

Still, those two characters are pretty common, and I'm surprised that Distiller doesn't render them.  Indeed, I dare anyone to do any volume of Word authoring, without producing one or both of these characters.

Jun 20, 2018

In terms of performance, are you dealing with many exceptionally short PostScript jobs of a few pages each, or with jobs of very many pages? In the former situation, I could see that your performance might suffer if every job downloaded the same font over and over. In the latter case of very long jobs in which the same font is used over and over, it should make no difference whatsoever whether the font was resident or downloaded, due to caching in the RIP. When one is talking about a batch of three 4000-page jobs versus one 12000-page job, I suspect something other than font rendering is at work here. The Xerox DocuTech 6155 is a fairly old printer. I don't know what type of engineering support Xerox can give you, but the symptoms you are describing don't fit those of Adobe PostScript printers in general. The performance problems sound more like a page pipeline issue than anything to do with PostScript interpretation itself.

There is no question that en dash and em dash characters are commonly used in any type of document. The fact is, though, that when you rely on a resident font, PostScript depends on the font and encoding assumed by the input stream matching those of the resident font. Thus if, let's say, a Word document actually uses Adobe's Type 1 Helvetica font and submits a print job to an Adobe PostScript device that has that as a resident font, there should be no problem. On the other hand, if Arial was used in the original document and the job expects the printer-resident Helvetica to fully match its character set and all encodings, all bets might be off.

          - Dov

- Dov Isaacs, former Adobe Principal Scientist (April 30, 1990 - May 30, 2021)

Community Beginner, Jun 20, 2018

I think you are right in suggesting that, when we have severe performance issues, it is a big collection of small documents. In this particular workflow, I think the application doesn't know it is handling many documents in the same way, and thus the fonts are defined again and again. We concatenate them at some stage. I will look into that further.

As to the fonts, when using this "Do not embed fonts" feature, we give a font name from the PPD file that the app uses when switching fonts. For example, we might say "map Arial to ArialMT" (one of the PPD fonts). This should help the actual printer, though not the Distiller, I suppose.

Like this discussion, our team is pursuing dual tracks: why doesn't this feature work, and in what other ways can we solve the performance issue?

Jun 20, 2018

The printer-resident Arial font in the base set of 136 resident fonts on Adobe PostScript 3 printers doesn't even begin to match the Arial font on modern Windows systems, either in character set or in encoding. The printer-resident font dates from approximately 1997 and has 256 glyphs or fewer. The current Arial has over 4600 glyphs and Unicode encoding.

          - Dov

- Dov Isaacs, former Adobe Principal Scientist (April 30, 1990 - May 30, 2021)

Community Beginner, Jun 21, 2018

I know what you are thinking: will this guy never say "answered" and go away?  But bear with me for a moment.

I would embrace the "old printer" answer, but for two things. First, we have another application which prepares PostScript input for the same printer, and gets the en/em dashes printed just fine. Granted, this application uses a different font, Helvetica. And I cannot test that in my Word-based application, because we don't have Helvetica on our laptops. Something to do with royalties.

I've looked at the postscript produced by the other app, and it is quite different from ours.  But it includes <96>, which is the hexadecimal version of \226.  Even more frustrating, Distiller gives us a PDF that also displays the right dashes!  How is that an "old printer" issue?
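
To make the octal/hex equivalence concrete, here is a small Python sketch that decodes the inside of a PostScript `( ... )` literal string and re-emits it as a `< ... >` hex string. The function names are made up for illustration. It assumes ASCII input and handles only octal escapes and the common single-character escapes, not backslash-newline continuations, so treat it as a rough aid rather than a full PostScript string parser.

```python
import re

# Single-character escapes defined for PostScript literal strings.
_SIMPLE = {"n": 0x0A, "r": 0x0D, "t": 0x09, "b": 0x08, "f": 0x0C,
           "\\": 0x5C, "(": 0x28, ")": 0x29}

def literal_to_bytes(body: str) -> bytes:
    """Decode the inside of a PostScript (...) string into raw bytes."""
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.append(ord(ch))  # assumes ASCII input
            i += 1
            continue
        octal = re.match(r"[0-7]{1,3}", body[i + 1:])
        if octal:
            # \ddd: one to three octal digits give the byte value.
            out.append(int(octal.group(), 8) & 0xFF)
            i += 1 + len(octal.group())
        elif i + 1 < len(body) and body[i + 1] in _SIMPLE:
            out.append(_SIMPLE[body[i + 1]])
            i += 2
        else:
            i += 1  # unrecognized escape: drop the backslash, keep the next char
    return bytes(out)

def to_hex_string(data: bytes) -> str:
    """Render bytes as a PostScript hex string such as <96>."""
    return "<" + data.hex().upper() + ">"

print(to_hex_string(literal_to_bytes(r"\226")))
```

Both notations put the same byte in the string, so the difference between the two applications is more likely to lie in which font and encoding are in effect when that byte is shown.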

In case you are counting, objection 1 = same printer works with different app; objection 2 = if it's printer related, why does Distiller track that printer so closely.

Your last remarks focus on Arial and Times - perhaps Helvetica has a different history?

LEGEND, Jun 24, 2018

The thing is, if you don't include fonts in the PostScript, it will use fonts on your system. Now, whether you get these characters is going to depend on two things:

1. Whether the font INCLUDES these characters at all

2. Whether the font is in the right order (with these characters at the right place in the grid). This is called the font "encoding". One font may have the character at 226, and another not. A PostScript program COULD re-encode the font, but I bet this one doesn't, or if it does, case 1 applies.

The printer may have a font with one encoding and the system a different one.
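
The two conditions above can be modelled in a few lines of Python. This is a toy model, not how any interpreter is implemented: the encoding tables are tiny excerpts (dash positions as listed in Adobe's published StandardEncoding and WinAnsiEncoding tables; worth double-checking against the references), and the function name `glyphs_shown` and the glyph sets are invented for illustration.

```python
NOTDEF = ".notdef"  # the fallback glyph, often drawn as an empty box

# Partial encoding vectors: character code -> glyph name.
STANDARD_ENCODING = {0o055: "hyphen", 0o261: "endash", 0o320: "emdash"}
WIN_ANSI_ENCODING = {0o055: "hyphen", 0o226: "endash", 0o227: "emdash"}

def glyphs_shown(data: bytes, encoding: dict, glyphs_in_font: set) -> list:
    """Return the glyph name selected for each byte of a shown string."""
    shown = []
    for code in data:
        # Condition 2: the encoding must map this code to the wanted glyph.
        name = encoding.get(code, NOTDEF)
        # Condition 1: the font must actually contain that glyph.
        shown.append(name if name in glyphs_in_font else NOTDEF)
    return shown

dashes = bytes([0o226, 0o227])
full_font = {"hyphen", "endash", "emdash"}

print(glyphs_shown(dashes, STANDARD_ENCODING, full_font))   # encoding mismatch
print(glyphs_shown(dashes, WIN_ANSI_ENCODING, full_font))   # re-encoded font
print(glyphs_shown(dashes, WIN_ANSI_ENCODING, {"hyphen"}))  # glyphs missing
```

Only the middle case selects the dash glyphs. In the other two, every byte falls through to `.notdef`, which would be consistent with the boxes described in this thread.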

This is a constant risk of using system fonts, so just don't!

Community Beginner, Jun 25, 2018

OK, thanks again. Embedding is the safe way, and we will have to find ways to lessen the impact of frequent reloading of the same font. This is a result of a workflow in which individual short documents are later concatenated, so that you have a new job and font load every few pages. There isn't always a good way around this sort of thing.

LEGEND, Jun 25, 2018

What impact do you see? Does it affect the PDF, the distilling, or the PostScript generation?

Community Beginner, Jun 25, 2018

Thanks for your interest!

If we continue telling the package to embed fonts, we get grotesque increases in print time. This is a workflow problem, not easily addressed. The way we use the package results in many individual PS files that are then concatenated through scripts. So the printer sees "load fonts", "print a few pages", "load the same fonts" ... we could look at writing a post-processing program, but we really try to stay away from that sort of thing.

If we say "do not embed", then we don't see en dashes or em dashes, either on paper, or in a distilled PDF.  Instead there is an empty box, no doubt signifying that the character can't be printed.
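
Before committing to a post-processing step, it may help to measure how often the same font is actually re-sent in a concatenated stream. The Python sketch below counts font definitions by looking for DSC comments. It assumes the generator emits `%%BeginResource: font <name>` or `%%BeginFont: <name>` lines, which not every package does, and the font names in the sample are placeholders. The function name `count_font_embeds` is hypothetical.

```python
import re
from collections import Counter

# DSC comments that conventionally mark the start of an embedded font.
_FONT_COMMENT = re.compile(
    r"^%%Begin(?:Font:|Resource:\s*font)\s+(\S+)", re.MULTILINE
)

def count_font_embeds(ps_text: str) -> Counter:
    """Count how many times each font name is embedded in a PostScript stream."""
    return Counter(_FONT_COMMENT.findall(ps_text))

# Made-up fragment standing in for two concatenated short jobs.
sample = (
    "%!PS-Adobe-3.0\n"
    "%%BeginResource: font ArialMT\n"
    "%%EndResource\n"
    "%%Page: 1 1\n"
    "%!PS-Adobe-3.0\n"
    "%%BeginResource: font ArialMT\n"
    "%%EndResource\n"
    "%%BeginFont: TimesNewRomanPSMT\n"
    "%%EndFont\n"
)
print(count_font_embeds(sample))
```

When reading a real file, decoding it as Latin-1 avoids errors on any binary font data it may contain. A high count for a single font name would support the "load fonts, print a few pages, load the same fonts" picture.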
