How to unembed Arial?

Engaged ,
May 30, 2019

I've been trying to optimise the basic (text only) PDF version of my CV so I'm playing with the Optimise settings.

Audit Space shows it's the font that's taking up almost all the file size:

Screenshot 2019-05-30 at 13.11.35.png

But the font (I've only used Arial) does not show in the Unembed options:

Screenshot 2019-05-30 at 13.11.22.png

Anyone know how I can find and remove the font?

I figure everyone's got Arial on their computer anyway so removing it from the PDF should be OK.

Thanks.

Adobe Employee ,
May 30, 2019

No, it really isn't OK! 

There are many versions of Arial with different glyph complements and encodings. Substitute a version of Arial that doesn't match what you composed with and you may end up with gibberish. And no, there are mobile devices that don't have Arial - they substitute something else that may or may not match what you have.

Acrobat doesn't let you unembed a font if, based on the encoding of the embedded font and/or the characters used, there is a concern that you will end up with a useless PDF file.

Quite frankly, if we had to do it over again here at Adobe, we likely would never have allowed fonts only by reference (i.e., unembedded fonts). You have no idea the problems that are encountered with such situations that are reported by our users.

          - Dov

PS: 97K for a PDF file isn't that big, and if you are in fact using 83K of that for fonts, it probably indicates use of a very wide variety of glyphs, some symbolic, and/or styles such as italic, bold, and bold italic. No one will reject your “CV” / résumé because it is 97K bytes in length!

- Dov Isaacs, Principal Scientist, Adobe

Engaged ,
May 31, 2019

Thanks Dov, that's an interesting and valuable response. I've never had an issue where a different variety of Arial caused layout changes, text reflow or total gibberish, but if it's a concern I will be careful to avoid this in future.

You're right, 97K is totally fine. I was just tired and going "number blind", my apologies.

Thanks for the help and advice!

Community Beginner ,
Jun 22, 2020

Hello Dov,

 

I have nearly the exact same issue, except my PDF is 1.1 MB, and this document will be filled out and archived thousands upon thousands of times, meaning that gigabytes of excess storage will be needed.

PDF audit.PNG

 

Here you can see that 98% of the file is taken up by fonts (1,119 KB), and yet in the PDF Optimizer window there are no fonts available to unembed, nor are there any fonts available to embed. What is going on, and how can we reduce the file size?
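To put a number on the "gigabytes of excess storage" concern, here is a rough back-of-the-envelope sketch. The 1,119 KB figure is the one reported by Audit Space above; the archive sizes are purely hypothetical, since the post only says "thousands upon thousands", and the sketch assumes every archived copy carries the full font.

```python
FONT_KB_PER_FILE = 1_119  # font overhead reported by Audit Space in this post


def font_overhead_gb(copies: int, font_kb: int = FONT_KB_PER_FILE) -> float:
    """Storage consumed by embedded font data alone, in GB (1 GB = 1024**2 KB)."""
    return copies * font_kb / 1024**2


# Hypothetical archive sizes; the real volume is not stated in the thread.
for copies in (1_000, 10_000, 100_000):
    print(f"{copies:>7,} copies -> {font_overhead_gb(copies):7.2f} GB of font data")
```

If flattened copies end up with a subsetted or non-embedded font, the per-file figure passed as `font_kb` would be much smaller, so this is an upper bound rather than a forecast.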

 

Thank you,

Michael

Adobe Community Professional ,
Jun 22, 2020

What fonts do you use in the document?

Most Valuable Participant ,
Jun 23, 2020

Do you use fillable form fields? Their fonts are embedded and cannot be removed. If you do, what fonts do you use?

Adobe Employee ,
Jun 23, 2020

The response of Test Screen Name is 100% on the money. When you create forms fields using a specific font, the assumption is that recipients need that font for purposes of filling out the form and as such, that font is embedded completely within the PDF file such that the form can be filled out without concern as to a missing font or missing glyphs if the font is subsetted. The reason that the font doesn't appear in PDF Optimizer's choice of fonts to unembed is because forms fields are not the same as regular text in a PDF file. They are in fact PDF annotations!

 

You then might ask: Arial is a “system font”, so why embed it? The answer is two-fold: (1) You cannot assume that every system indeed has Arial, and (2) there are a good number of different versions of Arial that Microsoft, Apple, and Monotype have released over the years with differing character sets and encodings. Embedding the font you defined the form with is the only way to ensure that the form is filled out with the font you designed it with.

 

And you may also wonder why Arial is so “big”: why does it take so much space? The answer is very simple. A good number of the so-called “system fonts” shipped by Microsoft since the early 1990s have continually grown in size over the years so that they can support a wide range of languages and the glyphs those languages require. For example, the most recent version of Arial shipped with Windows 10, version 7.0 (at least prior to the Windows 10 2004 release, which is only beginning to roll out to users), contains 4503 glyph definitions and is 1,036,584 bytes in size. It claims support for the Basic Latin, Latin-1 Supplement, Latin Extended-A, Latin Extended-B, IPA Extensions, Spacing Modifier Letters, Combining Diacritical Marks, Greek & Coptic, Cyrillic, Cyrillic Supplement, Armenian, Hebrew, Arabic, Arabic Supplement, Phonetic Extensions, Phonetic Extensions Supplement, Combining Diacritical Marks Supplement, Latin Extended Additional, Greek Extended, General Punctuation, Superscripts & Subscripts, Currency Symbols, Letterlike Symbols, Number Forms, Mathematical Operators, Box Drawing, Block Elements, Geometric Shapes, Miscellaneous Symbols, Miscellaneous Mathematical Symbols-A, Miscellaneous Mathematical Symbols-B, Latin Extended-C, Cyrillic Extended-A, Supplemental Punctuation, Cyrillic Extended-B, Modifier Tone Letters, Latin Extended-D, Alphabetic Presentation Forms, Arabic Presentation Forms-A, Combining Half Marks, and Arabic Presentation Forms-B Unicode character set ranges. (Surprisingly, it doesn't support Koosbanian! 😮) The other styles of Arial, as well as other Windows “system fonts” such as Times New Roman, Courier New, and Calibri (and their various styles), are not much different.

 

Thus, when selecting a font for a forms field, your choice is for either versatility and internationalization capability with attendant large fonts or for less flexibility (supporting only Western Latin character sets for example) and much smaller embedded fonts and thus smaller PDF file sizes. Your choice.

 

- Dov Isaacs, Principal Scientist, Adobe

Community Beginner ,
Jun 24, 2020

Hello Dov,

 

This is an excellent and comprehensive answer! I do have several fields on the PDF. Thank you and Test_Screen_Name for discovering the issue.

 

Here's the thing though, the PDF that my company has will be filled out many thousands of times, but it will be filled out automatically (specifically, using WebSupergoo's ABCpdf v10) and flattened before being sent to the end user and archived.

 

Which then leads to another question: After the PDF form is filled out, is there a programmatic way to subset the embedded font to retain only the characters used in the final PDF?

 

Tangential question: our company sometimes sends out forms for our customers to fill out. You mentioned that we have a choice between internationalization capability and less flexibility (supporting only Western Latin character sets) with much smaller PDF file sizes.
We expect our forms to be filled out with only Western Latin characters; is there a way to choose to embed only the Western Latin character set for a font such as Arial?
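Before looking for a subsetting feature, it may help to check what the filled-in values actually need. The following is only a sketch under stated assumptions: it treats "Western Latin" as the Basic Latin and Latin-1 Supplement Unicode ranges (my choice, not a definition from this thread), and the field values are made up. It does not touch the PDF or the font; it only reports which characters a subset would have to cover and which values would fall outside a Western Latin subset.

```python
# Unicode ranges treated here as "Western Latin"; this choice is an assumption.
WESTERN_LATIN_RANGES = (
    (0x0020, 0x007E),  # Basic Latin (printable ASCII)
    (0x00A0, 0x00FF),  # Latin-1 Supplement
)


def is_western_latin(ch: str) -> bool:
    """True if the character's code point lies in one of the assumed ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in WESTERN_LATIN_RANGES)


def audit_field_values(values):
    """Return (needed, outside): the sorted distinct characters used in the
    values, and the subset of those that lies outside the assumed ranges."""
    needed = sorted(set("".join(values)))
    outside = [ch for ch in needed if not is_western_latin(ch)]
    return needed, outside


# Hypothetical field values, for illustration only.
values = ["Zoë Müller", "12 Rue de l'Église", "Łódź"]
needed, outside = audit_field_values(values)
print(f"{len(needed)} distinct characters needed")
print("outside the assumed Western Latin ranges:", outside)
```

The `needed` list is the kind of input a subsetting tool would want; whether ABCpdf accepts such a list is something I have not verified.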

 

Also, thanks for teaching me what Koozebanian is! Never knew that those mops had a formal language!

 

- Michael

Adobe Employee ,
Jun 25, 2020

Michael,

 

For better or worse, I have no idea what WebSupergoo's ABCpdf is, although I suspect that it is some type of PDF library.

 

To sort of answer your questions.

 

In terms of “flattening” the PDF, I assume that since PDF forms fields are technically annotations, you are taking the rendition of each form field and placing it in the pages' content streams, correct? I'll assume that is the case. At that point, conceivably a very intelligent and well-implemented PDF library could subset a font to contain only the glyphs referenced by the document. No such feature is user-accessible within Acrobat Pro, though.

 

In terms of embedding only a subset of characters that will be expected to be used in forms fields, that is conceivably possible, but is not a function supported by Acrobat.

 

PS: You really know what I was talking about in terms of “Koozebanian” (or Koosbanian)? They first appeared on the Muppet Show in the mid-1970s when I was in grad school. My best guess is that their “language” is written from the right-bottom corner of the page to the left-top corner, something not at all supported by any current layout or display products. But I've been able to use that “language” name for years without offending anyone. You are the first to catch on! 😃

 

- Dov Isaacs, Principal Scientist, Adobe

Community Beginner ,
Jun 29, 2020

Thank you, Dov, for the continued help.

 

I've been trawling through the documentation for ABCpdf, and there are indeed references to functions for both embedding fonts and subsetting them. I'll continue from there with consulting their support forums if need be.

 

It's also good to know that when we send out a PDF with fields for our users to fill in, the file size will necessarily be high, and that 1.1 MB is not unexpected for a PDF with empty form fields.

 

And yup, I initially thought that Koosbanians were the mop-like Muppets that spoke in Meeps and telephone sounds, but it turns out they're a different alien Muppet. Multiple species of Muppet aliens! What a world we live in 😊

 

Thanks again,

- Michael

Most Valuable Participant ,
Jun 30, 2020

I haven't read all of this, but I would like to mention something that might have been missed: there is never a good reason to use Arial as a form field font. All it does is make the file bigger, for zero benefit. Use Helvetica (from the top of the list). This is magic, and always works. No, it doesn't matter whether you own a Helvetica font. No, nothing will be embedded. Yes, it will work everywhere. Yes, it looks like Arial and will probably actually be Arial on Windows and Mac.
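For anyone curious what "nothing will be embedded" looks like inside the file, here is a minimal sketch in plain Python. It hand-writes a one-page PDF whose only font is a by-name reference to Helvetica, one of the base fonts that PDF viewers have traditionally supplied themselves. The function name and page layout are my own illustration; this is not how Acrobat builds form fields, and it relies on the viewer to provide a Helvetica-compatible face.

```python
def build_pdf(text: str = "Hello") -> bytes:
    """Build a one-page PDF that references Helvetica by name, with no font
    program embedded. `text` must be ASCII without parentheses or backslashes."""
    content = f"BT /F1 12 Tf 72 720 Td ({text}) Tj ET".encode("ascii")
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Resources << /Font << /F1 4 0 R >> >> /Contents 5 0 R >>",
        # No FontDescriptor and no FontFile entry: the font is only named.
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
        b"<< /Length %d >>\nstream\n" % len(content) + content + b"\nendstream",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of each object, for the xref table
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # mandatory free entry for object 0
    for off in offsets:
        out += b"%010d 00000 n \n" % off  # each xref entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


pdf = build_pdf("Helvetica by reference")
print(len(pdf), "bytes")
```

The whole file comes to a few hundred bytes because the font is described by a single short dictionary, compared with roughly 1 MB for the fully embedded Arial discussed above.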
