Participating Frequently
June 30, 2023
Question

Same Unicode displaying differently in PDFs created from the same file


Hi, 

I've run into an issue where two PDFs display characters with the same Unicode value differently. Both PDFs were created from the same InDesign file: it was exported to PDF, a minor styling change was made, and it was exported to PDF again. The change did not affect the characters that now display differently, so I am confused about what caused this.

I'm not sure if this is an important detail, but I noticed that the Unicode for this character in the InDesign file is U+002D and in both PDFs the Unicode is U+0171. Below are the 2 different ways that this character is being displayed.

I would really appreciate any insight into how and why this happened and how I can avoid it in the future!


Bevi Chagnon - PubCom.com
Legend
July 1, 2023

Hi Charlotte @Charlotte30801071coj1,

This is a really baffling problem and I'm concerned that characters you inserted into an INDD layout are being changed by something. That shouldn't ever happen.

 

Maybe together we can drill down to figure out what caused the switch.

  • U+002D is indeed the standard hyphen (its official Unicode name is HYPHEN-MINUS). On most QWERTY keyboards, it's the key to the right of the number zero.
  • U+0171 is a lowercase u with a double acute accent. It looks like this: https://unicodeplus.com/U+0171
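
For anyone who wants to check those two code points without a lookup site, here is a minimal sketch using Python's standard `unicodedata` module. It is not tied to InDesign or to the PDFs in question; it only reports the official Unicode character names, and the helper name `describe` is made up for this example.

```python
import unicodedata

def describe(code_point):
    """Return a 'U+XXXX NAME' label for a code point (hypothetical helper)."""
    return f"U+{code_point:04X} {unicodedata.name(chr(code_point))}"

# The two code points discussed in this thread.
for code_point in (0x002D, 0x0171):
    print(describe(code_point))
```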

 

I'm concerned that what you showed doesn't look anything like a lowercase u with double acute accents. In fact, I don't recognize your version at all, but I certainly haven't memorized all of the more than 100,000 Unicode characters, either.

 

The difference could be that your font doesn't have the glyph you want and swapped it during the export to PDF. Usually we see weird spacing, dots, white blocks (called tofu because they look like a block of tofu), or even smiley face characters.

 

1. What font are you using? Can you confirm that it is OpenType/Unicode? Also give us the manufacturer.

2. What operating system are you on?

3. What version of InDesign are you using? We need the build number, such as version 2023 18.4.

4. What is the real glyph that you want in the file?

5. And how did you insert it into your INDD layout?

 

Your answers will help us diagnose what's causing the problem.

--Bevi

 

|    Bevi Chagnon   |  Designer, Trainer, & Technologist for Accessible Documents ||    PubCom |    Classes & Books for Accessible InDesign, PDFs & MS Office |
Participating Frequently
July 10, 2023

Hi Bevi,

I apologize for the delay in my reply! A bit more background: a coworker put together the original documents and encountered the original error back in 2021. Personally, I have been unable to recreate this error with my current setup, so I will also tell you what I am currently using, which has not had any issues, in the hope that it helps as well.

 

- The font used is Myriad Pro, which is OpenType, was created by Adobe, and was sourced from Adobe Fonts.

- The operating system used when it was created was macOS 11. I am running macOS 13.4.

- InDesign 2021 16.1 was used when the error happened. I am using InDesign 2023 18.1.

- The glyph that needs to be displayed is the standard hyphen.

- The glyph that errored was used as a bullet point in the INDD file. The text was selected and bullet point formatting was applied. The bullet character was added from a list of symbols that could be chosen in the program. Something I noticed that might be relevant is that InDesign seems to have multiple bullet point options that use this glyph (all Unicode 002D) but with different GIDs. The one used in this file has the GID 369. Another interesting thing is that this glyph (again, same Unicode) was also used within blocks of text, in between words, and those uses of the glyph didn't end up with any error.
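
A quick arithmetic check on the numbers above, offered as an observation rather than a diagnosis: GID 369 written in hexadecimal is 171, the same digits as the U+0171 reported in the PDFs. That would be consistent with the PDF software falling back to the glyph ID when it finds no Unicode mapping for that particular hyphen glyph, but nothing in this thread confirms that this is what happened. The variable names below are made up for the example.

```python
gid = 369                      # glyph ID of the bullet hyphen, as reported above
reported_code_point = 0x0171   # code point the PDFs report for that character

# Show the glyph ID in hexadecimal and compare it with the reported code point.
print(f"GID {gid} in hexadecimal: {gid:04X}")
print("Same number as U+0171:", gid == reported_code_point)
```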

 

I hope this information answers your questions! 

 

 

Bevi Chagnon - PubCom.com
Legend
July 12, 2023

Hi Charlotte,

GID = Glyph ID.

 

In your case, different variations of the hyphen (U+002D) were used in the document you described. All were hyphens, but with different variations / GIDs. The sample below shows the hyphen variations in the Minion Pro font. All of the yellow highlighted glyphs are the same Unicode 002D, but with different GIDs.

 

 

Using alternate variations (same Unicode but different GID) is dicey because not all technologies can handle the variations. Also, not all fonts have variations — or the same variations — as other fonts.
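
As a rough illustration of why alternates can be fragile, here is a toy model in Python. It assumes the common OpenType arrangement in which a font's character map points each code point at one default glyph, while alternates are typically reached through layout features instead. The glyph IDs 16 and 370 are invented for the example (only 369 comes from this thread), and this is not how any particular font, InDesign, or PDF exporter actually stores its data.

```python
# Toy model only: GIDs 16 and 370 are hypothetical; 369 is the GID from this thread.
# The character map goes from a code point to ONE default glyph.
cmap = {0x002D: 16}  # U+002D -> default hyphen glyph

# Alternates hang off the default glyph and have no character-map entry of their own.
alternates = {16: [369, 370]}

def code_point_for_glyph(gid):
    """Reverse lookup using only the character map, as a simple consumer might."""
    for code_point, default_gid in cmap.items():
        if default_gid == gid:
            return code_point
    return None  # no direct mapping: the consumer has to guess or give up

print(code_point_for_glyph(16))   # default hyphen: mapping found
print(code_point_for_glyph(369))  # alternate hyphen: nothing to look up
```

In this model the default glyph can be traced back to U+002D, while the alternate cannot unless extra mapping information travels with the file.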

 

We avoid using glyph variations in our files because there are too many cases where they fail from one program to another, or from one publishing technology to another (like PDF, EPUB, and HTML).

 

Suggestions:

  • For the bullets you described: we recommend setting a bullet Paragraph Style that uses an en dash (U+2013), which is available in virtually all fonts and works across all technologies. It should be fail-proof for your project.
  • Those old projects appear to have used outdated methods and fonts, and also have some inaccurate construction in them. Ditch them because there are too many crazy things in them that will just continue to fail and waste your time. Rescue the text by selecting the stories and using File / Export to RTF. The resulting word processing file will retain the styles, which you can either remove or keep/revise. Your choice. If these were my files, I'd bring the RTFs into Word, strip out all formatting to clean up the code, and replace any odd characters with the correct Unicode glyph (Word has a nice glyphs palette that works like InDesign's).
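
If the rescued text ends up as plain text at some point, one way to hunt for odd characters is to list everything outside basic ASCII along with its Unicode name, then decide case by case what to replace. This is only a sketch: the function name and sample text are made up, and it deliberately reports rather than replaces, since a character like U+0171 is perfectly legitimate in Hungarian text.

```python
import unicodedata
from collections import Counter

def list_unusual_characters(text):
    """Return (code point, Unicode name, count) for each character beyond basic ASCII."""
    counts = Counter(ch for ch in text if ord(ch) > 0x7E)
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNNAMED"), n)
        for ch, n in sorted(counts.items())
    ]

# Made-up sample: two bullets that came through as U+0171, plus a normal hyphen.
sample = "\u0171 First item\n\u0171 Second item\nA normal hyphen - stays as it is."
for code, name, count in list_unusual_characters(sample):
    print(code, name, count)
```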

 

Hope this helps. Best to you!

 
