Participating Frequently
July 18, 2017
Answered

HTML/XML escaped characters

  • July 18, 2017
  • 2 replies
  • 1961 views

Is there a known way for Frame to handle HTML/XML character references ('&gt;' and that sort of escape sequence)?

This topic has been closed for replies.
Correct answer Lynne A. Price

Matt,

Thank you. I'm reading .dita files. For the most part, this works fine, but not for special characters other than the XML-standard &amp;, &gt;, and so on.

I just tried entering dashes using the Unicode form, &#2014;, and Frame replaces these with question marks.

Any ideas?

–Randy


Randy,

You referred to "character references" in the first message in this thread, but the examples you mention include:

          &ndash;

          &amp;

          &#2014;

Actually, only the last one is a character reference (the # is important). The first two are entity references. SGML and XML entity references can be used for a variety of purposes, including special characters. Five entities for special characters are built into XML. They provide an easy method of entering data characters that would otherwise be interpreted as markup. These pre-defined entities are amp, lt, gt, quot, and apos. References to any other entity can only be used in a document that includes a DTD, and then the DTD must declare the entity. The first thing to check when an entity reference such as &ndash; fails is that the entity is declared in the DTD. Note that an entity used in a DTD can be declared in an external entity, such as a separate file that itself declares other entities.
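To make the point about declarations concrete, here is a minimal sketch using Python's standard-library parser rather than FrameMaker. The sample documents and the helper name `parses` are made up for illustration. It shows a predefined entity working without a DTD, an `ndash` entity working once the internal DTD subset declares it, and the same reference being rejected when nothing declares it.

```python
import xml.etree.ElementTree as ET

# The five predefined entities (amp, lt, gt, quot, apos) need no DTD.
predefined = ET.fromstring("<p>a &amp; b &lt; c</p>")

# Hypothetical document: the internal DTD subset declares ndash itself,
# so the reference in the body can be resolved.
DECLARED = (
    '<!DOCTYPE p [ <!ENTITY ndash "&#8211;"> ]>'
    "<p>pages 1&ndash;2</p>"
)
declared = ET.fromstring(DECLARED)

# Same body, but nothing declares ndash.
UNDECLARED = "<p>pages 1&ndash;2</p>"


def parses(xml_text: str) -> bool:
    """Return True if the parser accepts the document, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(repr(predefined.text))
print(repr(declared.text))
print(parses(UNDECLARED))
```

The undeclared case fails at parse time, which matches the behaviour Randy describes: the reader stops at the &-string instead of passing it through.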

As far as &#2014; goes, I suspect that the character reference you wanted is &#x2014;. The number in the first one is the decimal number 2014. The x in the second one indicates a hexadecimal number, and hex 2014 is the Unicode number for an em dash. Thus, if you had used &#x2014;, your document would have contained an em dash. You could also have entered the character number in decimal: &#8212; would produce the same result. Since you mentioned &ndash;, though, you may want an en dash instead of an em dash. The Unicode character number for en dash is one less than that for em dash, so it can be entered with &#x2013; or &#8211;.
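The decimal versus hexadecimal arithmetic above can be checked with a short sketch. The function name `resolve_char_ref` is hypothetical, and it only handles a single numeric character reference; it is not how FrameMaker reads files.

```python
import re
import unicodedata

# Matches one numeric character reference: &#NNNN; (decimal) or &#xHHHH; (hex).
_CHAR_REF = re.compile(r"&#(x[0-9A-Fa-f]+|[0-9]+);")


def resolve_char_ref(ref: str) -> str:
    """Return the single character that a numeric character reference names."""
    match = _CHAR_REF.fullmatch(ref)
    if match is None:
        raise ValueError(f"not a numeric character reference: {ref!r}")
    body = match.group(1)
    # A leading "x" means the digits are hexadecimal; otherwise decimal.
    code_point = int(body[1:], 16) if body.startswith("x") else int(body)
    return chr(code_point)


for ref in ("&#2014;", "&#x2014;", "&#8212;", "&#x2013;", "&#8211;"):
    ch = resolve_char_ref(ref)
    # Print the reference, its code point, and the Unicode name if one exists.
    print(ref, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
```

Decimal 2014 resolves to U+07DE, which is not a dash at all, while &#x2014; and &#8212; both resolve to U+2014, and &#x2013; and &#8211; both resolve to U+2013.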

   --Lynne

2 replies

Matt-Tech Comm Tools
Community Expert
July 24, 2017

If there are Unicode characters, are you using a Unicode font?

If so, it sounds like you need to configure your structured application to allow for those characters in the XML Entities.

-Matt Sullivan, FrameMaker Course Creator, Author, Trainer, Consultant
randyc_sf (Author)
Participating Frequently
July 24, 2017

Matt,

A valid question. I'm using Calibri, which does support Unicode AFAIK.

Maybe a little more Frame configuration is required, as you suggest. How would I do that? I tried a quick search of the Frame help, but found no answers.

Thank you again,

–Randy

Matt-Tech Comm Tools
Community Expert
July 25, 2017

Check out http://www.techcommtools.com/updated-framemaker-12-structured-developer-guides/

for some info on setting up structured FrameMaker.  In Fm 2017, all the things you need are under the Structure menu, but you may need someone else to help you if you aren't used to editing the file at Structure>Application Definition>Edit Global Application Definitions.

-Matt Sullivan, FrameMaker Course Creator, Author, Trainer, Consultant
Bob_Niland
Community Expert
July 19, 2017

What version of FM?
What downstream converter (e.g. RH)?

Just W3C mark-up characters, or...
What script sets? (e.g. all Unicode above U+00FF)
What target markup specification?
Is it necessary that named Entities be generated, or will dec/hex numbered Entities suffice?

I work with XHTML 1.0 quite a bit, and the support for non-ASCII Entities there is appalling.
It does at least include &gt;
FM is not my tool of choice for generating such content.

randyc_sf (Author)
Participating Frequently
July 19, 2017

Bob, thanks.

I'm using FrameMaker 2017.
The character in question was '&ndash;'. This might not be standard XML, but all the other tools I have process it OK. (I could use Unicode, but that's less familiar/readable.) (P.S. Agreed about &gt;!)
Frame simply coughs up an error and stops reading the file when it gets to the &-string (so downstream tools are irrelevant, at this point).

I also agree that Frame might not be the best tool in this context. It's what's convenient for the time being.

Matt-Tech Comm Tools
Community Expert
July 20, 2017

Randy, are you opening structured documents into structured FrameMaker?

If so, you can configure your XML or SGML application to do whatever you like, including treating your entities in the manner you require.

-Matt Sullivan, FrameMaker Course Creator, Author, Trainer, Consultant