I have an sgm file which contains special characters in different languages.
<!DOCTYPE MANUAL PUBLIC "-//SWE-XXX//DTD XXX MANUAL-DTD 2.0//EN">
<MANUAL LANG="CS">
<TITEL>
Polish characters: óê¿ñæÑÆÊ¥Ó£
Czech characters: éìóïáøèíùòúÒÌÉÓÚÙÈØÍÁÏ aacute: á</TITEL>
</MANUAL>
Is there any way to insert a special character directly into the sgm file, so that it does not have to be interpreted by the isoents mapping rules?
isoent.rwr interprets oacute as:
entity "oacute" is fm char 0x00F3;
But I would like to use 0x00F3 directly in the fm file, so that if an additional character is needed, I won't have to update the isoent files.
I have already tested:
ó
&x00F3;
&0x00F3;
but with no luck
Is there anyone who could help me with it?
I will really appreciate your help
/Joanna
Joanna,
Your SGML editor should do that for you.
Otherwise, you are seeing one of the stumbling blocks of SGML compared to XML. With XML you can simply use Unicode characters (assuming the encoding is given as "UTF-8"). What is the character encoding stated at the top of your SGML file?
- Michael
Hi Michael. Thank you for your reply.
There is no declaration within the sgm file itself, but when opening the file I use an SGML application definition with the following settings:
Default API client: FmTranslator
SGML character encoding: ISO Latin1
XML character encoding: UTF-8
Namespace: Enable
CSS2 Preferences:
Generate CSS2: Disable
Add Fm CSS Attribute To XML: Disable
Retain Stylesheet Information: Disable
Entity locations
Entity search paths
C:\Program Files\Adobe\FrameMaker9\Structure\sgml\isoents
So as you can see, the character encoding is set to ISO Latin1 (there is no way to use UTF-8 encoding in SGML files).
When I type ź or ć in the sgm document and open it with the FrameMaker SGML application, I get ¿æ and the message: "Non-SGML character found; should have been character reference"
Everything works fine when I type, for example, &x016B; and add the corresponding reference lines to the isolat1.rw and isolat1.ent files.
But what I would like to avoid is editing those isoent files each time a new character is needed.
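The error message above asks for character references. As a minimal sketch of what that could look like, the following Python snippet rewrites every non-ASCII character as a decimal numeric character reference of the form `&#243;`, which is the numeric syntax SGML defines. The function name `to_numeric_char_refs` is hypothetical. The sketch uses Unicode code points, which coincide with ISO Latin1 only up to 255; how FrameMaker's SGML import treats higher numbers depends on the SGML declaration and was not tested in this thread, so treat this as an experiment rather than a confirmed fix.

```python
def to_numeric_char_refs(text: str) -> str:
    """Replace each non-ASCII character with a decimal reference such as &#243;.

    Assumption: the numbers are Unicode code points. An SGML parser reads
    them against the document character set from the SGML declaration.
    """
    return "".join(ch if ord(ch) < 128 else f"&#{ord(ch)};" for ch in text)


if __name__ == "__main__":
    # Characters mentioned in the thread: o-acute, z-acute, c-acute.
    print(to_numeric_char_refs("ó ź ć"))
```

Running the converter over a file before opening it in FrameMaker would keep the sgm file pure ASCII, so the editor's encoding no longer matters for these characters.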
On 16.11.2010 at 18:28, aksanna wrote:
When I type ź or ć in the sgm document and open it with the FrameMaker SGML application, I get ¿æ and the message: "Non-SGML character found; should have been character reference"
I feel lucky to have avoided the SGML era... What editor are you using to edit the .sgm file? It looks as if this editor does not use Latin-1 encoding; otherwise it would have to complain or solve the problem by replacing the characters with the defined character references.
The best approach would be to switch to XML; only then would all the ease of Unicode be at your disposal.
Maybe someone with real SGML experience can chime in.
- Michael
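One way to test the suspicion that the file is not really Latin-1 is to re-read the garbled sample from the first post under a different encoding. The sketch below assumes, without confirmation from the thread, that the bytes were written as Windows-1250 (a common Central European code page) and merely displayed as Latin-1. The helper name `reinterpret` and both encoding choices are assumptions made for illustration.

```python
# Polish sample string exactly as it appears in the first post.
garbled = "óê¿ñæÑÆÊ¥Ó£"


def reinterpret(text: str, shown_as: str = "latin-1", written_as: str = "cp1250") -> str:
    """Recover text whose bytes were written in one encoding but shown in another.

    Assumption: 'written_as' is a guess (Windows-1250), not a fact from the thread.
    """
    # Turn the displayed characters back into the original bytes, then decode again.
    return text.encode(shown_as).decode(written_as)


if __name__ == "__main__":
    print(reinterpret(garbled))
```

Under that assumption the sample reads óężńćŃĆĘĄÓŁ, which is plausible Polish text. That would suggest the file holds Windows-1250 bytes that an ISO Latin1 reader misinterprets.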
I use UltraEdit, so it shouldn't be the cause of this problem.
aksanna wrote:
I use UltraEdit, so it shouldn't be the cause of this problem.
I use UltraEdit as a programming editor, but AFAIK it is not an SGML editor in the strict sense. Its status line shows the encoding of the current file: if you see something like "U8-DOS" or "U-DOS" (or -UNIX), it is using UTF-8 or Unicode encoding regardless of what is declared in the file. SGML is a very strict standard with hard limitations. Basically, if you insert illegal characters into an SGML file, it is no longer an SGML file.
Given the available tools, your best bet would be to switch to XML.
- Michael
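Independently of the editor's status line, the encoding of the saved file can be checked from its raw bytes. The following is a rough sketch, not a full detector: it looks for a byte order mark, then tries ASCII and UTF-8, and otherwise reports a single-byte encoding. The function name `guess_encoding` and the file name `manual.sgm` are hypothetical.

```python
import codecs


def guess_encoding(data: bytes) -> str:
    """Return a rough label for how the given file bytes appear to be encoded."""
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8 (with BOM)"
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Bytes above 0x7F that are not valid UTF-8: some 8-bit code page.
        return "single-byte (for example ISO Latin1 or Windows-1250)"


if __name__ == "__main__":
    # Hypothetical file name; replace it with the real sgm file.
    try:
        with open("manual.sgm", "rb") as handle:
            print(guess_encoding(handle.read()))
    except FileNotFoundError:
        print("manual.sgm not found")
```

A result of "utf-8" or "utf-16" would confirm that the editor saved Unicode even though the SGML application expects ISO Latin1. A single-byte result cannot distinguish Latin1 from other 8-bit code pages.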