Harald E Brandt
Known Participant
September 19, 2008
Question

Making a valid XML DTD out of a more general EDD?

I have a very well-made EDD that is used for many manuals. It is somewhat largish (45 pages in 9pt Verdana) but it is very flexible and appropriate for many types of documents. It was not made with xml export as a primary goal, so, although I avoided inclusions/exclusions, there is one crucial construct (used in many places) that is not valid in xml, but is valid in sgml. The construct looks like this:
>Element (Container): MyElement

> General rule: TEXT, (Para | ListUnordered | ListOrdered)*

In a DTD, it would be:
><!ELEMENT MyElement (#PCDATA, (Para | ListUnordered | ListOrdered)*)>

In XML however, the so called "Mixed Content" does not allow such a construct.

But, as I understand from http://www.w3.org/TR/REC-xml/#sec-mixed-content, it would allow:
><!ELEMENT MyElement (#PCDATA | Para | ListUnordered | ListOrdered)*>

but also:
><!ELEMENT MyElement (MyText, (Para | ListUnordered | ListOrdered)*)>

Quite long ago, someone on this forum suggested that I embed the TEXT into a container, arbitrarily named something like MyText, and then used that instead of the TEXT. Then the EDD would export to a valid DTD.

I am reluctant to change the EDD (for obvious reasons), so my question is:

Is it possible to use read/write rules to do this conversion, or do I really have to go to the trouble of an xslt to achieve this?

Any other ideas or suggestions?

(Purpose of XML export: the reason to export to xml is that I am thinking of doing it for translation purposes only. Since Trados is incompatible with FM8 and can't handle Unicode, an xml document would be a much safer bet that any translation agency would be able to handle with proper Unicode. It is also cheaper than translating FM files, and I would guess that the risk of the translation agency [mess]ing up the files is smaller with xml than fm files (right? wrong?). However, it is absolutely necessary to get the whole round trip to preserve exactly everything, including custom ruling of tables, positioning of graphics, equations etc etc. I have yet to see if that is possible.)

[edited by host]

13 replies

Known Participant
January 30, 2009
Harald,

You write:

For another customer, I may have a need to "interface" with xml databases to get some content into an FM file.

I had the project of creating an XSLT that converted an XML file generated by a manufacturing database, that listed the parts and subparts of the product. The person who created the XML file devised his own set of elements and attributes. I then had to create the translation to get the content into our Frame structure for a parts catalog. It was not difficult. It took a few weeks of learning XSLT, but it was straightforward. The resulting text file was only two pages.

The XSLT file is applied when you open the XML file from within Frame. This is set up in a structapp. The only additional thing needed was a read/write file that translated the table elements into Frame tables and told Frame how to divide the XML file into individual document files, that is, the chapters of a book. The read/write file was less than 10 lines long.

Note that if you apply the XSLT outside Frame using another application, you still need a read/write file to convert the table elements correctly. So, from my experience, you can do it all using Frame.

Good luck,
Van
Harald E Brandt
Known Participant
January 30, 2009
Franz, (systecgmbhnuremberg),

First, I don't need any application at all in structapps.fm to create or validate structured documents, unless I have a need for export or import to/from xml/sgml. I.e., so far, I have no sgml application defined for these documents.

I have not looked at FrameSLT (except that just today I had a glimpse through some of its pdfs). I dismissed the whole idea of exporting to xml for translation purposes because the translation agency has finally (Oct 2008) set up their Trados system to be compatible with FM8 and Unicode. In addition, I realized that exporting to xml is most probably a bad idea since there seems to be no way of getting text line objects (for graphics) into that work flow in a sensible way so that they can be translated. (Or is there such a way?) So, the best is probably to continue to send .fm files as usual, even though they charge more for that.

Nevertheless, I have not totally dismissed the thought of (automatically) translating an EDD to become XML compliant. But even the FrameSLT seems pretty complex.

For another customer, I may have a need to "interface" with xml databases to get some content into an FM file. Maybe FrameSLT is a solution there, although I am not fully certain as to how or to what extent it makes it easier compared with using pure xslt statements in a text file and then opening the xml file with a suitable xml application that also references the xslt statements. (But now I am out on really thin ice...)

In any case, it would be interesting to know if you all shun (avoid) mixed content completely for some reasons? For authoring purposes, I think the sgml version of mixed content is "nicest" for the author, since it is very natural, so if you avoid it, I guess it must be for special reasons?
Participating Frequently
January 30, 2009
Hello Mr. Brandt,

What you are talking about is automatically wrapping <TEXT> into <MyText>, so it becomes valid XML.

I am not sure whether this is possible via XSL, so you might need a custom plugin to do it. Have you looked at FrameSLT? As I understand it, it is like XSLT, but works on the basis of structured documents that serve as "FrameSLT-Scripts". Maybe with this you can detect those <TEXT> areas that need to be wrapped into <MyText>.

As you say that your EDD does not croak on it, this means that you have it configured as an SGMLApplication in structapps.fm. When exporting XML you must be using an XMLApplication in structapps.fm.
So as for this, you could try to do the following:
1. Edit using your EDD and
a. either use .fm format rather than .xml and validate your document by using the menu entry
b. or have an SGML-Application in structapps.fm
2. If you have no errors in these validations, your document is ok.
3. When preparing the files for the translator, assign a different structapp, which is an XMLApplication, rather than the SGMLApplication you use for editing.
4. This way you can discern:
* Error in the SGMLApplication => real invalidity
* Error in the XMLApplication, but not in the SGMLApplication => no real invalidity

BTW: If you say that you need layout and everything preserved: Have you tried to send the .fm files to the translator rather than the .xml files?

Hope this helps.
With kind regards,
Franz.

[ excess signature removed by host ]
Known Participant
September 25, 2008
Hi again;

>XSLT can change your XML into any other conceivable text layout, not just into different XML. It could change it into a haiku if you wanted.

Well, almost. Russ, I've run across some interesting limitations, especially in the Frame universe. But Harald, Russ is right - XSLT lets you do just about anything.

Which brings me to an idea for another philosophy paper. Given that XSLT lets any "X" become "Y", just exactly when does some arbitrary string gain meaning? Where does syntax end and semantics begin? Or is everything just syntax the way it is in XSLT?

Ah, well. I digress....

David Blyth (the UNIX and vi dinosaur)
Staff Technical Writer
QCT Division
QUALCOMM - Standard Disclaimers Apply

Only 149,999,950 more years of Ruling the Earth to go
-----------------------------------------------------------
Legend
September 24, 2008
Harald,

XSLT can change your XML into any other conceivable text layout, not just into different XML. It could change it into a haiku if you wanted. It takes a little time to get a hold of, though.

Russ
Harald E Brandt
Known Participant
September 24, 2008
>Jon said:

>If you were prepared to do this in your FrameMaker documents and change your existing documents accordingly...

That is something which is unthinkable!

It HAS to be via read/write rules and XSLT. As I understand it, xslt is able to transform from one content model to another(?)

So, if someone fluent in xslt would chip in and provide a suggestion, recommendation, or code snippets, that would be welcome.

However, there may very well be other reasons why the whole idea of xml export for a translation work flow is a bad idea; I am primarily thinking of what to do with text line objects that are used for callouts etc in anchored frames, but that is a topic that might need a separate forum thread I guess.
Known Participant
September 24, 2008
Hi Harald,

I try to avoid mixed content elements for this and other similar reasons. Too messy.

>Ok, but things like cols attribute etc are taken care of by read/write rules, right?

Yes.

> What I feared was manual edits that can be difficult to keep in sync over an extended period of time (years).

Too soon for me to tell - I'm essentially in an R&D group. I keep writing Frame applications, the engineers keep throwing them out. My employer keeps sending me paychecks, so I guess all is well.... ;)

David Blyth (the UNIX and vi dinosaur)
Staff Technical Writer
QCT Division
QUALCOMM - Standard Disclaimers Apply

Only 149,999,950 more years of Ruling The Earth to go
-----------------------------------------------------------
Known Participant
September 20, 2008
Hi Harald,

>Correct me if I am wrong, but to make it valid in xml, isn't a complete flattening the only way to achieve that?

You're right, the only way to specify mixed content in an XML DTD is the way you wrote it:
><!ELEMENT myElement (#PCDATA | element1 | element2 | element3)* >
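
For anyone who wants to check an exported DTD for this automatically, here is a minimal sketch in Python. It tests a single content-model string against the two mixed-content forms the XML recommendation allows, `(#PCDATA)` and `(#PCDATA | a | b ...)*`. The function name `is_valid_xml_mixed` is made up for illustration, and the element-name pattern is deliberately simplified, so treat it as a rough check rather than a DTD parser.

```python
import re

# Simplified element-name pattern (an assumption): a letter, '_' or ':'
# followed by word characters, '.', '-' or ':'. Real XML names allow more.
_NAME = r"[A-Za-z_:][\w.\-:]*"

# Either "(#PCDATA)" with an optional "*", or "(#PCDATA | name | ...)*"
# where the trailing "*" is mandatory.
_MIXED = re.compile(
    rf"^\(\s*#PCDATA\s*(?:\)\*?|(?:\|\s*{_NAME}\s*)+\)\*)$"
)


def is_valid_xml_mixed(content_model: str) -> bool:
    """Return True if the content model is a legal XML mixed-content model."""
    return bool(_MIXED.match(content_model.strip()))


if __name__ == "__main__":
    for model in (
        "(#PCDATA | Para | ListUnordered | ListOrdered)*",
        "(#PCDATA, (Para | ListUnordered | ListOrdered)*)",
    ):
        print(model, "->", is_valid_xml_mixed(model))
```

The first model in the demo is the flattened form and passes; the second is the SGML-style sequence from the original question and is rejected.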

Your second suggestion sounds like a better fit with what you want to achieve. If you were prepared to do this in your FrameMaker documents

>Element(Container): myElement

>General Rule: TextFragment, (Para | ListUnordered | ListOrdered)*, (NextBlock1 | (NextBlock2, subblock3*))*

and change your existing documents accordingly (I imagine Russ Ward's FrameSLT software could help you here to automate the retagging), you would be able to export straight to XML and export the EDD to XML without any shudder-inducing conditional text.

If you don't want that situation in FrameMaker and you want to wrap the text in an element on export, read-write rules won't help you here. Maybe an XSLT transform can do what you want, but others here would know much more about that subject. You'd still have the manual editing to do on the DTD though, if the content models are not the same in the FrameMaker and XML domains.

Jon
Harald E Brandt
Known Participant
September 20, 2008
>David said:

it's virtually impossible to keep the DTD and EDD in sync because many XML attributes (like table "cols") get mapped to Frame properties (like # of table columns). So I simply stopped worrying about it ... and keep close watch on my rules file.

Ok, but things like cols attribute etc are taken care of by read/write rules, right? As long as there are automatic translations via the rules file or xslt, I consider them as still being "in sync". What I feared was manual edits that can be difficult to keep in sync over an extended period of time (years).

Jon's suggestion with conditional text is a semiautomated way (that I think Jon, or someone else, mentioned a year ago or so?). It does at least make the changes somewhat more explicit. However, I still shudder somewhat when faced with changing things like the following example I showed earlier (and other much more involved constructs):
>General rule: TEXT, (Para | ListUnordered | ListOrdered)*, (NextBlock1 | (NextBlock2, subblock3*))*

Correct me if I am wrong, but to make it valid in xml, isn't a complete flattening the only way to achieve that? Like this:
>General rule: (TEXT | Para | ListUnordered | ListOrdered | NextBlock1 | NextBlock2 | subblock3)*

...which I consider a pretty drastic departure from the original! Maybe it's still acceptable and safe if it is used for language translation purposes only, but I shudder somewhat (perhaps unjustified)...
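
To make the flattening concrete, here is a rough Python sketch that derives the flat mixed-content model from a general rule written the way the rules appear in this thread. The helper name `flatten_to_mixed` is hypothetical, and it assumes the text token is spelled `TEXT` exactly as above. It keeps every element name once, in order of first appearance, and throws away all sequencing and grouping, which is exactly the information the flattening loses.

```python
import re


def flatten_to_mixed(general_rule: str) -> str:
    """Flatten an EDD-style general rule containing TEXT into the
    XML mixed-content form (#PCDATA | a | b ...)*."""
    tokens = re.findall(r"[A-Za-z_][\w.\-]*", general_rule)
    if "TEXT" not in tokens:
        raise ValueError("rule has no TEXT token; nothing to flatten")
    names = []
    for token in tokens:
        # Skip the text token itself and any element already listed.
        if token != "TEXT" and token not in names:
            names.append(token)
    return "(" + " | ".join(["#PCDATA"] + names) + ")*"


if __name__ == "__main__":
    rule = ("TEXT, (Para | ListUnordered | ListOrdered)*, "
            "(NextBlock1 | (NextBlock2, subblock3*))*")
    print(flatten_to_mixed(rule))
    # (#PCDATA | Para | ListUnordered | ListOrdered | NextBlock1 | NextBlock2 | subblock3)*
```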

The translation agency normally doesn't change the structure, but sometimes they do, even when given structured fm-files; in those cases, I have to fix that manually. This happens when a paragraph starts with a concept/definition, and the translator needs to precede the concept with a small word (corresponding to an article such as 'The') and then inserts that small word before the paragraph boundary, which invalidates the structure. It is hard to know if such mistakes will be more or less common in an xml workflow as compared with an fm workflow.
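
In an xml workflow, one way to catch that kind of slip before re-import might be a small check on the returned files. The Python sketch below is only an illustration under assumptions: the function name `find_stray_text` and the sample element names are hypothetical, and it only reports text that sits directly after a block-level child (for example between two Para elements). Text before the first child is not flagged, since a content model with a leading TEXT allows it.

```python
import xml.etree.ElementTree as ET


def find_stray_text(root, block_tags):
    """Return (parent tag, preceding child tag, text) for every piece of
    non-whitespace text that follows a block-level child element."""
    hits = []
    for parent in root.iter():
        for child in parent:
            # ElementTree stores text that follows an element in its .tail
            if child.tag in block_tags and child.tail and child.tail.strip():
                hits.append((parent.tag, child.tag, child.tail.strip()))
    return hits


if __name__ == "__main__":
    doc = ET.fromstring(
        "<MyElement>Intro<Para>First paragraph.</Para>"
        "The <Para>Widget: a small part.</Para></MyElement>"
    )
    print(find_stray_text(doc, {"Para", "ListUnordered", "ListOrdered"}))
    # [('MyElement', 'Para', 'The')]
```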

By far a more elegant way to make a valid DTD from the EDD is to replace the TEXT token by a MyText element, and then create a MyText element that only contains the TEXT token. Wouldn't that be a completely general method? If that is the way to go, it remains to find simple automated ways to achieve that. Is it simple to do in xslt? I guess it is impossible to do it with read/write rules, right?
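
As a rough illustration of what such a transform would have to do, here is a sketch in Python rather than XSLT, applied to the exported XML outside FrameMaker. Everything in it is an assumption for illustration: the function names, the wrapper name `MyText`, and the idea that the affected parent elements are known by name. Elements with no leading text are left untouched, so the matching DTD would need `MyText?` rather than `MyText`. The second function undoes the wrapping so the file can be brought back for the round trip.

```python
import xml.etree.ElementTree as ET


def wrap_leading_text(root, parents, wrapper="MyText"):
    """Move the leading text of each listed parent element into a new
    first child named `wrapper` (run after export)."""
    for elem in list(root.iter()):
        if elem.tag in parents and elem.text and elem.text.strip():
            wrapped = ET.Element(wrapper)
            wrapped.text = elem.text
            elem.text = None
            elem.insert(0, wrapped)


def unwrap_leading_text(root, parents, wrapper="MyText"):
    """Reverse wrap_leading_text (run before import)."""
    for elem in list(root.iter()):
        if elem.tag in parents and len(elem) and elem[0].tag == wrapper:
            first = elem[0]
            elem.text = (first.text or "") + (first.tail or "")
            elem.remove(first)


if __name__ == "__main__":
    root = ET.fromstring(
        "<Doc><MyElement>Intro text<Para>One</Para>"
        "<ListUnordered/></MyElement></Doc>"
    )
    wrap_leading_text(root, {"MyElement"})
    print(ET.tostring(root, encoding="unicode"))
    unwrap_leading_text(root, {"MyElement"})
    print(ET.tostring(root, encoding="unicode"))
```

The same logic maps fairly directly onto an identity transform in XSLT with one extra template for the affected parents, but that is not shown here.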

/Harald
Known Participant
September 19, 2008
Hi all;

>you could make the EDD rule as restrictive as you need for authoring purposes, then relax/adjust the DTD so that the XML parser stays happy.

FYI, I usually work from XML towards Frame, so I sometimes find it convenient to relax the EDD rather than the DTD.

Either way, it's virtually impossible to keep the DTD and EDD in sync because many XML attributes (like table "cols") get mapped to Frame properties (like # of table columns). So I simply stopped worrying about it ... and keep close watch on my rules file.

Just a thought.

David Blyth (UNIX and vi dinosaur)
Staff Technical Writer
QCT Division
QUALCOMM - Standard Disclaimers Apply

Only 149,999,950 more years of Ruling The Earth to go
------------------------------------------------------------