
Making a valid XML DTD out of a more general EDD?

Participant, Sep 19, 2008

I have a very well-made EDD that is used for many manuals. It is somewhat largish (45 pages in 9pt Verdana) but it is very flexible and appropriate for many types of documents. It was not made with xml export as a primary goal, so, although I avoided inclusions/exclusions, there is one crucial construct (used in many places) that is not valid in xml, but is valid in sgml. The construct looks like this:
>Element (Container): MyElement

> General rule: TEXT, (Para | ListUnordered | ListOrdered)*

In a DTD, it would be:
><!ELEMENT MyElement (#PCDATA, (Para | ListUnordered | ListOrdered)*)>

In XML, however, the so-called "mixed content" model does not allow such a construct.

But, as I understand from http://www.w3.org/TR/REC-xml/#sec-mixed-content, it would allow:
><!ELEMENT MyElement (#PCDATA | Para | ListUnordered | ListOrdered)*>

but also:
><!ELEMENT MyElement (MyText, (Para | ListUnordered | ListOrdered)*)>
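The mixed-content restriction can be checked mechanically. Below is a minimal Python sketch, assuming the content model is available as a plain string; the function name `is_xml_legal` is made up for illustration and is not part of FrameMaker or any DTD tool. It only looks at how #PCDATA is used and does not validate the rest of the model.

```python
import re

# XML 1.0 permits #PCDATA in only two shapes:
#   (#PCDATA)   or   (#PCDATA | Name | Name ...)*
_NAME = r"[A-Za-z_:][\w.:-]*"
_MIXED = re.compile(
    rf"^\(\s*#PCDATA\s*(?:\)\*?|(?:\|\s*{_NAME}\s*)+\)\*)$"
)


def is_xml_legal(content_model: str) -> bool:
    """Return True if #PCDATA is used in a way an XML DTD accepts."""
    model = content_model.strip()
    if "#PCDATA" not in model:
        # Pure element content; this sketch does not check it further.
        return True
    return bool(_MIXED.match(model))


# The three content models discussed in the post above.
print(is_xml_legal("(#PCDATA, (Para | ListUnordered | ListOrdered)*)"))
print(is_xml_legal("(#PCDATA | Para | ListUnordered | ListOrdered)*"))
print(is_xml_legal("(MyText, (Para | ListUnordered | ListOrdered)*)"))
```

The first model is rejected because #PCDATA is followed by a comma; the other two are accepted.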

Quite long ago, someone on this forum suggested that I embed the TEXT into a container, arbitrarily named something like MyText, and then use that instead of the TEXT. Then the EDD would export to a valid DTD.

I am reluctant to change the EDD (for obvious reasons), so my question is:

Is it possible to use read/write rules to do this conversion, or do I really have to go to the trouble of an xslt to achieve this?

Any other ideas or suggestions?

(Purpose of XML export: the reason to export to xml is that I am thinking of doing it for translation purposes only. Since Trados is incompatible with FM8 and can't handle Unicode, an xml document would be a much safer bet that any translation agency would be able to handle with proper Unicode. It is also cheaper than translating FM files, and I would guess that the risk of the translation agency [mess]ing up the files is smaller with xml than fm files (right? wrong?). However, it is absolutely necessary to get the whole round trip to preserve exactly everything, including custom ruling of tables, positioning of graphics, equations etc etc. I have yet to see if that is possible.)

[edited by host]
TOPICS
Structured

Community guidelines
Be kind and respectful, give credit to the original source of content, and search for duplicates before posting. Learn more
Mentor, Sep 19, 2008

Hi Harald,

Did you really mean to use the "f" word? Best to avoid that here.

For your DTD problem, you have the previously-stated options, and perhaps one more... There is no reason that your DTD general rule has to match the EDD rule. That is, you could make the EDD rule as restrictive as you need for authoring purposes, then relax/adjust the DTD so that the XML parser stays happy. If you are not using the XML and DTD for authoring purposes, it really doesn't matter what the DTD says, as long as the markup is valid.

You do mention translation, though... if that is considered "authoring" on that end, where the structure tree might be altered, then this suggestion is not valid. You'll simply need to comply with the XML spec somehow, as Xerces (Frame's XML parser) interprets it.

Russ

Participant, Sep 19, 2008

Hi Russ,

(f-word or not -- same thing... 'messing up' is a sufficient surrogate.)

OK, I take your point about manually relaxing the DTD. It is the "low-tech way", but it seems a bit dangerous not to keep them in sync since, after all, the EDD does get updated from time to time.

The example I gave is a strong simplification. In practice, the rules I have are more complex, like the following example:
> General rule: TEXT, (Para | ListUnordered | ListOrdered)*, (NextBlock1 | (NextBlock2, subblock3*))*

The trouble is that I am not able to relax that in a suitable xml-valid way without departing wildly from the EDD... Maybe there is a way, but I just don't see how.

The best would be an automated translation, but I don't know how to do that either. If read/write can do it that would be excellent, but I suspect that xslt is necessary(?)

New Here, Sep 19, 2008

Hi Harald,

An alternative to manually relaxing your DTD is to use conditional text in your EDD. For each GeneralRule element that needs to export to the DTD differently, you have two copies, one tagged with the condition FrameMaker (or whatever name you want) and the other with XML (or whatever you want).

So you would have:

Element (Container): MyElement
General rule: <TEXT>, (Para | ListUnordered | ListOrdered)*
General rule: (<TEXT> | Para | ListUnordered | ListOrdered)*

When you're in FrameMaker 'mode', you have the EDD set to show the FrameMaker condition and hide the XML condition. When you want to export a DTD, show the XML condition and hide the FrameMaker condition, which will produce a DTD with an XML-legal content model. When you're editing content models, you can show all conditions so that you can see both General Rules side by side (which would of course make the EDD temporarily invalid).

You can also use this method for other purposes, such as adding whole elements to your DTD which don't exist in the FrameMaker world, for instance adding colspec and spanspec elements to your DTD for CALS tables.

Jon

New Here, Sep 19, 2008

Hi all;

>you could make the EDD rule as restrictive as you need for authoring purposes, then relax/adjust the DTD so that the XML parser stays happy.

FYI, I usually work from XML towards Frame, so sometimes find it convenient to relax the EDD not the DTD.

Either way, it's virtually impossible to keep the DTD and EDD in sync because many XML attributes (like table "cols") get mapped to Frame properties (like # of table columns). So I simply stopped worrying about it ... and keep close watch on my rules file.

Just a thought.

David Blyth (UNIX and vi dinosaur)
Staff Technical Writer
QCT Division
QUALCOMM - Standard Disclaimers Apply

Only 149,999,950 more years of Ruling The Earth to go
------------------------------------------------------------

Participant, Sep 20, 2008

>David said:

it's virtually impossible to keep the DTD and EDD in sync because many XML attributes (like table "cols") get mapped to Frame properties (like # of table columns). So I simply stopped worrying about it ... and keep close watch on my rules file.

Ok, but things like cols attribute etc are taken care of by read/write rules, right? As long as there are automatic translations via the rules file or xslt, I consider them as still being "in sync". What I feared was manual edits that can be difficult to keep in sync over an extended period of time (years).

Jon's suggestion of conditional text is a semi-automated way (one that I think Jon or someone else mentioned a year or so ago?). It does at least make the changes somewhat more explicit. However, I still shudder somewhat when faced with changing things like the following example I showed earlier (and other much more involved constructs):
>General rule: TEXT, (Para | ListUnordered | ListOrdered)*, (NextBlock1 | (NextBlock2, subblock3*))*

Correct me if I am wrong, but to make it valid in xml, isn't a complete flattening the only way to achieve that? Like this:
>General rule: (TEXT | Para | ListUnordered | ListOrdered | NextBlock1 | NextBlock2 | subblock3)*

...which I consider a pretty drastic departure from the original! Maybe it's still acceptable and safe if it is used for language translation purposes only, but I shudder somewhat (perhaps unjustified)...
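The flattening described above can be done mechanically. The following Python sketch assumes the general rule is available as a plain string and that it contains the TEXT token; the function name `flatten_to_mixed` is hypothetical. It throws away all sequence and grouping information, which is exactly the "drastic departure" in question.

```python
import re


def flatten_to_mixed(general_rule: str) -> str:
    """Collapse an EDD-style general rule into an XML mixed-content model.

    Every element name is kept once, in order of first appearance;
    the TEXT token becomes #PCDATA at the front of the choice group.
    """
    names = []
    for token in re.findall(r"[A-Za-z_][\w.-]*", general_rule):
        if token != "TEXT" and token not in names:
            names.append(token)
    return "(" + " | ".join(["#PCDATA"] + names) + ")*"


rule = ("TEXT, (Para | ListUnordered | ListOrdered)*, "
        "(NextBlock1 | (NextBlock2, subblock3*))*")
print(flatten_to_mixed(rule))
```

For the rule shown, this prints the same flat choice group as the hand-written version above, with #PCDATA in place of TEXT.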

The translation agency normally doesn't change the structure, but sometimes they do, even when given structured fm-files; in those cases, I have to fix it manually. This happens when a paragraph starts with a concept/definition and the translator needs to precede the concept with a small word (corresponding to an article such as 'The'), and then inserts that small word before the paragraph boundary, which invalidates the structure. It is hard to know whether such mistakes will be more or less common in an xml workflow than in an fm workflow.

By far a more elegant way to make a valid DTD from the EDD is to replace the TEXT token by a MyText element, and then create a MyText element that only contains the TEXT token. Wouldn't that be a completely general method? If that is the way to go, it remains to find simple automated ways to achieve that. Is it simple to do in xslt? I guess it is impossible to do it with read/write rules, right?
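On the DTD side, the replacement is a simple token substitution. Here is a Python sketch, assuming general rules are available as strings with either `TEXT` or `<TEXT>` as the text token; `wrap_text_token` and `to_dtd` are hypothetical helper names and `MyText` is the arbitrary wrapper name from the post.

```python
import re


def wrap_text_token(general_rule: str, wrapper: str = "MyText") -> str:
    """Replace the TEXT (or <TEXT>) token with the wrapper element name."""
    return re.sub(r"<TEXT>|\bTEXT\b", wrapper, general_rule)


def to_dtd(element: str, general_rule: str, wrapper: str = "MyText") -> str:
    """Emit an XML-legal declaration for the element plus the wrapper."""
    model = wrap_text_token(general_rule, wrapper)
    return "\n".join([
        f"<!ELEMENT {element} ({model})>",
        f"<!ELEMENT {wrapper} (#PCDATA)>",  # the wrapper holds only text
    ])


print(to_dtd("MyElement", "TEXT, (Para | ListUnordered | ListOrdered)*"))
```

One caveat: the rewritten model requires a `MyText` child even when the original element had no leading text. Writing `MyText?` instead would be one way to relax that, at the cost of a slightly different model.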

/Harald

New Here, Sep 20, 2008

Hi Harald,

>Correct me if I am wrong, but to make it valid in xml, isn't a complete flattening the only way to achieve that?

You're right, the only way to specify mixed content in an XML DTD is the way you wrote it:
><!ELEMENT myElement (#PCDATA | element1 | element2 | element3)* >

Your second suggestion sounds like a better fit with what you want to achieve. If you were prepared to do this in your FrameMaker documents

>Element(Container): myElement

>General Rule: TextFragment, (Para | ListUnordered | ListOrdered)*, (NextBlock1 | (NextBlock2, subblock3*))*

and change your existing documents accordingly (I imagine Russ Ward's FrameSLT software could help you here to automate the retagging), you would be able to export straight to XML and export the EDD to XML without any shudder-inducing conditional text.

If you don't want that situation in FrameMaker and you want to wrap the text in an element on export, read-write rules won't help you here. Maybe an XSLT transform can do what you want, but others here would know much more about that subject. You'd still have the manual editing to do on the DTD though, if the content models are not the same in the FrameMaker and XML domains.

Jon

New Here, Sep 23, 2008

Hi Harald,

I try to avoid mixed content elements for this and other similar reasons. Too messy.

>Ok, but things like cols attribute etc are taken care of by read/write rules, right?

Yes.

> What I feared was manual edits that can be difficult to keep in sync over an extended period of time (years).

Too soon for me to tell - I'm essentially in an R&D group. I keep writing Frame applications, the engineers keep throwing them out. My employer keeps sending me paychecks, so I guess all is well.... ;)

David Blyth (the UNIX and vi dinosaur)
Staff Technical Writer
QCT Division
QUALCOMM - Standard Disclaimers Apply

Only 149,999,950 more years of Ruling The Earth to go
-----------------------------------------------------------

Participant, Sep 24, 2008

>Jon said:

>If you were prepared to do this in your FrameMaker documents and change your existing documents accordingly...

That is something which is unthinkable!

It HAS to be via read/write rules and XSLT. As I understand it, xslt is able to transform from one content model to another(?)

So, if someone fluent in xslt would chip in and provide a suggestion, recommendation, or code snippets, that would be welcome.
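No XSLT was posted in this thread, so what follows is only a stand-in: a Python sketch of the wrap/unwrap step using the standard library's `xml.etree.ElementTree`. It assumes the exported XML has the leading text directly inside elements such as `MyElement`; the function names and the `MyText` wrapper are illustrative. An XSLT stylesheet would express the same idea with a template that matches the first text node of those elements.

```python
import xml.etree.ElementTree as ET


def wrap_leading_text(root, targets, wrapper="MyText"):
    """On export: move the leading text of each target element into a wrapper child."""
    for elem in list(root.iter()):  # snapshot, since the tree is modified
        if elem.tag in targets and elem.text and elem.text.strip():
            child = ET.Element(wrapper)
            child.text = elem.text
            elem.text = None
            elem.insert(0, child)
    return root


def unwrap_leading_text(root, wrapper="MyText"):
    """On import: put the wrapper's text back as the parent's leading text."""
    for elem in list(root.iter()):
        first = elem[0] if len(elem) else None
        if first is not None and first.tag == wrapper:
            elem.text = (first.text or "") + (first.tail or "")
            elem.remove(first)
    return root


src = "<MyElement>Intro text<Para>One</Para></MyElement>"
root = wrap_leading_text(ET.fromstring(src), {"MyElement"})
wrapped = ET.tostring(root, encoding="unicode")
print(wrapped)

restored = ET.tostring(unwrap_leading_text(root), encoding="unicode")
print(restored == src)
```

Only the text before the first child element is handled, which matches the `TEXT, (...)` shape of the general rules in this thread. Text appearing between or after child elements is left untouched.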

However, there may very well be other reasons why the whole idea of xml export for a translation work flow is a bad idea; I am primarily thinking of what to do with text line objects that are used for callouts etc in anchored frames, but that is a topic that might need a separate forum thread I guess.

Mentor, Sep 24, 2008

Harald,

XSLT can change your XML into any other conceivable text layout, not just into different XML. It could change it into a haiku if you wanted. It takes a little time to get a hold of, though.

Russ

New Here, Sep 25, 2008

Hi again;

>XSLT can change your XML into any other conceivable text layout, not just into different XML. It could change it into a haiku if you wanted.

Well, almost. Russ, I've run across some interesting limitations, especially in the Frame universe. But Harald, Russ is right - XSLT lets you do just about anything.

Which brings me to an idea for another philosophy paper. Given that XSLT lets any "X" become "Y", just exactly when does some arbitrary string gain meaning? Where does syntax end and semantics begin? Or is everything just syntax the way it is in XSLT?

Ah, well. I digress....

David Blyth (the UNIX and vi dinosaur)
Staff Technical Writer
QCT Division
QUALCOMM - Standard Disclaimers Apply

Only 149,999,950 more years of Ruling the Earth to go
-----------------------------------------------------------

New Here, Jan 30, 2009

Hello Mr. Brand,

What you are talking about is automatically wrapping <TEXT> into <MyText>, so it becomes valid XML.

I am not sure whether this is possible via XSL, so you might need a custom plugin to do it. Have you looked at FrameSLT? As I understand it, it is like XSLT, but based on structured documents that work as "FrameSLT-Scripts". Maybe with this you can detect those <TEXT> areas that need to be wrapped into <MyText>.

As you say that your EDD does not croak on it, this means that you have it configured as an SGMLApplication in structapps.fm. When exporting XML you must be using an XMLApplication in structapps.fm.
So you could try the following:
1. Edit using your EDD and
a. either use .fm format rather than .xml and validate your document by using the menu entry, or
b. have an SGMLApplication in structapps.fm
2. If you have no errors in these validations, your document is OK.
3. When preparing the files for the translator, assign a different structapp, which is an XMLApplication, rather than the SGMLApplication you use for editing.
4. This way you can distinguish:
* Error in the SGMLApplication => real invalidity
* Error in the XMLApplication, but not in the SGMLApplication => no real invalidity

BTW: If you say that you need layout and everything preserved: have you tried sending the .fm files to the translator rather than the .xml files?

Hope this helps.
With kind regards,
Franz.

[ excess signature removed by host ]

Participant, Jan 30, 2009

Franz, (systecgmbhnuremberg),

First, I don't need any application at all in structapps.fm to create or validate structured documents, unless I have a need for export or import to/from xml/sgml. I.e, so far, I have no sgml application defined for these documents.

I have not looked at FrameSLT (except that just today I glanced through some of its pdfs). I dismissed the whole idea of exporting to xml for translation purposes because the translation agency finally (Oct 2008) has its Trados system set up to be compatible with FM8 and Unicode. In addition, I realized that exporting to xml is most probably a bad idea, since there seems to be no sensible way of getting text line objects (for graphics) into that workflow so that they can be translated. (Or is there such a way?) So, the best is probably to continue to send .fm files as usual, even though they charge more for that.

Nevertheless, I have not totally dismissed the thought of (automatically) translating an EDD to become XML compliant. But even the FrameSLT seems pretty complex.

For another customer, I may have a need to "interface" with xml databases to get some content into an FM file. Maybe FrameSLT is a solution there, although I am not fully certain as to how or to what extent it makes it easier compared with using pure xslt statements in a text file and then opening the xml file with a suitable xml application that also references the xslt statements. (But now I am out on really thin ice...)

In any case, it would be interesting to know if you all shun (avoid) mixed content completely for some reasons? For authoring purposes, I think the sgml version of mixed content is "nicest" for the author, since it is very natural, so if you avoid it, I guess it must be for special reasons?

Community Beginner, Jan 30, 2009

Harald,

You write:

For another customer, I may have a need to "interface" with xml databases to get some content into an FM file.

I had the project of creating an XSLT that converted an XML file generated by a manufacturing database, that listed the parts and subparts of the product. The person who created the XML file devised his own set of elements and attributes. I then had to create the translation to get the content into our Frame structure for a parts catalog. It was not difficult. It took a few weeks of learning XSLT, but it was straightforward. The resulting text file was only two pages.

The XSLT file is applied when you open the XML file from within Frame. This is set up in a structapp. The only additional thing needed was a read/write file that translated the table elements into Frame tables and told Frame how to divide the XML file into individual document files, that is, the chapters of a book. The read/write file was less than 10 lines long.

Note that if you apply the XSLT outside Frame using another application, you still need a read/write file to convert the table elements correctly. So, from my experience, you can do it all using Frame.

Good luck,
Van
