Carriage return handling different - FM11 to FM2015

Mentor, Feb 26, 2016

Hi,

XML import handles carriage returns differently in FM2015 than in FM11, and I was hoping someone could explain the change. I'm not sure whether it is related to the other FM2015 whitespace discussions to date.

Here is my XML file, which contains no whitespace other than carriage returns, opened with the default app and no validation:

<?xml version="1.0" encoding="UTF-8"?><test>
<p>TEST</p>
<p>TEST</p><p>TEST</p>
</test>

In FM11, it opens as I would expect:

In FM2015, I get a paragraph containing a single space where the carriage return is:

This doesn't happen when I use a structure app with normal validation. Can anyone explain what has changed?

Thanks,

Russ

Correct answer by Lynne A. Price, Adobe Community Professional, Feb 26, 2016

Russ,

   I opened your test file in FM 2015 with the default setting (On) for RemoveExtraWhiteSpacesOnXMLImport and with it set to Off. The behavior I got was slightly different than you reported, but I believe it is correct.

  When the option is On, the document window shows 3 pgfs; the Structure View shows a <test> with 3 <p>s.

  When the option is Off, the document window shows not the 4 pgfs you reported, but 6. Three of these contain a <p> element with the text "TEST"; the others are text ranges within the <test> element, each containing a single space. All of these text ranges correspond to line breaks in the input document: after the <test> start-tag, after the end-tag for the first <p>, and after the end-tag for the last <p>.

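  The XML Recommendation requires a parser to pass all white space that is not markup on to the application, and without a DTD FM cannot tell which white space merely formats the file. Therefore, FM is correct not to discard it. Converting the line breaks to spaces is consistent with FM's treatment of line breaks within a paragraph.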

  When you open the same file using a DTD that does not permit text ranges between <p> elements, FM does not create the text ranges.

  While surprising at first, I believe the behavior is correct. I therefore didn't bother testing in FM 11 or FM 12.

  The catch is that the two needs pull in opposite directions: if you want white space after xrefs treated correctly, you need to turn the option off, and if you want to avoid line breaks coming in as data characters, you need to turn the option on. Solutions are:

1) Use a DTD

2) Preprocess the input to remove line breaks that format the XML

3) Preprocess the input to change a space after an xref to a character reference

4) Hope Adobe fixes the bug soon

  --Lynne

Mentor, Mar 01, 2016

Hi Lynne,

Thanks for the detailed reply. Everything you say makes sense, and I hadn't really thought about how FM would use DTD rules to interpret line breaks. That's very interesting.

The reason for this issue is that I have a routine that cobbles together XML files that are composites of other XML instances authored in FrameMaker. These are schema-controlled, but when I put the content together, I get problems like duplicate IDs and general invalidity due to some amateur XSLT. So, I just remove the schema reference, which seemed to work as I expected in FM11. But clearly something is different now. The management of line breaks has changed, for whatever reason.

As the keeper of the code that drives all this stuff, it was a simple matter to remove all line breaks before writing the composite XML file. That appears to have solved the problem. The XML files still have no schema or DTD declaration, but FM chooses the correct structure app for them and they look the way I want. I guess at some point the EDD takes over and completes the expected rendering. I wonder if that was the change... the consideration of EDD rules.

Russ

Adobe Community Professional, Mar 01, 2016

Russ,

    The RemoveExtraWhiteSpacesOnXMLImport option is described in the FM 11 INI Reference, but I don't know whether the bug with white space after xrefs exists in FM 11. I looked briefly at the FM 11, 12, and 2015 documentation and didn't see anything that suggests a change in the handling of white space in XML. I don't have time to do any testing now.

   In any case, when FM opens an XML document, it does apply the format rules from the EDD. (For performance reasons, it actually turns formatting off until the entire document has been imported and then formats the entire document.)

   Are you combining documents with XSLT or something else? You can avoid ID conflicts by appending a prefix or suffix to the original IDs. If you change the IDREFs as well, xrefs should be preserved. For example, I've been working on a project in which end users may very well create a new book component from a copy of an existing one, likely resulting in duplicate IDs. I can use the root element of the book component or the ?FM Document PIs to locate the start of a component and then append a suffix such as a component counter to each ID and IDREF. That way, I ensure there are no duplicate IDs, but I still preserve xrefs.

    --Lynne
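The suffix idea can be sketched in Python along these lines. This is not the code from the project mentioned above, only a minimal illustration under stated assumptions: the attribute names "id" and "idref", the wrapper element "book", and the helper names suffix_ids and combine are all hypothetical, and each component is assumed to be available as a separate XML string rather than being located through the root element or the ?FM Document PIs.

```python
import xml.etree.ElementTree as ET

ID_ATTRS = {"id"}        # assumed names of ID attributes
IDREF_ATTRS = {"idref"}  # assumed names of single-valued IDREF attributes


def suffix_ids(component, suffix):
    """Append `suffix` to every ID and IDREF value in one component, in place."""
    for element in component.iter():
        for name in ID_ATTRS | IDREF_ATTRS:
            if name in element.attrib:
                element.set(name, element.get(name) + suffix)


def combine(components_xml, wrapper_tag="book"):
    """Parse each component, make its IDs unique, and collect all under one root."""
    wrapper = ET.Element(wrapper_tag)
    for counter, xml_text in enumerate(components_xml, start=1):
        component = ET.fromstring(xml_text)
        # The component counter keeps IDs distinct across copies.
        suffix_ids(component, "_c%d" % counter)
        wrapper.append(component)
    return wrapper


# Two copies of the same component, which would otherwise share the ID "intro".
chapter = '<chapter id="intro"><xref idref="intro"/></chapter>'
book = combine([chapter, chapter])
print([e.get("id") for e in book.iter() if "id" in e.attrib])
print([e.get("idref") for e in book.iter() if "idref" in e.attrib])
```

Because an ID and the IDREFs that point to it receive the same suffix within a component, each xref still resolves inside its own copy. Multi-valued IDREFS attributes and references that cross component boundaries are not handled by this sketch.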

Mentor, Mar 02, 2016

Lynne, all good advice. The process is complicated, with a lot of backend automation moving stuff around. I think it could be better engineered to avoid the problem I saw in the first place, but it was all working until FM2015, so I didn't think about it. Maybe I'll think about it some more, because I think the process would be more robust if all the XML were valid against the same schema. Clearly things become more fragile when they are not as clearly defined.

Russ
