Known Participant
September 25, 2009
Answered

FDK: Problem with newlines / carriage returns

Hello


I've encountered a strange issue with some strings when using the FrameMaker API. We have an FAPI client which translates FM documents by reading in their contents, replacing their text with the new translated text and writing back the changes.

However, I've found that some strings begin with a \r (carriage return) character when we read them in, and when we write them back out, they begin with \r\n. This moves the text down a line and can break things like cross-references. I'm not exactly sure why this is happening. We're using Visual C++ and we do a lot of conversions between CString and StringT types, so maybe that conversion process is responsible.

To be honest, I find the occurrence of newlines and carriage returns in Frame strings a bit odd as the Frame API breaks the text in a paragraph into fragments for you anyway, so I would have thought the usage of such characters is redundant. This is a big assumption though so correct me if I'm wrong.

We've implemented a workaround whereby our program strips newlines out of strings before writing them back, but in reality all we need to do is strip leading newlines, that is, any that appear before the actual printable text in a string begins.
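
A minimal sketch of that narrower approach, assuming the text has already been converted to a std::string (the CString/StringT conversions are not shown). The helper name stripLeadingNewlines is hypothetical and is not part of the FDK:

```cpp
#include <string>

// Hypothetical helper (not an FDK function): returns a copy of `text`
// with any leading '\r' or '\n' characters removed. Characters after
// the first printable character are left untouched.
std::string stripLeadingNewlines(const std::string& text) {
    const std::string::size_type first = text.find_first_not_of("\r\n");
    if (first == std::string::npos) {
        return std::string();  // input was empty or contained only CR/LF
    }
    return text.substr(first);
}
```

Only the leading run is dropped, so any \r or \n later in the string survives, which is the difference from the strip-everything workaround.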

I'm wondering what others have to say about this approach with respect to its safety. Any opinions would be appreciated, thanks.

Eric

    2 replies

    Russ Ward (Correct answer)
    Brainiac
    September 25, 2009

    Eric,

    I did a little more experimentation, based on my previous comments (sorry, I should have done that before I clicked Post). When I programmatically add a \r to a paragraph, I see no effect in the layout. This is consistent with my suspicions so far. I'm really thinking that perhaps these characters are spurious and should be removed at the outset.

    Russ

    Brainiac
    September 25, 2009

    Hi Eric,

    I don't have any solid answers for you, but you have roused my curiosity. Maybe with a bit of discussion I can help some. I am far from the expert on text management within a document, as a majority of my experience is with structural metadata work. However, I have done a fair amount of it.

    I concur that the appearance of the \r character (assuming you mean ASCII 13) is odd. For a normal flowing paragraph, I don't think any of the text (FTI_String items) should have any of these characters. I believe that all types of line and paragraph breaks have their own text item types that control the layout within Frame, so they should not be picked up in an FTI_String item. I ran some experiments where I put my own soft returns and other anomalies in there using the normal interface, but still could not pick them up within FTI_String captures.

    I agree that the addition of \n is probably a datatype conversion issue, so I'm thinking that maybe the real problem is the original existence of the \r characters. Do they appear to be doing anything in the document where you are finding them?  If not, have you considered simply removing them before any further processing?
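
    To check where the characters actually occur before deciding to remove them, something like the following might help. It is only a sketch, assuming the string text has already been copied into a std::string; findCarriageReturns is a hypothetical name, not an FDK call:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical diagnostic helper (not an FDK function): returns the
// zero-based offsets of every '\r' (ASCII 13) found in `text`.
std::vector<std::size_t> findCarriageReturns(const std::string& text) {
    std::vector<std::size_t> offsets;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (text[i] == '\r') {
            offsets.push_back(i);
        }
    }
    return offsets;
}
```

    An empty result means the string is clean; a single offset of 0 would match the leading \r described in the question.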

    Given what I think I know (which granted may not be much), I'm not even sure how you get those characters in there, unless they were pasted in somehow, or perhaps originated in a text-based format of the file (like MIF) when edited by some text editor.

    Just some thoughts, hope they are of some value.

    Russ

    eric247 (Author)
    Known Participant
    September 25, 2009

    Hi Russ

    Thanks for your replies. You always say you don't have any solid answers for me or "I'm no expert" and then go on to suggest something that solves my problem entirely. Your idea of stripping the carriage returns (which are indeed ASCII 13) worked perfectly, and it is the safer option for us to use, as it means we don't have to recalculate the positions of override beginning and end points. It had occurred to me earlier, but I just assumed they were actually serving a purpose on the page; your investigation proved otherwise. Hopefully this thread will save other people time.
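
    For anyone finding this later, the stripping step can be sketched roughly as below. This assumes the text is held in a std::string at that point; removeCarriageReturns is a hypothetical name and the real client code around it (FDK calls, CString conversions) is not shown:

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper (not an FDK function): returns a copy of `text`
// with every '\r' (ASCII 13) removed. '\n' and all other characters
// are kept as they are.
std::string removeCarriageReturns(std::string text) {
    // std::remove shifts the kept characters forward and returns the new
    // logical end; erase then trims the leftover tail.
    text.erase(std::remove(text.begin(), text.end(), '\r'), text.end());
    return text;
}
```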

    Thanks a lot

    Eric

    Brainiac
    September 25, 2009

    Well, that's good news once again. I just want to be careful about acting like I have all the answers here, because I've never really done anything quite like you are attempting. In reference to this issue, though, if the character is serving no textual or formatting purpose, I can't imagine why it needs to be there. I hope I am not wrong... I suppose it is possible that FrameMaker inserts them for some reason, but that doesn't make any sense in relation to anything I know about the application.

    Russ


    Once again, I just thought of something else after I clicked Post. If those characters are indeed extraneous, it is probably a good thing to get rid of them in general. It seems that FrameMaker is happy to ignore them, but in the future, if you decided to export/publish to another format, they might have some very undesirable effects. An example might be exporting to XML and publishing via HTML... who knows what a browser or other content-rendering app might think of them.

    Russ