Hello
I've encountered a strange issue with some strings when using the FrameMaker API. We have an FAPI client which translates FM documents by reading in their contents, replacing their text with the new translated text and writing back the changes.
However, I've found that some strings begin with a \r (carriage return) character when we read them in, and when we write them back out they begin with \r\n. This moves the text down a line and can break things like cross-references. I'm not exactly sure why this is happening. We're using Visual C++ and we do a lot of conversions between CString and StringT types, so maybe that conversion process is doing it.
To be honest, I find the occurrence of newlines and carriage returns in Frame strings a bit odd as the Frame API breaks the text in a paragraph into fragments for you anyway, so I would have thought the usage of such characters is redundant. This is a big assumption though so correct me if I'm wrong.
We've implemented a workaround whereby our program strips newlines out of strings before writing them back, but in reality all we need to do is strip leading newlines, i.e. those that come before the printable text in a string begins.
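To illustrate the leading-only strip, here is a minimal sketch, not the actual client code. It assumes the text has already been copied into a `std::string`; the conversion from `StringT` or `CString` is left out, and the helper name `StripLeadingLineBreaks` is hypothetical.

```cpp
#include <string>

// Hypothetical helper (not part of the FDK): returns a copy of `text`
// with any leading '\r' or '\n' characters removed. Line breaks that
// appear after the first other character are left untouched.
std::string StripLeadingLineBreaks(const std::string& text) {
    const std::string::size_type first = text.find_first_not_of("\r\n");
    if (first == std::string::npos) {
        return std::string();  // the string held nothing but line breaks
    }
    return text.substr(first);
}
```

Calling it on a string that starts with \r\n would return the same text without those two characters, while a string with a line break only in the middle would come back unchanged.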
I'm wondering what others have to say about this approach WRT its safety. Any opinions would be appreciated, thanks
Eric
Eric,
I did a little more experimentation, based on my previous comments (sorry, I should have done that before I clicked Post). When I programmatically add a \r to a paragraph, I see no effect in the layout. This is consistent with my suspicions so far. I'm really thinking that perhaps these characters are spurious and should be removed at the outset.
Russ
Hi Eric,
I don't have any solid answers for you, but you have roused my curiosity. Maybe with a bit of discussion I can help some. I am far from the expert on text management within a document, as a majority of my experience is with structural metadata work. However, I have done a fair amount of it.
I concur that the appearance of the \r character (assuming you mean ASCII 13) is odd. For a normal flowing paragraph, I don't think any of the text (FTI_String items) should have any of these characters. I believe that all types of line and paragraph breaks have their own text item types that control the layout within Frame, so they should not be picked up in an FTI_String item. I ran some experiments where I put my own soft returns and other anomalies in there using the normal interface, but still could not pick them up within FTI_String captures.
I agree that the addition of \n is probably a datatype conversion issue, so I'm thinking that maybe the real problem is the original existence of the \r characters. Do they appear to be doing anything in the document where you are finding them? If not, have you considered simply removing them before any further processing?
Given what I think I know (which granted may not be much), I'm not even sure how you get those characters in there, unless they were pasted in somehow, or perhaps originated in a text-based format of the file (like MIF) when edited by some text editor.
Just some thoughts, hope they are of some value.
Russ
Hi Russ
Thanks for your replies. You always say you don't have any solid answers for me or "I'm no expert" and then go on to suggest something which solves my problem entirely. Your idea of stripping the carriage returns (which are indeed ASCII 13) worked perfectly, and is the safer option for us to use as it means we don't have to recalculate the position of override beginning and end points. It had occurred to me earlier, but I just assumed they were actually serving a purpose on the page. Your investigation proved otherwise. Hopefully this thread will save other people time.
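For anyone who lands here later, carriage return stripping of this kind can be sketched roughly as follows. This is an illustration under assumptions rather than the code from our client: it works on a `std::string` copy of the text (the `StringT`/`CString` conversion is not shown), and the function name `StripCarriageReturns` is made up for the example.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper (not part of the FDK): returns a copy of `text`
// with every '\r' (ASCII 13) removed. All other characters, including
// '\n', are kept in their original order.
std::string StripCarriageReturns(std::string text) {
    // std::remove shifts the kept characters forward and returns the new
    // logical end; erase then trims the leftover tail.
    text.erase(std::remove(text.begin(), text.end(), '\r'), text.end());
    return text;
}
```

The string is taken by value so the caller's original is never modified, which makes it easy to compare the text before and after stripping.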
Thanks a lot
Eric
Well, that's good news once again. I just want to be careful about acting like I have all the answers here, because I've never really done anything quite like you are attempting. In reference to this issue, though, if the character is serving no textual or formatting purpose, I can't imagine why it needs to be there. I hope I am not wrong... I suppose it is possible that FrameMaker inserts them for some reason, but that doesn't make any sense in relation to anything I know about the application.
Russ
Once again, I just thought of something else after I clicked Post. If those characters are indeed extraneous, it is probably a good thing to get rid of them in general. It seems that FrameMaker is happy to ignore them, but in the future, if you decided to export/publish to another format, they might have some very undesirable effects. An example might be exporting to XML and publishing via HTML... who knows what a browser or other content-rendering app might think of them.
Russ
Hi Russ
I've implemented carriage return stripping and it seems to work without side effects.
I am inclined to share your view that these characters are spurious. It's hard to get a concrete answer really because all we have as testing fodder are sample files our clients send us, which limits both the size of our testing base and how representative it is of the type of files our customers are translating.
If I hit any regressions I'll let you know, but for now the signs are good
Thanks again
Eric