rombanks
Inspiring
July 11, 2013
Answered

Replacing all text in book with sample text

Hello fellows,

Is there a way to replace all the textual content in a book with some sample text using Extendscript?

If yes, could you please point me to the relevant APIs?

Thank you for your help in advance!

Correct answer 4everJang

Yes, the point that Rick was making crossed my mind, too. Using regular expressions, you can easily specify which characters should be replaced. Listing a number of same-width characters and replacing them with an "x" is as easy as this:

sTextString = sTextString.replace ( /[abcdefghknopqrsuvyz]/g, "x" );

This leaves the characters that are not listed as they are. How well this works depends somewhat on the fonts in use, as they might have deviating character widths for more characters than the ones I left out. But even if you only replace a couple of characters throughout the doc, it would be enough to get the desired effect.
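A minimal sketch of that idea in plain JavaScript, which should also run unchanged in ExtendScript. Note that replace returns a new string and leaves the original untouched, so the result has to be assigned. The function name maskSameWidth and the sample string are made up for illustration.

```javascript
// Hypothetical helper: replace a chosen set of lowercase letters with "x".
// Characters not listed in the class (for example i, j, l, m, t, w) stay as they are.
function maskSameWidth(sTextString) {
    // replace() returns a new string; the input string is not modified.
    return sTextString.replace(/[abcdefghknopqrsuvyz]/g, "x");
}

var sSample = "Sample text with a table.";
var sMasked = maskSameWidth(sSample);
// Each match becomes exactly one "x", so the character count never changes.
```

Because every match is swapped for a single character, the string keeps its length, which is what keeps offsets in the paragraph stable.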

The difficulty is getting to the text strings and also to replace them. There are many objects for which a GetText method exists, but you have to set the flags such that you actually get the right text strings out of that method. And then you have to figure out where in the doc the text string is, then delete the existing one and add the replacement. All of this has to be done without deleting any markers or anchors.
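One way to keep the "where is the string" bookkeeping apart from the actual document edits is to first turn the list of text items into a list of replacement ranges. The sketch below is plain JavaScript over plain objects. It assumes each text item exposes dtype, offset and sdata, and it uses a stand-in constant FTI_STRING where a real script would use Constants.FTI_String. The function name planReplacements is hypothetical.

```javascript
// Stand-in for Constants.FTI_String so the sketch runs outside FrameMaker.
// The numeric value is arbitrary and is an assumption of this example.
var FTI_STRING = 1;

// Hypothetical helper: given an array of text items, return one entry per
// string item with the range to replace and the masked replacement text.
function planReplacements(aItems) {
    var aPlan = [];
    for (var i = 0; i < aItems.length; i++) {
        var oItem = aItems[i];
        if (oItem.dtype !== FTI_STRING) {
            continue; // markers, anchors and other non-string items are skipped
        }
        var sNew = oItem.sdata.replace(/[a-z]/g, "x").replace(/[A-Z]/g, "X");
        aPlan.push({
            begOffset: oItem.offset,
            endOffset: oItem.offset + oItem.sdata.length,
            text: sNew
        });
    }
    return aPlan;
}
```

Since the replacement text has the same length as the original, the offsets of later items in the same paragraph stay valid while the plan is applied.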

Another approach might be to set the text location to the first character in the main flow and then walk through the entire document character by character, testing each one and replacing it where required. But I don't think you would get into tables with that method, as those are linked to the running text via an anchor in that running text. So the table cells will have to be processed separately. And if you do have a method to change text in table cells, you can use that same method on all paragraphs.

I think Rick has more experience in tweaking text strings. I usually work on structured documents and only handle the element objects and their hierarchy, not so much the text content of documents. Rick, do you have an approach to walk through all text strings in a document and replace them without breaking anything?

Ciao

Jang


OK, I have figured it out. Here is a script that works across all text in the main flow after opening the document.

var oDoc = app.ActiveDoc;
var oRange = oDoc.TextSelection;
// Start at the paragraph that holds the insertion point.
var oPgf = oRange.beg.obj;
var oTLoc1 = new TextLoc ( );
var oTLoc2 = new TextLoc ( );
var oTRange = new TextRange ( );
var sNewTxt;
var i;

while ( oPgf.ObjectValid ( ) )
{
          // Get all text items of the paragraph ( -1 sets every flag ).
          var oTexts = oPgf.GetText ( -1 );
          oTLoc1.obj = oPgf;
          oTLoc2.obj = oPgf;
          for ( i = 0; i < oTexts.length; i++ ) {
                    // Only string items are replaced; markers and anchors stay in place.
                    if ( oTexts[i].dtype == Constants.FTI_String ) {
                              oTLoc1.offset = oTexts[i].offset;
                              oTLoc2.offset = oTexts[i].offset + oTexts[i].sdata.length;
                              oTRange.beg = oTLoc1;
                              oTRange.end = oTLoc2;
                              // Select the string and delete it.
                              oDoc.TextSelection = oTRange;
                              oDoc.Clear ( 0 );
                              // The replacement has the same length, so later offsets stay valid.
                              sNewTxt = oTexts[i].sdata.replace ( /[a-z]/g, 'x' );
                              sNewTxt = sNewTxt.replace ( /[A-Z]/g, 'X' );
                              oDoc.AddText ( oTLoc1, sNewTxt );
                    }
          }
          oPgf = oPgf.NextPgfInDoc;
}


4everJang
Legend
July 14, 2013

Of course this is possible, but it might get very tricky depending on the structure of the document you want to process and on the type of sample text you want to replace the content with. If the sample text should more or less look like readable text, you have to figure out a way to cut that text up into substrings and place them in the right locations so that the end result looks like the sample you are trying to create. It is easier if the output can be complete nonsense, as you will not need to care about the length of the text strings that are replaced in each of the paragraphs or subparagraphs your script will find.
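To illustrate the "cut the sample text up into substrings" part, here is a small sketch in plain JavaScript. It returns a piece of sample text with exactly the requested length, wrapping around to the start of the sample when it runs out. The function name sliceSample and its parameters are hypothetical; how the pieces are then placed in the document is a separate problem.

```javascript
// Hypothetical helper: return nLen characters of sample text, starting at
// position nStart in sSample and wrapping around when the end is reached.
function sliceSample(sSample, nStart, nLen) {
    if (sSample.length === 0) {
        return ""; // nothing to cut from
    }
    var sOut = "";
    var nPos = nStart % sSample.length;
    while (sOut.length < nLen) {
        // Take as much as is still needed, or whatever is left before the end.
        var sChunk = sSample.substring(nPos, nPos + (nLen - sOut.length));
        sOut += sChunk;
        nPos = (nPos + sChunk.length) % sSample.length;
    }
    return sOut;
}
```

A script could keep a running start position and call this once per text string, passing the length of the string it is about to replace.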

There are lots of issues to handle. The Find and Replace function, described in the FrameMaker Scripting Guide, might be worth a look, but it is not exactly the easiest function to handle from a script. The other option is to walk through all paragraphs in the flow, then walk through all anchored frames to figure out if they have paragraphs in them, then walk through all table cells and process the text in those.

An important issue is what to do with the many markers and anchors that FrameMaker puts in the text flow. These may or may not take up space in the text (depending on the type of marker), and you do not want to remove all of them. You may want to remove the index and cross-reference markers, but certainly not the anchors for tables and such. Also, walking through all paragraphs in a flow does not lead you through the text that appears in tables or in anchored frames.
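For the table part, a separate walk over tables, rows and cells could look roughly like the sketch below. It is untested inside FrameMaker and assumes the object model exposes FirstTblInDoc, NextTblInDoc, FirstRowInTbl, NextRowInTbl, FirstCellInRow, NextCellInRow, FirstPgf and NextPgfInFlow, and that every object has an ObjectValid method. The function names isValid and forEachTablePgf are hypothetical, and table titles are not covered.

```javascript
// Hypothetical helper: true if the object exists and reports itself as valid.
function isValid(oObj) {
    return !!oObj && !!oObj.ObjectValid();
}

// Hypothetical helper: call fnVisit for every paragraph in every table cell.
// The property names are assumptions about the FrameMaker object model.
function forEachTablePgf(oDoc, fnVisit) {
    var oTbl = oDoc.FirstTblInDoc;
    while (isValid(oTbl)) {
        var oRow = oTbl.FirstRowInTbl;
        while (isValid(oRow)) {
            var oCell = oRow.FirstCellInRow;
            while (isValid(oCell)) {
                var oPgf = oCell.FirstPgf;
                while (isValid(oPgf)) {
                    fnVisit(oPgf); // process one cell paragraph
                    oPgf = oPgf.NextPgfInFlow;
                }
                oCell = oCell.NextCellInRow;
            }
            oRow = oRow.NextRowInTbl;
        }
        oTbl = oTbl.NextTblInDoc;
    }
}
```

The callback would receive each cell paragraph in turn, so the same routine that replaces the strings of a body paragraph can be reused for tables.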

Good luck

Jang

rombanks (Author)
Inspiring
July 14, 2013

Hi Jang,

I appreciate your response and detailed explanations!

I changed my mind. I would rather not replace the content with bogus text, as that may change the number of text lines in the files, and that is what I would like to avoid.

I would prefer to obfuscate the existing text without changing its length or flow.

Are you aware of any tools/APIs that can do that?

Thanks for your suggestions in advance!

4everJang
Legend
July 14, 2013

It would still come down to walking through the entire list of paragraphs (in all flows that appear on body pages), then through all table cells and also through any text appearing in anchored frames, although I assume that leaving the text in anchored frames (call-outs or sidehead notes) untouched might be acceptable.

I will have a look at the required loops and post a possible solution later today. I do not want to post code that I have not tested first. And this type of script might come in handy for some of my clients, too.

I am assuming that replacing every single character with an 'x' does the trick for you? It would not change the formatting or text flow.

Jang