A number of years ago a very useful Adobe Blog post was made describing how to write regular expressions for FrameMaker's Find & Replace function.
I wanted to read that Friday because I wanted to try my hand at writing an expression, and it's gone. Any chance of its contents being made available again?
That link now automatically redirects to the Adobe Blog TechComm page. I've scrolled back to entries from 2014, and this one isn't there.
Depending on the complexity, perhaps I can help you work it out, Lin.
I rely on our fellow ACP Peter Kahrel's book: https://www.amazon.com/GREP-InDesign-InDesignSecrets-Peter-Kahrel/dp/0982508387. It is written for InDesign, but regular expressions are regular expressions.
I appreciate the offer, but it was a short document so I brute-forced it. I was looking for a comma followed by a space followed by a capital letter (varying value) followed by a closed parenthesis, and I wanted to replace the space with a hard space. Regex would have allowed me to do a replace all instead of deciding for each individually but, as I said, short document. 🙂
All good now, but I recall that blog post being very informative about using regex in FM and I wish it weren't gone. I've been thinking that learning to use regex would be very, very useful, especially in situations like that.
Yes, you could have used regex for that situation and yes, it is very helpful for fixing anything that matches a regular pattern. Regex doesn't get much attention in FrameMaker or on the forums—it's the opposite of InDesign—but I find it invaluable for cleaning up and standardizing my documents.
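For the case Lin describes, a pattern along these lines might do it. This is only a sketch in plain JavaScript (run under Node.js), not something tested in FrameMaker's Find/Change dialog. The function name is made up, and U+00A0 stands in for the hard space, which FrameMaker may store differently internally.

```javascript
// Hypothetical helper: replace the space in ", X)" with a non-breaking space,
// where X is any single capital letter A-Z.
function hardSpaceBeforeLetterParen(text) {
  // Match ", " followed by one capital letter and a closing parenthesis.
  // The letter and parenthesis are captured so they can be put back as $1.
  return text.replace(/, ([A-Z]\))/g, ",\u00A0$1");
}

// Only ", A)" qualifies; ", b)" is left alone because b is lowercase.
console.log(hardSpaceBeforeLetterParen("see Table 3, A) and item, b)"));
```

The capture group is what makes a replace-all safe here: only the space changes, and whatever letter was matched is written back unchanged.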
You know what's really odd? The blog posts immediately before and after the one covering regex are still available on the TechComm blog.
«The blog posts immediately before and after the one covering regex are still available on the TechComm blog» Yes, after about 8 "show more ..." I got the same result.
Unfortunately, not everything written there holds up:
\s (white space) does not find n-space, m-space, required space, numeric space or TAB. It only finds ordinary blanks and hard and soft line breaks. The reason might be that the items it misses are not characters but FM functions… I have not yet tested everything, in particular not the Unicode character ranges.
Nevertheless the post is very useful, as it goes far beyond what is available in Help and covers things I had to find out by experiment for my project FMfindRepl.
Well, it is 7 years old. I'm sure a few things have changed in the intervening years.
But I'm glad I'm not the only one @Jeff_Coatsworth helped by finding the article.
Meantime, according to the FM help file:
By default, you use the Perl regular expression syntax to write regular expressions in FrameMaker. However, to use either the Grep or Egrep regular expression syntax, you need to update the Regular Expression Syntax flag in the maker.ini.
And according to the Perl site, you can use "\t" (I think not including the quote marks) for tabs and you can also use Unicode codes to find oddball characters.
Here's the direct link to the Perl regex site as given by the FM help file.
(I post the direct link here because if you click it in the Help file, the bloody thing opens in IE no matter what your default browser is set to. And then Microsoft nags to get you to switch to Edge.)
thin space: U+2009
non-breaking space: U+00A0
I like BabelMap for finding weird characters, and it's nice enough to also provide the Unicode values.
Thanks, Lin, for reminding us about the Unicode thingies.
It turns out that in FM, required space, thin space, numeric space, m- and n-space are inserted by ESC sequences or \x## items. These are not found by RegEx \s.
But if you use the Unicode characters for these white space items, they are not correctly handled in FM - e.g. on HTML5 output. On the other hand, the Unicodes are found by RegEx \s.
Another oddity is TAB. In FM the code \x08 is used (it's the ASCII backspace), which is not found by \s. The real ASCII TAB \x09 acts as a soft line break. This one is found by \s ...
And when it comes to Unicode characters, take a search for a lowercase alpha (\u03B1). If the character α is in the text, you can find it as such (with or without RegEx), but RegEx \u03B1 does not find the character α; it finds the string u03B1 ...
These were tests in FM-15, but IMHO it has been the same ever since we got the RegEx capability in Find/Replace.
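For comparison, here is how the same code points behave in a JavaScript regex engine (run under Node.js). It is only a sketch and proves nothing about FrameMaker's own engine. The labels simply repeat what was reported in this thread, and the helper name is made up.

```javascript
// Code points discussed in this thread, labelled with the roles reported above.
var samples = [
  { ch: "\x08", label: "\\x08 (FM tab code, ASCII backspace)" },
  { ch: "\x09", label: "\\x09 (ASCII tab)" },
  { ch: "\x20", label: "\\x20 (ordinary blank)" },
  { ch: "\u00A0", label: "U+00A0 (Unicode non-breaking space)" },
  { ch: "\u2009", label: "U+2009 (Unicode thin space)" }
];

// Hypothetical helper: true if the single character ch is matched by \s.
function isRegexWhitespace(ch) {
  return /^\s$/.test(ch);
}

for (var i = 0; i < samples.length; i++) {
  console.log(samples[i].label + ": " + isRegexWhitespace(samples[i].ch));
}
```

In this engine \x08 is not treated as white space, while \x09 and the two Unicode spaces are, which lines up with the FM observations above.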
FM-11: In XML view, search/replace supports regular expressions (.NET flavour)
FM-12: Find/replace with regular expression support (3 flavours: PERL, EGREP, GREP)
Due to these flavours we need to be careful with sophisticated RegExes...
Here is what I use to find leading and trailing spaces of any kind in FrameMaker:
// Leading and trailing spaces.
regex = /(?:^[\x08\x10-\x14\x20]|[\x08\x10-\x14\x20]$)/g;
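A quick way to try that pattern outside FrameMaker is sketched below in plain JavaScript (run under Node.js). The assumptions: the function name and sample string are made up, and \x10-\x14 is taken to cover FrameMaker's internal special-space codes, as the character class above implies. Without the m flag, ^ and $ anchor to the whole string, so this is meant to be run on one paragraph's text at a time.

```javascript
// Same pattern as above: one space-like character at the start or at the end.
var edgeSpace = /(?:^[\x08\x10-\x14\x20]|[\x08\x10-\x14\x20]$)/g;

// Hypothetical helper: strip all leading and trailing space-like characters
// from one paragraph's text. The pattern removes at most one character per
// edge on each pass, so repeat until nothing changes.
function trimEdgeSpaces(paragraphText) {
  var previous;
  var current = paragraphText;
  do {
    previous = current;
    current = current.replace(edgeSpace, "");
  } while (current !== previous);
  return current;
}

// Tab code plus blank at the start, blank plus \x11 at the end.
console.log(JSON.stringify(trimEdgeSpaces("\x08 Heading text \x11")));
```

Spaces inside the text are left alone, because the character class only matches when it sits directly at ^ or $.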