WolfShade
Legend
February 4, 2016
Question

REreplace[NoCase](): string only? Or can a function be the replacement?

Hello, all,

According to Adobe documentation, REreplace() (ergo, REreplaceNoCase(), too) requires three arguments: the target string; the regex match; and the value to replace that match with.  The docs also state that the third argument is a string.  Nothing is mentioned of using a function for the third argument.

But in JavaScript (ex: str.replace(x,y)), while the prefix str is the target string, the two arguments that go inside the parentheses are the regex or substring match, and the replacement - which can be either a string or a function.

The reason I bring this to attention is because in the past I have needed (and now would like) to write something that will convert HTML entities into the character said entity is supposed to represent.  But I have had no success by using just a string as the replacement.

For example:  Say I have a string, such as "This is a test&#59; this is only a test."  (Yeah, I know, VERY basic.)  Well, I don't always know that what I need to replace will be &#59;.  It could be &#XXXX;, for all I know.  But I do know that (at least most if not all the time) &#59; in CF is chr(59).

So I tried several variations of:

REreplaceNoCase( str, '&##(\d);', '#chr(\1)#', 'all' )  <!--- and more, but I'm kind of tired, so I'm only giving one example --->

None worked.  Then (today) it occurred to me - a function might be able to do it.  But the docs indicate that only a string can be the replacement for REreplace[NoCase]().

Frustrating.

Okay.. now I'm not sure where I'm going with this, coz it really isn't a question.  But I guess that I'm wondering if Adobe has any plans to update REreplace() so that a function can be used as the replacement.  Or, if anyone knows of an alternative.

V/r,

^_^

    5 replies

    Participating Frequently
    February 11, 2016

    <cfset esapi = createObject("java", "org.owasp.esapi.ESAPI") />

    <cfset foo = "&ndash;,&mdash;,&iexcl;,&iquest;,&quot;,&ldquo;&##9744;" />

    <cfoutput>

        #esapi.encoder().canonicalize(foo)#

    </cfoutput>

    pete_freitag
    Participating Frequently
    February 11, 2016

    Use the canonicalize function built into CF10+ (see "canonicalize Code Examples and CFML Documentation"), or use the method @c_wigginton posted for CF8-9, which include the ESAPI jars if fully patched.

    The canonicalize function can reverse HTML entities, URL encoding, and JavaScript character encoding. It can also deal with nested or mixed encoding, either by throwing an exception (since it usually signals an attack) or by attempting to handle it.
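
    As a rough sketch of what that call might look like on CF10 or later: the variable names below are made up for illustration, and both restrict flags are set to false here on the assumption that you would rather have mixed or nested encoding handled than have an exception thrown.

    ```cfml
    <!--- Sample input; the doubled pound sign is an escaped pound sign inside a CFML string --->
    <cfset encoded = "This is a test&##59; this is only a test.">

    <!--- canonicalize(input, restrictMultiple, restrictMixed) --->
    <cfset decoded = canonicalize(encoded, false, false)>

    <!--- Re-encode on output so the decoded text is displayed safely in the browser --->
    <cfoutput>#encodeForHTML(decoded)#</cfoutput>
    ```

    Setting either flag to true should instead make the call throw when it detects multiple or mixed encoding, which may be the safer choice for untrusted input.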

    BKBK
    Community Expert
    February 8, 2016

    WolfShade wrote:

    So I tried several variations of:

    REreplaceNoCase( str, '&##(\d);', '#chr(\1)#', 'all' )  <!--- and more, but I'm kind of tired, so I'm only giving one example --->

    None worked.

    A neat solution, borrowed from Java:

      <cfset transformedString = str.replaceAll("\&##(?<name1>\d+)(?<name2>;)", "${name1}")>

    Legend
    February 8, 2016

    Maybe something like this:

    public string function myCustomReplace( required string str ){
      // \d+ so that multi-digit entities (e.g. 8211 for an en dash) are matched too
      local.pattern = "&##(\d+);";
      local.items = reMatch(local.pattern,arguments.str);
      for(local.item in local.items){
       // pull the digits out of the matched entity
       local.replacementValue = val(reReplace(local.item,local.pattern,"\1"));
       arguments.str = replace(arguments.str,local.item,chr(local.replacementValue),"ALL");
      }
      return arguments.str;
    }

    Not tested; may have typos but you should be able to get the gist.
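
    Along the same lines, here is a rough sketch that also covers the hex form of numeric entities. The function name decodeNumericEntities and its structure are hypothetical, not something from the docs, and it only loops over the entities actually found in the string rather than over every possible code point. It is untested against every edge case, so treat it as a starting point.

    ```cfml
    <cfscript>
    // Hypothetical helper (name and structure are illustrative only).
    public string function decodeNumericEntities( required string str ){
      local.result = arguments.str;
      // Decimal form (digits only) or hex form (x plus hex digits);
      // digit counts are capped so the parsed value stays small.
      local.items = reMatchNoCase("&##(x[0-9a-f]{1,4}|\d{1,5});", local.result);
      for(local.item in local.items){
        // Drop the two leading characters and the trailing semicolon.
        local.body = mid(local.item, 3, len(local.item) - 3);
        if(left(local.body, 1) eq "x"){
          // Hex entity: parse everything after the x as base 16.
          local.code = inputBaseN(mid(local.body, 2, len(local.body) - 1), 16);
        } else {
          local.code = val(local.body);
        }
        // chr() only accepts 0-65535; skip 0 and anything out of range.
        if(local.code gt 0 and local.code lte 65535){
          local.result = replace(local.result, local.item, chr(local.code), "all");
        }
      }
      return local.result;
    }

    writeOutput(decodeNumericEntities("This is a test&##59; this is only a test."));
    </cfscript>
    ```

    Named entities such as &ndash; are not handled here, and entities outside the capped range are left in the string untouched.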

    BKBK
    Community Expert
    February 7, 2016

    You could just write a simple Coldfusion function that does what you want. Something like

    <cfset testString = "This is a test&##59; this is only a test.">
    <cfoutput>#replaceNoCaseCustom(testString)#</cfoutput>
    <cffunction name="replaceNoCaseCustom" returntype="string">
        <cfargument name="inputString" type="string" required="true">
        <cfset var outputString = arguments.inputString>
        <cfset var indx = 0>
        <!--- Replace each numeric entity from 58 through 62 with its character --->
        <cfloop from="58" to="62" index="indx">
            <!--- Search for the entity text itself (the number, not chr(indx)) --->
            <cfset outputString = replaceNoCase(outputString,"&##" & indx & ";", chr(indx), "all")>
        </cfloop>
        <cfreturn outputString>
    </cffunction>

    WolfShade
    Author
    Legend
    February 8, 2016

    Thanks, BKBK!  That is awesome.  However, I'm looking to replace ALL instances, not just chr(58) through chr(62).

    I've seen some like &#174; (registered), &#169; (copyright), &#8211; (en dash), &#8220; and &#8221; (left and right MS "smart" quotes), and more.  (Sometimes, the users will enter text into MS Word, then copy/paste that into the field.)

    If I were to loop that for everything, I think it would be pretty processor intensive.  The RegEx should be less strain on the server CPU (I think.)

    V/r,

    ^_^

    BKBK
    Community Expert
    February 10, 2016

    @Wolfshade

    What did you make of my last suggestion? Is that what you're looking for? It does solve your problem in a single line of code.

    Legend
    February 4, 2016

    I think that for a lot of functions like this, Adobe simply puts a thin wrapper over an underlying Java function. If Java has an equivalent, I'm sure CF won't be far behind. In the meantime - again, if Java has an equivalent to what you are asking - you can call Java directly (it might take a little reverse engineering).

    WolfShade
    Author
    Legend
    February 4, 2016

    Hi, Steve Sommers,

    Thanks for responding.  In most environments, that is possible.

    In a USG DoD environment, we are not permitted to create or directly access Java objects.

    MOST anything that can be done via CF TAG or CFSCRIPT, no problem.  But I can't do anything like:

    <cfset thisVariable = createObject("java","foo") />

    V/r,

    ^_^

    Carl Von Stetten
    Legend
    February 4, 2016


    @Wolfshade, since ColdFusion runs on top of Java (as a Java Servlet App), you have the ability to access native Java built-in functions and objects.  You are referring to loading additional Java (3rd party) objects (using CreateObject).
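
    To illustrate the distinction, here is a minimal sketch that calls java.lang.String methods on an ordinary CFML string without CreateObject. The variable names are only for illustration, and it assumes your sandbox settings permit calling methods on the underlying object, which may not be the case in every locked-down environment.

    ```cfml
    <cfset str = "This is a test&##59; this is only a test.">

    <!--- A CFML string literal is normally a java.lang.String underneath --->
    <cfset className = str.getClass().getName()>

    <!--- java.lang.String.replaceAll(regex, replacement); $1 is Java's backreference syntax --->
    <cfset bracketed = str.replaceAll("&##(\d+);", "[$1]")>

    <cfoutput>#className#: #bracketed#</cfoutput>
    ```

    This only rewraps the captured digits in brackets; it does not turn them into a character, since the replacement argument is still a plain string.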