
REreplace[NoCase](): string only? Or can a function be the replacement?

LEGEND, Feb 04, 2016

Hello, all,

According to Adobe documentation, REreplace() (ergo, REreplaceNoCase(), too) requires three arguments: the target string; the regex match; and the value to replace that match with.  The docs also state that the third argument is a string.  Nothing is mentioned of using a function for the third argument.

But in JavaScript (ex: str.replace(x,y)), while the prefix str is the target string, the two arguments that go inside the parentheses are the regex or substring match, and the replacement - which can be either a string or a function.

The reason I bring this to attention is because in the past I have needed (and now would like) to write something that will convert HTML entities into the character said entity is supposed to represent.  But I have had no success by using just a string as the replacement.

For example:  Say I have a string, such as "This is a test&#59; this is only a test."  (Yeah, I know, VERY basic.)  Well, I don't always know that what I need to replace will be &#59;.  It could be &#XXXX;, for all I know.  But I do know that (at least most if not all the time) &#59; in CF is chr(59).

So I tried several variations of:

REreplaceNoCase( str, '&##(\d);', '#chr(\1)#', 'all' )  <!--- and more, but I'm kind of tired, so I'm only giving one example --->

None worked.  Then (today) it occurred to me - a function might be able to do it.  But the docs indicate that only a string can be the replacement for REreplace[NoCase]().

Frustrating.

Okay.. now I'm not sure where I'm going with this, coz it really isn't a question.  But I guess that I'm wondering if Adobe has any plans to update REreplace() so that a function can be used as the replacement.  Or, if anyone knows of an alternative.

V/r,

^_^

Advocate, Feb 04, 2016

I think a lot of the functions like this Adobe simply puts a thin wrapper over an underlying java function. If java has an equivalent, I'm sure CF won't be long. In the meantime - again if java has an equivalent to what you are asking - you can call java directly (it might take a little reverse engineering).

LEGEND, Feb 04, 2016

Hi, Steve Sommers,

Thanks for responding.  In most environments, that is possible.

In a USG DoD environment, we are not permitted to create or directly access Java objects.

MOST anything that can be done via CF TAG or CFSCRIPT, no problem.  But I can't do anything like:

<cfset thisVariable = createObject("java","foo") />

V/r,

^_^

Guide, Feb 04, 2016

@Wolfshade, since ColdFusion runs on top of Java (as a Java Servlet App), you have the ability to access native Java built-in functions and objects.  You are referring to loading additional Java (3rd party) objects (using CreateObject).
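
To illustrate the distinction, here is a minimal sketch. It assumes that a CFML string variable is backed by java.lang.String (the usual case) and that the server does not block method calls on built-in types. The variable names are made up for the example. No createObject() call is involved; the Java methods are called on the string itself.

```cfml
<cfscript>
// An ordinary CFML string; no createObject("java", ...) is used anywhere below.
str = "This is a test; this is only a test.";

// java.lang.String methods called directly on the CFML variable.
writeOutput(str.toUpperCase() & "<br>");

// replaceAll() takes a Java regex; CFML strings do no backslash processing,
// so \b reaches the regex engine unchanged.
writeOutput(str.replaceAll("\btest\b", "drill") & "<br>");

// indexOf() is a Java method, so the position it returns is zero-based,
// unlike CFML's own find(), which is one-based.
writeOutput(str.indexOf("only"));
</cfscript>
```

Whether this is acceptable under a given site's security policy is a separate question from whether createObject("java", ...) is enabled.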

LEGEND, Feb 04, 2016

Hi, Carl Von Stetten,

SWEET!  I've never done that, though.  Do you know of any online tutorials for stuff like that?

V/r,

^_^

Community Expert, Feb 07, 2016

You could just write a simple ColdFusion function that does what you want. Something like

<cfset testString = "This is a test&##59; this is only a test.">
<cfoutput>#replaceNoCaseCustom(testString)#</cfoutput>

<cffunction name="replaceNoCaseCustom" returntype="string">
    <cfargument name="inputString" type="string" required="true">
    <cfset var outputString = arguments.inputString>
    <cfset var indx = 0>
    <!--- Replace each numeric entity from 58 through 62 with the character it stands for --->
    <cfloop from="58" to="62" index="indx">
        <!--- The entity contains the numeric code (indx), not the character itself --->
        <cfset outputString = replaceNoCase(outputString, "&##" & indx & ";", chr(indx), "all")>
    </cfloop>
    <cfreturn outputString>
</cffunction>

LEGEND, Feb 08, 2016

Thanks, BKBK!  That is awesome.  However, I'm looking to replace ALL instances, not just chr(58) through chr(62).

I've seen some like &#174; (registered), &#169; (copyright), &#8211; (en dash), &#8220; and &#8221; (left and right MS "smart" quotes), and more.  (Sometimes, the users will enter text into MS Word, then copy/paste that into the field.)

If I were to loop that for everything, I think it would be pretty processor intensive.  The RegEx should be less strain on the server CPU (I think.)
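
A minimal sketch of a single-pass alternative, assuming ColdFusion 10 or later (closures are needed for the callback). The names reReplaceCallback, groups, and decoded are hypothetical, not built-ins. The helper walks the string once with reFind(), hands each match to a callback, and stitches the results together, which emulates the function-as-replacement behaviour of JavaScript's replace() without evaluate() or createObject().

```cfml
<cfscript>
// Hypothetical helper (not a built-in): regex replace whose replacement text is
// computed by a callback. The callback receives an array in which element 1 is
// the whole match and elements 2..n are the captured subexpressions.
function reReplaceCallback(required string input, required string regex, required any callback) {
    var result = "";
    var start = 1;
    var inputLen = len(arguments.input);
    var m = {};
    var groups = [];
    var i = 0;

    while (start <= inputLen) {
        // With returnsubexpressions = true, reFind returns parallel "pos" and "len" arrays.
        m = reFind(arguments.regex, arguments.input, start, true);
        // Stop when nothing matches; also stop on an empty match to avoid looping forever.
        if (m.pos[1] == 0 || m.len[1] == 0) {
            break;
        }
        // Copy the text between the previous match and this one unchanged.
        if (m.pos[1] > start) {
            result &= mid(arguments.input, start, m.pos[1] - start);
        }
        // Collect the whole match and each subexpression ("" if a group did not take part).
        groups = [];
        for (i = 1; i <= arrayLen(m.pos); i++) {
            arrayAppend(groups, (m.pos[i] > 0 && m.len[i] > 0) ? mid(arguments.input, m.pos[i], m.len[i]) : "");
        }
        result &= arguments.callback(groups);
        start = m.pos[1] + m.len[1];
    }
    // Append whatever follows the last match.
    if (start <= inputLen) {
        result &= mid(arguments.input, start, inputLen - start + 1);
    }
    return result;
}

// Usage: decode decimal numeric entities in one pass.
testString = "Acme&##174; &##169; 2016 &##8211; &##8220;quoted&##8221;";
decoded = reReplaceCallback(testString, "&##(\d+);", function(groups) {
    var code = val(groups[2]);
    // Assumption: chr() accepts codes up to 65535; anything else is left untouched.
    return (code > 0 && code <= 65535) ? chr(code) : groups[1];
});
writeOutput(decoded);
</cfscript>
```

The sketch handles only decimal entities of the form matched by the pattern; named entities such as &ndash; and hexadecimal forms would need their own patterns or a lookup table.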

V/r,

^_^

Community Expert, Feb 09, 2016

@Wolfshade

What did you make of my last suggestion? Is that what you're looking for? It does solve your problem in a single line of code.

LEGEND, Feb 10, 2016

Apologies, BKBK.  I have not had a chance, yet, to implement it (this is for my side project at home).  I'm blown away by how simple it is.  Could you walk me through parts of it?

What are <name1> and <name2>?  Are those backreferences?

\&##?<name1>\d+?<name2>;

I can see the escaped hashmark for CF.  I'm assuming that you have to backslash escape ampersands in Java.  If this is RegEx, the question mark means "zero or one of the preceding character".  How do name1 and name2 play into this?  \d is the character number, obviously.

V/r,

^_^

Community Expert, Feb 10, 2016

Hi WolfShade,

The regex is actually ?<name>, taken together.

See for example Regex Tutorial - Named Capturing Groups - Backreference Names
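
A short sketch of the named-group syntax, assuming a Java 7 or later runtime underneath ColdFusion. It goes through Java's regex engine via the string's replaceAll() method; nothing here assumes that REReplace() accepts the same syntax. The sample text and group names are made up for illustration.

```cfml
<cfscript>
src = "Order 2016-02-04 shipped";

// (?<year>\d{4}) captures four digits under the name "year".
// In the replacement string, ${year} refers back to that group by name.
out = src.replaceAll(
    "(?<year>\d{4})-(?<month>\d{2})-(?<day>\d{2})",
    "${day}/${month}/${year}"
);

writeOutput(out); // Order 04/02/2016 shipped
</cfscript>
```

Note that the name and the pattern it labels sit together inside one pair of parentheses, with ?<name> immediately after the opening parenthesis.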

LEGEND, Feb 11, 2016

Thanks for the link, BKBK.  Sadly, DoD blocks all sites with .info TLDs.  I will Google for Named Capturing Groups.

V/r,

^_^

Community Expert, Feb 11, 2016

WolfShade wrote:

Sadly, DoD blocks all sites with .info TLDs.  I will Google for Named Capturing Groups.

@ WolfShade.

Never mind. You have to ignore my last (untested) attempt anyway. Here is a one-liner that works:

<cfset testString = "This is a test &##61; this is only a test. This is a test; &##60;this is only a test&##62;. This is a test &##64; this is only a test. &##40;This is a test; this is only a test.&##41;">
<cfset outputString = evaluate(de(testString.replaceAll("&##(?<name1>\d+);",'##chr(${name1})##')))>
<cfoutput>
<strong>testString:</strong> #testString#<br>
<strong>outputString:</strong> #outputString#
</cfoutput>

Output:

testString: This is a test &#61; this is only a test. This is a test; &#60;this is only a test&#62;. This is a test &#64; this is only a test. &#40;This is a test; this is only a test.&#41;<br>
outputString: This is a test = this is only a test. This is a test; <this is only a test>. This is a test @ this is only a test. (This is a test; this is only a test.)

LEGEND, Feb 11, 2016

Hi, c_wigginton and Pete_Freitag,

I think you can use canonicalize() without declaring the ESAPI, first.  I did try that, and for some reason it missed a few of my test strings (got most of them, but two or three slipped past.)  I also think that if you set the second and third arguments to true, it will not throw an exception when it finds nested HTML entities.  I'll give it another shot - maybe I missed something the first time.

BKBK, you had me up until I saw evaluate().  I never use eval(uate); not in CF, not in JavaScript.  As a tagline I once saw states:  "eval(x,y); - The Axis of Eval"

V/r,

^_^

Community Expert, Feb 11, 2016

WolfShade wrote:

BKBK, you had me up until I saw evaluate().    I never use eval(uate); not in CF, not in JavaScript.  As a tagline I once saw states:  "eval(x,y); - The Axis of Eval"

Fair enough. I remain with the satisfaction that it answers your original question in just one line of code.

LEGEND, Feb 11, 2016

BKBK wrote:

Fair enough. I remain with the satisfaction that it answers your original question in just one line of code.

True, dat!

Community Expert, Feb 11, 2016

Canonicalized some more. Handy function!

Community Expert, Feb 11, 2016

WolfShade wrote:

Hi, c_wigginton and Pete_Freitag,

I think you can use canonicalize() without declaring the ESAPI, first.  I did try that, and for some reason it missed a few of my test strings (got most of them, but two or three slipped past.)  I also think that if you set the second and third arguments to true, it will not throw an exception when it finds nested HTML entities.  I'll give it another shot - maybe I missed something the first time.

I also tried canonicalize (my opportunity to call it for the first time!). Using 2 'false' args solves your problem, too:

<cfset testString = "This is a test &##61; this is only a test. This is a test; &##60;this is only a test&##62;. This is a test &##64; this is only a test. &##40;This is a test; this is only a test.&##41;">
<cfset outputString = canonicalize(testString,false,false)>
<cfoutput>
<strong>testString:</strong> #testString#<br>
<strong>outputString:</strong> #outputString#
</cfoutput>

Advocate, Feb 08, 2016

Maybe something like this:

public string function myCustomReplace( required string str ){
  // \d+ so that multi-digit codes such as 8211 match, not just a single digit
  local.pattern = "&##(\d+);";
  // reMatch returns every matching entity as an array of strings
  local.items = reMatch(local.pattern,arguments.str);
  for(local.item in local.items){
   // Pull the numeric code out of the entity, then swap the entity for its character
   local.replacementValue = val(reReplace(local.item,local.pattern,"\1"));
   arguments.str = replace(arguments.str,local.item,chr(local.replacementValue),"ALL");
  }
  return arguments.str;
}

Not tested; may have typos but you should be able to get the gist.

Community Expert, Feb 08, 2016

WolfShade wrote:

So I tried several variations of:

REreplaceNoCase( str, '&##(\d);', '#chr(\1)#', 'all' )  <!--- and more, but I'm kind of tired, so I'm only giving one example --->

None worked.

A neat solution, borrowed from Java:

  <cfset transformedString = str.replaceAll("\&##?<name1>\d+?<name2>;", "${name1}")>

Engaged, Feb 11, 2016

<cfset esapi = createObject("java", "org.owasp.esapi.ESAPI") />
<cfset foo = "&ndash;,&mdash;,&iexcl;,&iquest;,&quot;,&ldquo;&##9744;" />
<cfoutput>
    #esapi.encoder().canonicalize(foo)#
</cfoutput>

Enthusiast, Feb 11, 2016

Use the canonicalize function built into CF10+ (see "canonicalize Code Examples and CFML Documentation"), or use the method @c_wigginton posted for fully patched CF8-9 (the patches include the ESAPI jars).

The canonicalize function can reverse HTML entities, URL encoding, and JavaScript character encoding. It can also deal with nested or mixed encoding, either by throwing an exception (since that usually signals an attack) or by attempting to handle it.
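
A small sketch of the three encodings mentioned above, assuming CF10 or later. The sample values are made up, and each is intended as an encoding of the same "<b>" fragment. Both restriction flags are set to false, as in the example earlier in the thread. No expected output is claimed here, so run it to see what your server returns.

```cfml
<cfscript>
// Sample inputs (made up for illustration): HTML entities, URL encoding,
// and JavaScript-style hex escapes.
samples = ["&##60;b&##62;", "%3Cb%3E", "\x3Cb\x3E"];

for (sample in samples) {
    // Second and third arguments: restrictMultiple and restrictMixed, both off here.
    decoded = canonicalize(sample, false, false);
    // encodeForHTML() makes the decoded markup visible in the browser instead of rendering it.
    writeOutput(encodeForHTML(sample) & " becomes " & encodeForHTML(decoded) & "<br>");
}
</cfscript>
```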
