Any regExp expert here?

Explorer, Dec 25, 2016

How would you handle one or more hyphens in words? For instance, "T-cell" is a word:

<cfset w = "T-cell">

REreplace("a long text string T-cell and more...", "#w#", "<a href=''>#w#</a>")

The REreplace call above failed to identify "T-cell" as one word. According to the Adobe documentation, a hyphen indicates a range such as [a-z], so it has to be placed at the end to be treated as a literal. I therefore tried the following:

REreplace("a long text string T-cell and more...", "#w#-", "<a href=''>#w#</a>")  or

REreplace("a long text string T-cell and more...", "[#w#-]", "<a href=''>#w#</a>")

However, both attempts misfired and caused serious problems. What's the correct regExp for this situation?

Many thanks

TOPICS
Advanced techniques

New Here, Dec 26, 2016

Sorry if I have misunderstood your question, but if your intention is to change the word "T-cell" into "<a href=''>T-cell</a>" in the given example, the first option you gave will work fine.

I could not find anything wrong with it, because a hyphen causes an issue only when it is inside square brackets.

So the piece of code

REreplace("a long text string T-cell and more...", "#w#", "<a href=''>#w#</a>")

will return

a long text string <a href=''>T-cell</a> and more... 

Please correct me if I have missed anything in your question.

Explorer, Dec 26, 2016

Well, I've solved the problem.

But to appreciate your response and for the benefit of others...

I initially simplified the problem statement. In the real code, I first read a simple HTML file into a variable and then ran

REreplace("#varName4longtext.", "#w#", "<a href=''>#w#</a>")

When the variable w contains "-", the REreplace statement above failed to match the value of w in its entirety (for example "T-cell"), hence the question.

Here is how I solved the problem after posting the question: I first replaced every occurrence of "-" with a placeholder such as DASH, then ran REreplace (plain replace would probably suffice as well), and finally changed the placeholder back to "-".
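
A minimal sketch of that placeholder workaround, for anyone who wants to see the steps in order. The names token, safeText, safeW, linked, and result are hypothetical, and the sketch assumes the placeholder string never occurs in the text on its own; if it did, the last step would wrongly turn it into a hyphen.

```cfml
<cfset w = "T-cell">
<cfset text = "a long text string T-cell and more...">
<!--- Hypothetical placeholder; assumed not to occur in text or w already --->
<cfset token = "DASH">
<!--- Step 1: swap every hyphen for the placeholder in both the text and the search word --->
<cfset safeText = replace(text, "-", token, "all")>
<cfset safeW = replace(w, "-", token, "all")>
<!--- Step 2: wrap each occurrence of the hyphen-free word in a link --->
<cfset linked = REReplace(safeText, safeW, "<a href=''>#safeW#</a>", "all")>
<!--- Step 3: turn the placeholder back into a hyphen --->
<cfset result = replace(linked, token, "-", "all")>
<cfoutput>#result#</cfoutput>
```

Because the search word is converted with the same placeholder as the text, the link text comes back as the original hyphenated word once step 3 has run.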

Community Expert, Dec 26, 2016

Thanks for sharing that. You were having trouble because the - is a special character in regex. You should therefore have escaped it with \.

Here is a suggestion:

<cfset w = "T-cell">

<cfset regex_w = "T\-cell">

...

<cfset newText = REreplaceNoCase(varName4longtext, regex_w, "<a href=''>#w#</a>", "all")>

Explorer, Dec 27, 2016

No. The suggestion of <cfset regex_w = "T\-cell"> does not make sense because, as I mentioned, w could be "T-cell" or anything else that contains one or more hyphens, so a manual escape won't work.

In addition, as mentioned in my follow-up post, I've solved the problem, albeit without using a regexp technique.

Community Expert, Dec 28, 2016

There has been a misunderstanding. In your original question you wished to replace the whole word T-cell.

Even considering the case where there is an arbitrary number of -, I see a problem. What you mentioned above as part of your code, REreplace("#varName4longtext.", "#w#", "<a href=''>#w#</a>"), is likely to be where things went wrong.

You might have been aiming for REreplace(varName4longtext, "#w#", "<a href=''>#w#</a>", "all").
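
If w can hold arbitrary text, one option is to escape regex metacharacters programmatically rather than by hand. The following is only a sketch under assumptions: regex_w reuses the variable name from the earlier suggestion, the set of characters being escaped is a guess at what needs escaping, and a backslash inside w would still need separate handling because w is reused in the replacement string.

```cfml
<cfset w = "T-cell">
<cfset text = "a long text string T-cell and more...">
<!--- Prefix each listed metacharacter in w with a backslash.
      In the replacement string, \\ is a literal backslash and \1 is the captured character. --->
<cfset regex_w = REReplace(w, "([\\\.\[\]\(\)\{\}\^\$\*\+\?\|\-])", "\\\1", "all")>
<!--- regex_w is now safe to use as a pattern; w itself is used as the link text --->
<cfset newText = REReplace(text, regex_w, "<a href=''>#w#</a>", "all")>
<cfoutput>#newText#</cfoutput>
```

When no pattern matching is needed at all, the plain replace or replaceNoCase functions avoid the escaping question entirely.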

New Here, Dec 27, 2016

In regex, a hyphen is treated as a special character only when it is part of a character class (that is, inside square brackets).

Even inside square brackets, there is no issue when it is at the beginning or end of the list of characters.

For example, [-abc] and [abc-]: in both of these the hyphen won't be treated as a special character.

Also, whether you use the long text directly or in a variable, there shouldn't be any difference in the behaviour of the ReReplace function.

I suspect something else caused the error in your first attempt with the ReReplace function.
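
To illustrate the character-class point, here is a small sketch using the strings from the original question. The variable names wholeWord and perChar are hypothetical. It contrasts the plain pattern with the bracketed one: once w is wrapped in square brackets the pattern becomes [T-cell-], where T-c is read as a range, so the expression matches single characters instead of the word.

```cfml
<cfset text = "a long text string T-cell and more...">
<cfset w = "T-cell">
<!--- Outside brackets the hyphen is literal, so this matches the whole word T-cell --->
<cfset wholeWord = REReplace(text, w, "<a href=''>#w#</a>", "all")>
<!--- Inside brackets the pattern is [T-cell-]: T-c is a range, and the class
      matches one character at a time (for example a, e, l, T and -) --->
<cfset perChar = REReplace(text, "[#w#-]", "*", "all")>
<cfoutput>#wholeWord#<br>#perChar#</cfoutput>
```

The second call replaces many individual characters across the whole string, which may be the kind of misfire described in the question.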

Community Expert, Dec 27, 2016

cp_anil@rediff wrote:

In regex, a hyphen is treated as a special character only when it is part of a character class (that is, inside square brackets).

Try this, and you will see that it works when you escape the - character:

<cfset regex_w = "T\-cell">

<cfset text = "Yes, T-cell is a match">

<cfset newText = REreplaceNoCase(text, regex_w, "the escaped regex")>

<cfoutput>#newtext#</cfoutput>

Also, whether you use the long text directly or in a variable, there shouldn't be any difference in the behaviour of the ReReplace function.

There is a difference in behaviour. ReReplace will match T-cell and T-Cell differently.
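
A short sketch of that case-sensitivity difference, with hypothetical variable names strict and relaxed and the replacement string "X" chosen only for illustration:

```cfml
<cfset text = "Yes, T-Cell is a match">
<!--- REReplace is case-sensitive: "T-cell" does not match "T-Cell", so text is returned unchanged --->
<cfset strict = REReplace(text, "T-cell", "X")>
<!--- REReplaceNoCase ignores case: "T-Cell" is matched and replaced --->
<cfset relaxed = REReplaceNoCase(text, "T-cell", "X")>
<cfoutput>#strict#<br>#relaxed#</cfoutput>
```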

Community Expert, Dec 26, 2016

ghfftttyyudsderycv76 wrote:

How would you handle one or more hyphens in words? For instance, "T-cell" is a word

That implies that you wish to replace T-cell with T*cell, where the * stands for a character other than - or no character at all. But then it seems from the rest of the text that that is not the question you wish to ask.

REreplace("a long text string T-cell and more...", "#w#", "<a href=''>#w#</a>")

As cp_anil@rediff correctly says, this suggests that you wish to replace T-cell with <a href=''>T-cell</a>. If that is the case, then you could avoid regular expressions altogether and use the simpler,

replaceNoCase("a long text string T-cell and more...", w, "<a href=''>#w#</a>", "all")
