Participating Frequently
February 24, 2011
Answered

RegExp - Swearing words filter - Exact match, no partial match

Hello,

I am working on a swear-word filter. It works fine, except that it replaces partial matches as well as exact whole-word matches. Let me explain.

I have a list of offensive words, but for the purpose of the example let's say that my list contains only one offensive word, and with it I build the following RegExp...

var regexpString:String = "(a55)";
var pattern:RegExp = new RegExp(regexpString, "gi");

...this filters as follows...

var str:String = "a55";
var filteredStr:String = str.replace(pattern, "***"); // gives "***", which is fine for me.

str = "helloa55";
filteredStr = str.replace(pattern, "***"); // gives "hello***", which is not what I am looking for.

...this is what I would like to do differently: when I pass "helloa55", I would like to get the word back unfiltered, because it is not an exact match of the offending word. I want helloa55, not hello***.

I know I could use a function that analyses the word and does all the necessary checks, but I was wondering if there is a simple and straightforward way of doing this, some kind of flag to add to the RegExp that will turn partial matching into exact matching.

Thank you.

This topic has been closed for replies.
Correct answer Kenneth Kawamoto

I love regex, but I'll tell you it can be slow, depending on how many words you're filtering.

kennethkawamoto2 has a good attempt with the regexp,

but what you should do is create an array of all the words you don't want.

Then you can loop through the array, using each word as a pattern. You can make sure only whole words match with \b, which stands for a word boundary.

Take "like you bleep." as an example. We want to get rid of bleep, and since it's a word on its own, we can use /\bbleep\b/gi as a pattern. Just iterate through all the words in an array, or better yet a Vector.<String>, substituting each word for bleep in the pattern.

I did not read more of what you were looking for, but if you need specifics I can help with that.
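
The loop described above could look roughly like the following. This is a minimal sketch in TypeScript rather than ActionScript 3 (the \b syntax and the "gi" flags are the same in both); the word list and the helper names `escapeForRegExp` and `censor` are made up for illustration and are not from the thread.

```typescript
// Hypothetical word list, for illustration only.
const blockedWords: string[] = ["a55", "bleep"];

// Escape characters that have a special meaning in a regular expression,
// so that a word is always matched literally.
function escapeForRegExp(word: string): string {
  return word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Replace every whole-word, case-insensitive occurrence of each blocked word.
function censor(text: string, words: string[]): string {
  let result = text;
  for (const word of words) {
    // \b on both sides means the word must not be glued to other word characters.
    const pattern = new RegExp("\\b" + escapeForRegExp(word) + "\\b", "gi");
    result = result.replace(pattern, "***");
  }
  return result;
}

console.log(censor("like you bleep.", blockedWords)); // like you ***.
console.log(censor("helloa55", blockedWords));        // helloa55
```

Because \b only sits between a word character and a non-word character, this assumes every blocked word starts and ends with a letter, digit or underscore.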


Yeah, \b is much simpler, I must say.

var _regExp:RegExp = /\b(key|keyer)\b/gi;
var str:String = "key and keyer and keyerrr";
trace(str.replace(_regExp, "***")); // *** and *** and keyerrr

2 replies

Community Expert
February 24, 2011

One way is to look for a word "a55" that is either at the beginning or preceded by a non-word character (such as a space) AND either at the end or followed by a non-word character. In other words, look for an "a55" that is not part of a larger word.

var str:String = "You are such an a55!";
trace(str.replace(/(^|\W)a55($|\W)/gi, "$1***$2"));
// You are such an ***!

str = "You are so a55uaring...!";
trace(str.replace(/(^|\W)a55($|\W)/gi, "$1***$2"));
// You are so a55uaring...!
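
One thing to watch with this kind of pattern: the \W on each side is consumed as part of the match, so two offending words separated by a single space can leave the second one untouched. The sketch below (TypeScript, same regex syntax; the function names are made up for illustration) compares it with a \b version, which matches a position rather than a character.

```typescript
// Pattern from the reply above: the neighbouring non-word characters are
// captured and put back with $1 and $2, but they are still consumed.
function filterConsuming(text: string): string {
  return text.replace(/(^|\W)a55($|\W)/gi, "$1***$2");
}

// Word-boundary variant: \b is zero-width, so nothing around the word is consumed.
function filterBoundary(text: string): string {
  return text.replace(/\ba55\b/gi, "***");
}

console.log(filterConsuming("You are such an a55!")); // You are such an ***!
console.log(filterConsuming("a55 a55"));              // *** a55
console.log(filterBoundary("a55 a55"));               // *** ***
```

In the second call the first match swallows the space, so when the search resumes, the second "a55" is neither at the start of the string nor preceded by an unconsumed non-word character.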

Maglez (Author)
Participating Frequently
February 25, 2011

This is looking better, but I still have a problem. I was hoping this behaviour would already be available through some flag, but I see it is not.

The issue I have now is that my RegExp won't find swear words that contain other swear words. Let's take 'key' and 'keyer' as swear words...

var _regExp:RegExp = /key|keyer/g;
var str:String = "key and keyer and keyerrr";
trace(str.replace(_regExp, "***"));

This traces out *** and keyer and keyerrr.

Any ideas on how to get *** and *** and keyerrr?

Your posts have been of great help, and I would really appreciate it if you could post some other ideas.
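
For background on why overlapping words are awkward: in an alternation the engine tries the alternatives from left to right and keeps the first one that matches, so without any boundary a short word such as key also matches at the start of keyer. The TypeScript sketch below (same regex syntax as ActionScript 3; the function names are illustrative only) shows how the unanchored pattern behaves when run exactly as written, how swapping the order changes that, and what adding word boundaries does.

```typescript
const sample = "key and keyer and keyerrr";

// Unanchored, shorter alternative first: "key" wins inside longer words.
function shortFirst(text: string): string {
  return text.replace(/key|keyer/g, "***");
}

// Unanchored, longer alternative first: "keyer" is preferred where it fits.
function longFirst(text: string): string {
  return text.replace(/keyer|key/g, "***");
}

// With word boundaries the order no longer matters: an alternative is only
// accepted if the whole word matches, otherwise the engine backtracks.
function wholeWords(text: string): string {
  return text.replace(/\b(?:key|keyer)\b/g, "***");
}

console.log(shortFirst(sample)); // *** and ***er and ***errr
console.log(longFirst(sample));  // *** and *** and ***rr
console.log(wholeWords(sample)); // *** and *** and keyerrr
```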

    Community Expert
    February 26, 2011
    var _regExp:RegExp = /(^|\W)(key|keyer)($|\W)/g;
    var str:String = "key and keyer and keyerrr";
    trace(str.replace(_regExp, "$1***$3"));

    Traces:

    *** and *** and keyerrr

    Participant
    February 24, 2011

    hi,

    I think you can use a Levenshtein distance in this case; it will retrieve the best matches within a list of needles (A55).

    In the case of "helloA55" it will return a perfect match, no matter what text is set before or after the needle "A55".

    check this, http://en.nicoptere.net/?p=854 (source code after the first flash animation )
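
For readers unfamiliar with the term, here is a minimal TypeScript sketch of the classic Levenshtein (edit) distance between two whole strings. It is not the code from the linked page, which may work differently (for example by searching for the needle inside a longer text); the function name `levenshtein` is simply the conventional one.

```typescript
// Number of single-character insertions, deletions and substitutions
// needed to turn string a into string b (classic dynamic programming).
function levenshtein(a: string, b: string): number {
  // previous[j] = distance between the first i-1 characters of a
  // and the first j characters of b.
  let previous: number[] = [];
  for (let j = 0; j <= b.length; j++) {
    previous.push(j);
  }
  for (let i = 1; i <= a.length; i++) {
    const current: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a.charAt(i - 1) === b.charAt(j - 1) ? 0 : 1;
      current.push(Math.min(
        previous[j] + 1,        // delete a character from a
        current[j - 1] + 1,     // insert a character into a
        previous[j - 1] + cost  // substitute, or keep if equal
      ));
    }
    previous = current;
  }
  return previous[b.length];
}

console.log(levenshtein("a55", "a55"));      // 0
console.log(levenshtein("a5s", "a55"));      // 1
console.log(levenshtein("helloa55", "a55")); // 5
```

A small distance to a blocked word can flag near-misspellings, but a threshold would have to be chosen, and unlike the \b approach this kind of fuzzy matching tends to catch more text rather than less.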