How would I strip out all the code and characters except for links using regex?
<font face="Arial" size="2"><a href="/Latitude-Longitude-545478-Louisiana-Achiquot_Bay.html">Achiquot Bay</a></font>
<font face="Arial" size="2">Bay
<font face="Arial" size="2">Avoyelles
<font face="Arial" size="2">LA
<font color="CCDFCA"></font>
<font face="Arial" size="2"><a href="/Latitude-Longitude-1629555-Louisiana-Adolph_Clarks_Pond__historical_.html">Adolph Clarks Pond (historical)</a></font>
<font face="Arial" size="2">Bay
<font face="Arial" size="2">Plaquemines
<font face="Arial" size="2">LA
<font color="CCDFCA"></font>
If you want to avoid writing your own regex and use one of the HTML tag-stripping functions at CFLIB.ORG instead, do two replaceNoCase() calls first: one to change "<a href" to something like "$a href", and another to turn "</a>" into "$/a>". Then run the function, and afterwards do the inverse replaceNoCase() calls to get the tag syntax back. You can in fact strip the non-href tags with a regex, but it's going to get messy.
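Roughly, the mask / strip / unmask steps might look like the sketch below. It assumes the markup is already in variables.html, and it uses a plain REReplace() as a stand-in for whichever CFLIB function you pick (swap in the real call at step 2). The "$a href" and "$/a>" markers and the variable names masked, stripped and links are only illustrative; pick markers that cannot occur in your data.

```cfml
<!--- Assumption: variables.html holds the markup from the question. --->

<!--- 1. Mask the anchor tags so the stripper no longer sees them as tags. --->
<cfset variables.masked = replaceNoCase(variables.html, "<a href", "$a href", "all") />
<cfset variables.masked = replaceNoCase(variables.masked, "</a>", "$/a>", "all") />

<!--- 2. Strip every remaining tag. Stand-in only: replace this line
         with the CFLIB tag-stripping function of your choice. --->
<cfset variables.stripped = REReplace(variables.masked, "<[^>]*>", "", "all") />

<!--- 3. Undo the masking to get the anchor tags back. --->
<cfset variables.links = replaceNoCase(variables.stripped, "$a href", "<a href", "all") />
<cfset variables.links = replaceNoCase(variables.links, "$/a>", "</a>", "all") />

<cfdump var="#variables.links#" />
```

Note that this only removes tags. Plain text outside the links, such as "Bay" or "Avoyelles" in the sample, is left in place and would need a separate pass.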
-reed
Does this work?
<cfset variables.m=REMatchNoCase("<a\s.*?<\/a>",variables.html) />
<cfdump var="#variables.m#" />
I wouldn't normally just dish out an answer on these forums, but that seemed like an interesting regex challenge, so I gave it a bash, and came up with this:
(?!</?\ba\b[^>]*?>)</?[a-z]+?[^>]*?>
I've given it superficial testing, and it seems to do the trick.
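To try it, something along these lines should do, assuming the markup is in variables.html (the names pattern and linksOnly are just for illustration). The negative lookahead skips anything that is an opening or closing a tag, and the rest of the pattern matches every other tag so it can be replaced with an empty string:

```cfml
<!--- Assumption: variables.html holds the markup from the question. --->
<cfset variables.pattern = "(?!</?\ba\b[^>]*?>)</?[a-z]+?[^>]*?>" />

<!--- Remove every tag the pattern matches; the a tags are left alone. --->
<cfset variables.linksOnly = REReplaceNoCase(variables.html, variables.pattern, "", "all") />

<cfdump var="#variables.linksOnly#" />
```

Like any tag-only approach, this leaves the text between the removed tags (the feature type, parish and state columns in the sample) untouched. If only the links themselves are wanted, extracting them with REMatchNoCase() is probably the simpler route.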
--
Adam