January 13, 2011
Extract email address from html

Hi,

I am trying to extract an email address from the HTML output of a query. How would I do that?

I am on CF9.

example:

Query col1:

<html><head></head><body>today they emailed about it from (mailto:xxx@hotmail.com) ...hello there and here</body></html>


2 replies

February 9, 2011

Here's a function I wrote for use on some of our CF sites:

<cffunction access="public" name="isEmailAddressValid" returntype="boolean">
    <cfargument name="email" type="string" required="yes">

    <!--- The pattern is anchored with ^ and $, so it tests the whole string rather than searching inside it --->
    <cfreturn refindnocase("^([_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*\.(([a-z]{2,3})|(aero|coop|info|museum|name)))?$",arguments.email) neq 0>

</cffunction>

This should get you started. Along with the referenced CF documentation, you should be able to use it to extract an address.

Good luck!

bh

February 9, 2011

Argh!  No!

God I hate it when people knock together a regex like this and go "Look!  Email address validation!"

Before one starts down this road, one should read the RFC (http://tools.ietf.org/html/rfc5322, summarised here: http://en.wikipedia.org/wiki/Email_address).

Your own regex fails my spamtrap email address (for example: adam.cameron.signup+adobeforums@gmail.com), because you've forgotten that a + is a legitimate character in the local part of an email address.  Along with a bunch of other completely legit characters.
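A quick way to see this is the sketch below, which applies the pattern from the function above directly with reFindNoCase(). The variable names are illustrative only. The first address should report Yes and the second, with the + in the local part, should report No.

```cfml
<!--- Pattern copied unchanged from the isEmailAddressValid function above --->
<cfset pattern = "^([_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+)*\.(([a-z]{2,3})|(aero|coop|info|museum|name)))?$">

<!--- Illustrative test values: one plain address, one with a + in the local part --->
<cfset plain = "adam@notmyaddress.com">
<cfset tagged = "adam.cameron.signup+adobeforums@gmail.com">

<cfoutput>
    <!--- reFindNoCase() returns 0 when there is no match, which yesNoFormat() renders as No --->
    #plain#: #yesNoFormat(reFindNoCase(pattern, plain))#<br>
    #tagged#: #yesNoFormat(reFindNoCase(pattern, tagged))#<br>
</cfoutput>
```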

Reading on through the RFC you will realise that ANYTHING is valid in the local part of an email address, provided it's quoted (double-quote being another character your regex doesn't accept).

If someone doesn't want to give you their valid email address, they won't.  I can give you adam@notmyaddress.com, and that will pass.  If I do want to give you my address, you should make sure your code will actually accept it!

I can understand wanting to make sure the punter doesn't key their email address in incorrectly, but your method doesn't help here.  It'd pass adan@ismyaddress.com, despite the fact that it should be adam@ismyaddress.com.  "Close" is not good enough in these cases.

The only sensible way of doing this is to ask them to type it in twice.  This will assist people who don't just roll their eyes and copy and paste what they typed in the first box into the second box, wondering why you're wasting their time.  For those who do copy and paste, the typo is simply transferred, so it's no help.

If you really want to get a person's email address, deprive them of something until they respond to an email that you send them, at the email address they specified.  If they respond, they actually don't mind you having their email address.  This only works if you're not simply trying to harvest email addresses for your own benefit, and not the benefit of your subscribers.

Bottom line: email address validation is a mug's game, and one not often played by people who know the rules.

--
Adam

February 9, 2011

Listen, congrats on your thesis, man.

My function will get him started; you've yet to provide anything to help get the guy going.

He's asking about EXTRACTING email addresses from a lengthy string of HTML.

Your advice on "entering twice" is moot in this regard.

Instead of getting excited about my apparently insufficient regex, why don't you read the original request and try HELPING?

ilssac
January 13, 2011

Regular Expressions are often the tool to use for that kind of string manipulation.

ColdFusion has the reFind() and reReplace() functions to tap into a large part of the power of Regular Expressions.

emmim44 (Author)
January 14, 2011

I cannot set up the regular expression. I need a sample.

January 14, 2011

Here are some resources to help get you started using regular expressions:

The CF documentation

http://help.adobe.com/en_US/ColdFusion/9.0/Developing/WSc3ff6d0ea77859461172e0811cbec0a38f-7ffb.html

Tutorial website

http://www.regular-expressions.info/

Ben Forta's ColdFusion books cover regular expressions, at least in the CF 6, 7, and 8 editions that I own.

http://www.forta.com/books/
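
For the example in the original question, a minimal sketch might look like the following. It assumes CF8 or later (so CF9 is fine), where reMatchNoCase() returns an array of every substring that matches a pattern. The variable names and the pattern itself are illustrative only. The pattern is deliberately loose and does not try to cover everything the RFC allows, such as quoted local parts.

```cfml
<!--- Sample value taken from the question; in practice this would come from the query column --->
<cfset html = '<html><head></head><body>today they emailed about it from (mailto:xxx@hotmail.com) ...hello there and here</body></html>'>

<!--- Loose, unanchored pattern (an assumption, not RFC 5322 complete):
      local part, then @, then a domain with at least one dot --->
<cfset emailPattern = "[a-z0-9._%+-]+@[a-z0-9-]+(\.[a-z0-9-]+)+">

<!--- reMatchNoCase() returns an array of all matches, or an empty array if there are none --->
<cfset addresses = reMatchNoCase(emailPattern, html)>

<cfoutput>Found #arrayLen(addresses)# address(es)<br></cfoutput>

<cfloop array="#addresses#" index="address">
    <cfoutput>#address#<br></cfoutput>
</cfloop>
```

For the sample string this should find a single match, xxx@hotmail.com. To run it against real data, set html from the column value inside a cfloop over the query (for example a hypothetical myQuery.col1).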