Get Domain

  • Hi Guys/Ladies.

    So I have a few ways of doing it but I would like to know the BEST, FASTEST way of retrieving the domain from an email address. Though this is a small example in need to find the domain for about a mil emails a day.

    Here is an example to get started.

    SELECT 'a@abc.com' Email

    INTO #emails

    UNION ALL

    SELECT 'a@abc.com'

    UNION ALL

    SELECT 'a@aasdf.com'

    UNION ALL

    SELECT 'sfawsed@dfsdf.com'

    UNION ALL

    SELECT 'asdfasdfasdf@asfdsbc.com'

    UNION ALL

    SELECT 'dffa@abc.com'

    Thanks in advance

    --------------------------------------------------------------------------------------------------------------------------------------------------------
    To some it may look best but to others it might be the worst, but then again to some it might be the worst but to others the best.
    http://www.sql-sa.co.za

  • What's the domain exactly? Everything after the @?

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Yes after the @

    --------------------------------------------------------------------------------------------------------------------------------------------------------
    To some it may look best but to others it might be the worst, but then again to some it might be the worst but to others the best.
    http://www.sql-sa.co.za

  • SELECT Email = 'a@abc.com'

    INTO #emails

    UNION ALL

    SELECT 'a@abc.com'

    UNION ALL

    SELECT 'a@aasdf.com'

    UNION ALL

    SELECT 'sfawsed@dfsdf.com'

    UNION ALL

    SELECT 'asdfasdfasdf@asfdsbc.com'

    UNION ALL

    SELECT 'dffa@abc.com';

    SELECT REVERSE(SUBSTRING(REVERSE(Email),1,CHARINDEX('@',REVERSE(Email))-1))

    FROM #emails;

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Thank you Will try and test the performance

    --------------------------------------------------------------------------------------------------------------------------------------------------------
    To some it may look best but to others it might be the worst, but then again to some it might be the worst but to others the best.
    http://www.sql-sa.co.za

  • I hope you've already validated the format of the email address in your table. Koen's solution is probably (I haven't tested against other possibilities) screaming fast, but it will only pull the last domain from the address. For example, if you have a_user@123@abc.com (which is invalid), it will pull abc.com.

    This is probably what you're after, but if you don't have your address format validated, you could end up with something you don't expect. Just be aware of it before it happens so you're not surprised later.

  • Ed Wagner (11/12/2014)


    I hope you've already validated the format of the email address in your table. Koen's solution is probably (I haven't tested against other possibilities) screaming fast, but it will only pull the last domain from the address. For example, if you have a_user@123@abc.com (which is invalid), it will pull abc.com.

    This is probably what you're after, but if you don't have your address format validated, you could end up with something you don't expect. Just be aware of it before it happens so you're not surprised later.

    Yeah, I didn't include data validation checks in my code, because I didn't want to go down that dirty road 😀

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Yes the email is validated.

    In a perfect world I just want the best and fastest way getting the domain info

    --------------------------------------------------------------------------------------------------------------------------------------------------------
    To some it may look best but to others it might be the worst, but then again to some it might be the worst but to others the best.
    http://www.sql-sa.co.za

  • Koen Verbeeck (11/12/2014)


    Ed Wagner (11/12/2014)


    I hope you've already validated the format of the email address in your table. Koen's solution is probably (I haven't tested against other possibilities) screaming fast, but it will only pull the last domain from the address. For example, if you have a_user@123@abc.com (which is invalid), it will pull abc.com.

    This is probably what you're after, but if you don't have your address format validated, you could end up with something you don't expect. Just be aware of it before it happens so you're not surprised later.

    Yeah, I didn't include data validation checks in my code, because I didn't want to go down that dirty road 😀

    I know exactly what you mean. I've tried several approaches and with all the rules, it gets complicated rather quickly. In concept, you'd think that something as simple as an email address would be simple to validate, but it isn't so in practice.

  • Daniel Matthee (11/12/2014)


    Yes the email is validated.

    In a perfect world I just want the best and fastest way getting the domain info

    I've compared Koen's approach with the famous DelimitedSplit8K ITVF using a one million row test table. The difference is sub-second, but Koen's approach is just slightly faster. On my server, I was looking at about 4 seconds to split 1M rows using a Varchar(1000) with a maximum length of 18. While it's always best to conduct your own benchmarking, I'd say you can feel pretty safe with it.

  • If the email address are, indeed, prevalidated, then there's no need for a triple reverse nor any use of DelimitedSplit8K, which actually returns too much information in this case. Try the following, instead.

    SELECT SUBSTRING(Email,CHARINDEX('@',Email)+1,8000)

    FROM #emails

    ;

    Heh... I think it was Albert Einstein that said something like "Make it as simple as possible but no simpler". 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (11/12/2014)


    If the email address are, indeed, prevalidated, then there's no need for a triple reverse nor any use of DelimitedSplit8K, which actually returns too much information in this case. Try the following, instead.

    SELECT SUBSTRING(Email,CHARINDEX('@',Email)+1,8000)

    FROM #emails

    ;

    Heh... I think it was Albert Einstein that said something like "Make it as simple as possible but no simpler". 😛

    Right, I was too lazy to calculate the length of the substring that needed to be extracted (which is why I went with the REVERSE trick), but I forgot you can just specify a very big integer and SQL Server will just take it to the end 😀

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply