November 12, 2014 at 5:32 am
Hi Guys/Ladies.
So I have a few ways of doing it but I would like to know the BEST, FASTEST way of retrieving the domain from an email address. Though this is a small example in need to find the domain for about a mil emails a day.
Here is an example to get started.
SELECT 'a@abc.com' Email
INTO #emails
UNION ALL
SELECT 'a@abc.com'
UNION ALL
SELECT 'a@aasdf.com'
UNION ALL
SELECT 'sfawsed@dfsdf.com'
UNION ALL
SELECT 'asdfasdfasdf@asfdsbc.com'
UNION ALL
SELECT 'dffa@abc.com'
Thanks in advance
--------------------------------------------------------------------------------------------------------------------------------------------------------
To some it may look best but to others it might be the worst, but then again to some it might be the worst but to others the best.
http://www.sql-sa.co.za
November 12, 2014 at 5:35 am
What's the domain exactly? Everything after the @?
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
November 12, 2014 at 5:41 am
Yes after the @
--------------------------------------------------------------------------------------------------------------------------------------------------------
To some it may look best but to others it might be the worst, but then again to some it might be the worst but to others the best.
http://www.sql-sa.co.za
November 12, 2014 at 5:48 am
SELECT Email = 'a@abc.com'
INTO #emails
UNION ALL
SELECT 'a@abc.com'
UNION ALL
SELECT 'a@aasdf.com'
UNION ALL
SELECT 'sfawsed@dfsdf.com'
UNION ALL
SELECT 'asdfasdfasdf@asfdsbc.com'
UNION ALL
SELECT 'dffa@abc.com';
SELECT REVERSE(SUBSTRING(REVERSE(Email),1,CHARINDEX('@',REVERSE(Email))-1))
FROM #emails;
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
November 12, 2014 at 5:50 am
Thank you Will try and test the performance
--------------------------------------------------------------------------------------------------------------------------------------------------------
To some it may look best but to others it might be the worst, but then again to some it might be the worst but to others the best.
http://www.sql-sa.co.za
November 12, 2014 at 6:07 am
I hope you've already validated the format of the email address in your table. Koen's solution is probably (I haven't tested against other possibilities) screaming fast, but it will only pull the last domain from the address. For example, if you have a_user@123@abc.com (which is invalid), it will pull abc.com.
This is probably what you're after, but if you don't have your address format validated, you could end up with something you don't expect. Just be aware of it before it happens so you're not surprised later.
November 12, 2014 at 6:09 am
Ed Wagner (11/12/2014)
I hope you've already validated the format of the email address in your table. Koen's solution is probably (I haven't tested against other possibilities) screaming fast, but it will only pull the last domain from the address. For example, if you have a_user@123@abc.com (which is invalid), it will pull abc.com.This is probably what you're after, but if you don't have your address format validated, you could end up with something you don't expect. Just be aware of it before it happens so you're not surprised later.
Yeah, I didn't include data validation checks in my code, because I didn't want to go down that dirty road 😀
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
November 12, 2014 at 6:15 am
Yes the email is validated.
In a perfect world I just want the best and fastest way getting the domain info
--------------------------------------------------------------------------------------------------------------------------------------------------------
To some it may look best but to others it might be the worst, but then again to some it might be the worst but to others the best.
http://www.sql-sa.co.za
November 12, 2014 at 6:29 am
Koen Verbeeck (11/12/2014)
Ed Wagner (11/12/2014)
I hope you've already validated the format of the email address in your table. Koen's solution is probably (I haven't tested against other possibilities) screaming fast, but it will only pull the last domain from the address. For example, if you have a_user@123@abc.com (which is invalid), it will pull abc.com.This is probably what you're after, but if you don't have your address format validated, you could end up with something you don't expect. Just be aware of it before it happens so you're not surprised later.
Yeah, I didn't include data validation checks in my code, because I didn't want to go down that dirty road 😀
I know exactly what you mean. I've tried several approaches and with all the rules, it gets complicated rather quickly. In concept, you'd think that something as simple as an email address would be simple to validate, but it isn't so in practice.
November 12, 2014 at 6:41 am
Daniel Matthee (11/12/2014)
Yes the email is validated.In a perfect world I just want the best and fastest way getting the domain info
I've compared Koen's approach with the famous DelimitedSplit8K ITVF using a one million row test table. The difference is sub-second, but Koen's approach is just slightly faster. On my server, I was looking at about 4 seconds to split 1M rows using a Varchar(1000) with a maximum length of 18. While it's always best to conduct your own benchmarking, I'd say you can feel pretty safe with it.
November 12, 2014 at 12:41 pm
If the email address are, indeed, prevalidated, then there's no need for a triple reverse nor any use of DelimitedSplit8K, which actually returns too much information in this case. Try the following, instead.
SELECT SUBSTRING(Email,CHARINDEX('@',Email)+1,8000)
FROM #emails
;
Heh... I think it was Albert Einstein that said something like "Make it as simple as possible but no simpler". 😛
--Jeff Moden
November 12, 2014 at 1:33 pm
Jeff Moden (11/12/2014)
If the email address are, indeed, prevalidated, then there's no need for a triple reverse nor any use of DelimitedSplit8K, which actually returns too much information in this case. Try the following, instead.
SELECT SUBSTRING(Email,CHARINDEX('@',Email)+1,8000)
FROM #emails
;
Heh... I think it was Albert Einstein that said something like "Make it as simple as possible but no simpler". 😛
Right, I was too lazy to calculate the length of the substring that needed to be extracted (which is why I went with the REVERSE trick), but I forgot you can just specify a very big integer and SQL Server will just take it to the end 😀
Need an answer? No, you need a question
My blog at https://sqlkover.com.
MCSE Business Intelligence - Microsoft Data Platform MVP
Viewing 12 posts - 1 through 11 (of 11 total)
You must be logged in to reply to this topic. Login to reply