Data mining

  • Hey guys,

    We have a system that was designed incorrectly and has bad data in it...

    SO I am writing queries to get the bad data...However this one has me stumped...

    Im trying to write a query that will check last names to see if any names have anything other that alpha characters in it... Ex. ab123cd

    Yes we have seen where somebody enters a last name with numbers in it...

    How do I write my WHERE clause to check for this?

    Thanks in advance...

  • select * from yourTable

    where Last_Name like '%[^a-z]%'

    However, this will include names with spaces in them as well, which you may also have. If you don't want those included, you will have to add another statement in the where clause to take care of that.

  • Thank you for your response!

  • I'm sure there will be more efficient ways, but something like this can get the job done.

    CREATE TABLE #Customers

    (

    LastName VARCHAR(30)

    ) ;

    INSERT #Customers

    VALUES ('Vanga') ;

    INSERT #Customers

    VALUES ('Va12gf3') ;

    SELECT LastName AS LastNameWithNumbers

    FROM #Customers

    WHERE LastName like '%[0-9]%'

    OR LastName like '%[0-9]'

    OR LastName like '[0-9]%' ;

  • So if Im looking for those that do have a number or some other character in it, do I need to put NOT LIKE instead of LIKE?

  • samvanga's method will get you the names that only have numerics in them if that is what you want. You would also only need the first statement in the where clause, the additional 2 are included in the first.

    If there are other special characters you would like the query to ignore that are common in last names like O'neil and Smith-Jones, you can add those characters in the brackets of my first query.

    select * from yourTabl

    where Last_Name like '%[^a-z-'']%'

  • The ^ sign at the beginning of the square brackets means not these characters. So where Last_Name like %[^a-z]% means find all last names that have any character that is not a through z.

  • @roryp 96873

    So do I use NOT LIKE instead of LIKE since Im trying to find those with special characters?

  • Oh ok...didnt know that!

    Thanks

  • It does get a little confusing with like, not like, and the ^ sign. My best suggestion would be to make a temp table with some examples of last names with different characters and play with the different combinations in your query until you get what you like. If you have quite a few names in your table I am sure you will run into some cases where there are characters other than alphas that are valid last names.

  • I've had to do this type of data scrubbing before. It can be a nightmare. I was able to take care of a lot of my problems using some sort of variation of substring and charindex. Patindex may be handy to you as well. Good luck!

  • How would you account for spaces in the last name?

  • Use the same code I had in my first post here:

    select * from yourTable

    where Last_Name like '%[^a-z]%'

    but instead put a space after the z:

    select * from yourTable

    where Last_Name like '%[^a-z ]%'

    If you want to account for any other characters that can normally occur in a last name, like hyphens, just put them after the space in the brackets:

    select * from yourTable

    where Last_Name like '%[^a-z -]%'

    Will get you all records with Last_Name that has at least one character that is not an alpha a-z, a space, or a hyphen.

  • Thank you very much for your help on this query! I really do appreciate it!

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply