One-step data scrubbing

  • I'm working on a script to clean up data in an old access table before importing to SQL Server. I want to know if it is possible to apply the CONVERT function to the result of a CASE statement in a single step.

    I have this CASE statement to change a string field with abbreviated dates (Dec. 25, 1990, e.g.) to full text date (December 25, 1990). With the data in this format I can CONVERT to a smalldatetime data type.

    Here is the CASE statement:

    select 

    CASE

     WHEN LEFT(birth_date, 3) = 'dec' then replace(birth_date,'dec.','December')

     WHEN LEFT(birth_date, 4) = 'feb.' then replace(birth_date,'feb.','February')

     WHEN LEFT(birth_date, 3) = 'feb' then replace(birth_date,'feb','February')

     WHEN LEFT(birth_date, 3) = 'aug' then replace(birth_date,'aug.','August')

     WHEN LEFT(birth_date, 4) = 'oct.' then replace(birth_date,'oct.','October')

     WHEN LEFT(birth_date, 3) = 'oct' then replace(birth_date,'oct','October')

     WHEN LEFT(birth_date, 3) = 'sep' then replace(birth_date,'sept.','September')

     WHEN LEFT(birth_date, 4) = 'jan.' then replace(birth_date,'jan.','January')

     WHEN LEFT(birth_date, 3) = 'jan' then replace(birth_date,'jan','January')

     WHEN LEFT(birth_date, 3) = 'nov' then replace(birth_date,'nov.','November')

     WHEN LEFT(birth_date, 3) = 'mar' then replace(birth_date,'mar.','March')

    else birth_date

    end as birthdate

    from tracking

    I've tried using the entire CASE statement as the expression in the CONVERT function, but that won't work.

    Any suggestions would be appreciated.

     

    Thanks

    Tim

  •  

    You may be able to avoid the case statement altogether. Try:

    select DateName(mm,birth_date)

     

    Hope this helps,

    Pete

  • It will depend on your setup but if you are the default setup for the English version then this should do the trick.

    SELECT datepart(mm,cast(replace(birth_date,'.','') as smalldatetime))

    Note: 'dec 12,2004' is understood as 12/12/2004. But 'dec. 12, 2004' will cause an issue. As long as all are the standard abbreviations you should be fine except needing to drop the '.'.

  • Antares666:

    This works OK except for the case where September is abbreviated as Sept. That leaves Sept as the month string which will not convert (as per your note). it looks from the data like this is the only case. What is the best way to handle this exception?

    TIA

    Tim

  • Try this then:

    SELECT datepart(mm,cast(replace(birth_date,substring(birth_date,4,charindex(' ',birth_date) - 4) + ' ',' ') as smalldatetime))

    It will make sure the month portion is 3 characters only which SQL Server reconizes the following

    Jan,Feb,Mar,Apr,May,Jun,Jul,Aug,Sep,Oct,Nov,Dec

    for English setting. Fotunately none are more tha the first 3 characters and should means meets your needs.

    (Note: syslanguages is where this is found.)

     

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply