SQL Code Help

  • Yep I agree with you Drew, that's why I asked how many columns were involved, if it was only a couple it might have been feasible.

    What about this way, keeping each case seperate and concatenating the results:

    SELECT Emp_No,

    HR_ID,

    ISNULL(Emp_No_Err,'') + ', ' + ISNULL(HR_ID_Err,'') + ', ' + ISNULL(Emp_Name_Err,'') AS Errors

    FROM

    (SELECT Emp_No,

    HR_ID,

    CASE WHEN LEN(Emp_No) < 7 THEN 'Emp_No: Too Short' END AS Emp_No_Err,

    CASE WHEN LEN(HR_ID) < 7 THEN 'HR_ID: Too Short' END AS HR_ID_Err,

    CASE WHEN Emp_Name = 'Alfred' THEN 'Emp_Name: Alfred' END AS Emp_Name_Err

    FROM

    (SELECT 12345 AS Emp_No, 'abcdef' AS HR_ID, 'Keith' AS Emp_Name) testData

    ) errorData

    SQL SERVER Central Forum Etiquette[/url]

  • Sorry, duplicated my post by accident. Will try and delete.

    SQL SERVER Central Forum Etiquette[/url]

  • Thanks Jim, I will try out your code .. I had a basic CASE statement structure ..

  • aarionsql (12/5/2011)


    HI Drew,

    can I have a join in a CASE statement?

    No, but you can have a correlated subquery, a (NOT) IN with a subquery, or a (NOT) EXISTS with a subquery in a CASE function. Depending on your conditions and how often the table is referenced, you may want to use a join or apply in the main query instead anyhow.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • You might get one or two ideas from the following. It's proving very usable for data cleansing, which is similar to your scenario - you want to identify a case and do something based upon it. Separating the detection from the action opens up a whole new world of possibilities.

    SELECT

    Emp_No,

    HR_ID,

    ErrorChecker.ErrorMsg

    FROM

    (SELECT 12345 AS Emp_No, 'abcdef' AS HR_ID) testData

    CROSS APPLY (

    SELECT

    d.CommonError,

    ErrorMsg = CASE WHEN d.CommonError = 1 THEN 'Emp_No: Too Short, HR_ID: Too Short ' ELSE '' END +

    CASE WHEN d.Emp_NoError = 2 THEN 'Emp_No: Too Short ' ELSE '' END +

    CASE WHEN d.HR_IDError = 3 THEN 'HR_ID: Too Short ' ELSE '' END

    FROM (

    SELECT

    CommonError = CASE

    WHEN LEN(Emp_No) < 7 AND LEN(HR_ID) < 7 THEN 1 ELSE NULL END,

    Emp_NoError = CASE

    WHEN LEN(Emp_No) < 7 AND LEN(HR_ID) > 6 THEN 2 ELSE NULL END,

    HR_IDError = CASE

    WHEN LEN(HR_ID) < 7 AND LEN(HR_ID) > 6 THEN 3 ELSE NULL END

    ) d

    ) ErrorChecker

    Oh, and it's fast, too.

    โ€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ€ - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (12/5/2011)


    You might get one or two ideas from the following. It's proving very usable for data cleansing, which is similar to your scenario - you want to identify a case and do something based upon it. Separating the detection from the action opens up a whole new world of possibilities.

    I don't see how you gain anything by calculating an intermediate code which you then use to calculate the error message instead of directly calculating the error message. It might be worthwhile if the same calculation is used in many different places, but that's not happening in your sample code.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • CELKO (12/5/2011)


    Your approach to SQL is fundamentally wrong.

    Your approach to SQL training is fundamentally wrong. Without specifics on what you think is fundamentally wrong, why you think it is fundamentally wrong, and how to fix it, your statement only serves to disparage the original poster. People don't learn well when they're being attacked.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen (12/5/2011)


    CELKO (12/5/2011)


    Your approach to SQL is fundamentally wrong.

    Your approach to SQL training is fundamentally wrong. Without specifics on what you think is fundamentally wrong, why you think it is fundamentally wrong, and how to fix it, your statement only serves to disparage the original poster. People don't learn well when they're being attacked.

    Drew

    He's been told that 1000 times already. I guess he's too old or stuburn to learn anything by now.

    Too bad for all of us.

  • drew.allen (12/5/2011)


    CELKO (12/5/2011)


    Your approach to SQL is fundamentally wrong.

    Your approach to SQL training is fundamentally wrong. Without specifics on what you think is fundamentally wrong, why you think it is fundamentally wrong, and how to fix it, your statement only serves to disparage the original poster. People don't learn well when they're being attacked.

    Drew

    Pffft, it's actually just spam advertising. All he does is make sure his booklist has a higher hitrate for google spiders. Wonder if we should start reporting his posts that way... 'eh, Steve might not love us.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Evil Kraig F (12/5/2011)


    drew.allen (12/5/2011)


    CELKO (12/5/2011)


    Your approach to SQL is fundamentally wrong.

    Your approach to SQL training is fundamentally wrong. Without specifics on what you think is fundamentally wrong, why you think it is fundamentally wrong, and how to fix it, your statement only serves to disparage the original poster. People don't learn well when they're being attacked.

    Drew

    Pffft, it's actually just spam advertising. All he does is make sure his booklist has a higher hitrate for google spiders. Wonder if we should start reporting his posts that way... 'eh, Steve might not love us.

    He'd like the feedback and conversation on it.

    Of course JC = spam 50% of the time.

  • drew.allen (12/5/2011)


    ChrisM@Work (12/5/2011)


    You might get one or two ideas from the following. It's proving very usable for data cleansing, which is similar to your scenario - you want to identify a case and do something based upon it. Separating the detection from the action opens up a whole new world of possibilities.

    I don't see how you gain anything by calculating an intermediate code which you then use to calculate the error message instead of directly calculating the error message. It might be worthwhile if the same calculation is used in many different places, but that's not happening in your sample code.

    Drew

    Ouch! Having a bad day, Drew? ๐Ÿ˜›

    Sure it's not the best example, but the point - as stated - was to open up possibilities.

    โ€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ€ - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (12/6/2011)


    drew.allen (12/5/2011)


    ChrisM@Work (12/5/2011)


    You might get one or two ideas from the following. It's proving very usable for data cleansing, which is similar to your scenario - you want to identify a case and do something based upon it. Separating the detection from the action opens up a whole new world of possibilities.

    I don't see how you gain anything by calculating an intermediate code which you then use to calculate the error message instead of directly calculating the error message. It might be worthwhile if the same calculation is used in many different places, but that's not happening in your sample code.

    Drew

    Ouch! Having a bad day, Drew? ๐Ÿ˜›

    Sure it's not the best example, but the point - as stated - was to open up possibilities.

    It's just inefficient. You have to touch each row twice: once to assign the code and once to assign the actual error message. Sometimes it can be worth trading off efficiency for a gain somewhere else, but I don't see where you're gaining anything here, so you're just sacrificing efficiency.

    It also makes your code harder to read, and therefore update, because of the extra layer.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen (12/6/2011)


    ChrisM@Work (12/6/2011)


    drew.allen (12/5/2011)


    ChrisM@Work (12/5/2011)


    You might get one or two ideas from the following. It's proving very usable for data cleansing, which is similar to your scenario - you want to identify a case and do something based upon it. Separating the detection from the action opens up a whole new world of possibilities.

    I don't see how you gain anything by calculating an intermediate code which you then use to calculate the error message instead of directly calculating the error message. It might be worthwhile if the same calculation is used in many different places, but that's not happening in your sample code.

    Drew

    Ouch! Having a bad day, Drew? ๐Ÿ˜›

    Sure it's not the best example, but the point - as stated - was to open up possibilities.

    It's just inefficient. You have to touch each row twice: once to assign the code and once to assign the actual error message. Sometimes it can be worth trading off efficiency for a gain somewhere else, but I don't see where you're gaining anything here, so you're just sacrificing efficiency.

    It also makes your code harder to read, and therefore update, because of the extra layer.

    Drew

    Point taken.

    Now imagine you're writing a query to output cleaned title, firstname, lastname from a table of appallingly bad data. You've set up a shedload of detection and cleaning rules. Some of those cleaning rules will apply to only one column, so a plain CASE WHEN ... THEN ... would be fine. Many however will apply to multiple columns - a single detection rule will dictate changes to both title and firstname, for instance. Why run the expensive detection CASE twice, once for each column?

    There's another benefit from this approach too. You can output the detection codes. When you're checking the output for discrepancies you can see which codes are faulty. Trying to figure out which CASE is at fault without this can be extraordinarily difficult. Last but not least - outputting the detection codes allows you to measure which ones are most frequently encountered and put them at the top of the CASE construct, thereby reducing the number of unnecessary checks.

    Also - you're not touching the row twice. You're still touching it throughout detection and action.

    โ€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ€ - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (12/6/2011)


    Now imagine you're writing a query to output cleaned title, firstname, lastname from a table of appallingly bad data. You've set up a shedload of detection and cleaning rules. Some of those cleaning rules will apply to only one column, so a plain CASE WHEN ... THEN ... would be fine. Many however will apply to multiple columns - a single detection rule will dictate changes to both title and firstname, for instance. Why run the expensive detection CASE twice, once for each column?

    Actually, in that situation, I suspect that an OUTER APPLY might actually be more efficient and would almost certainly be easier to read. Something like the following:

    SELECT *

    FROM YourDirtyDataTable

    OUTER APPLY (

    SELECT 1 AS ErrorCode1, 'Your specific error' AS ErrorMessage1, 2 AS ErrorCode2

    , 'Another error message here' AS ErrorMessage2, <some expression> as Title

    , <some other expression> as FirstName

    WHERE <your complex detection expression here>

    ) AS e1

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • drew.allen (12/6/2011)


    ChrisM@Work (12/6/2011)


    Now imagine you're writing a query to output cleaned title, firstname, lastname from a table of appallingly bad data. You've set up a shedload of detection and cleaning rules. Some of those cleaning rules will apply to only one column, so a plain CASE WHEN ... THEN ... would be fine. Many however will apply to multiple columns - a single detection rule will dictate changes to both title and firstname, for instance. Why run the expensive detection CASE twice, once for each column?

    Actually, in that situation, I suspect that an OUTER APPLY might actually be more efficient and would almost certainly be easier to read. Something like the following:

    SELECT *

    FROM YourDirtyDataTable

    OUTER APPLY (

    SELECT 1 AS ErrorCode1, 'Your specific error' AS ErrorMessage1, 2 AS ErrorCode2

    , 'Another error message here' AS ErrorMessage2, <some expression> as Title

    , <some other expression> as FirstName

    WHERE <your complex detection expression here>

    ) AS e1

    Drew

    You're absolutely right - CROSS APPLY is well nice for this - if you exceed SQL Server's expressions limit, you can pop it seamlessly into an iTVF and voila! Error gone, code gone:-D

    We'll have to differ on this one Drew. I'm using the method as we speak having started with single CASE expressions, and switching to separate detection/effect has made this project tons easier and far far easier to read.

    โ€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ€ - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Viewing 15 posts - 16 through 29 (of 29 total)

You must be logged in to reply to this topic. Login to reply