How will you write a query for this scenario

  • Jeff Moden (8/7/2011)


    Ben-Gan has some very interesting date-span code that runs like the wind. It would be perfect for this and actually negates the need for a Quirky Update. In fact, it beats the Quirky Update for performance... heh... yeah, I tried. 😀

    Nah... I just double checked the original post on this thread. Ben-Gan's date-span code won't quite to it here.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • eccentricDBA (8/2/2011)


    Here is a technique. It would need to be tested for usuability with a large record set.

    Done... and I think there may be a problem... it would appear that the code will only find the first 3 occurances.

    -- Define Table

    DECLARE @Table AS TABLE (PatientID INT, DivDate DATETIME);

    -- Populate Table

    INSERT INTO @Table (PatientID, DivDate)

    SELECT 380,'2010-01-09' UNION ALL

    SELECT 380,'2010-01-11' UNION ALL

    SELECT 380,'2010-01-25' UNION ALL

    SELECT 380,'2010-01-26' UNION ALL

    SELECT 380,'2010-02-13' UNION ALL

    SELECT 380,'2010-02-21' UNION ALL

    SELECT 380,'2010-03-25' UNION ALL

    SELECT 380,'2010-04-12' UNION ALL

    SELECT 380,'2010-04-19' UNION ALL

    SELECT 380,'2010-04-22' UNION ALL

    SELECT 380,'2010-05-03' UNION ALL

    SELECT 380,'2010-05-17' UNION ALL

    SELECT 380,'2010-05-21' UNION ALL

    SELECT 380,'2010-06-05' UNION ALL

    SELECT 380,'2010-06-10' UNION ALL

    SELECT 380,'2010-06-13' UNION ALL

    SELECT 380,'2010-06-19' UNION ALL

    SELECT 380,'2010-07-03' UNION ALL

    SELECT 380,'2010-07-03' UNION ALL

    SELECT 380,'2010-07-19' UNION ALL

    SELECT 380,'2010-08-28' UNION ALL

    SELECT 380,'2010-09-01' UNION ALL

    SELECT 380,'2010-10-10' UNION ALL

    SELECT 380,'2010-10-20' UNION ALL

    SELECT 380,'2010-10-28' UNION ALL

    SELECT 380,'2010-11-09' UNION ALL

    SELECT 380,'2010-11-11' UNION ALL

    SELECT 380,'2010-11-24' UNION ALL

    SELECT 380,'2010-12-01' UNION ALL

    SELECT 380,'2010-12-12' UNION ALL

    SELECT 380,'2010-12-17' UNION ALL

    SELECT 380,'2010-12-22' UNION ALL

    SELECT 380,'2010-12-23' UNION ALL

    SELECT 380,'2011-01-19' UNION ALL

    SELECT 380,'2011-02-02' UNION ALL

    SELECT 380,'2011-02-11' UNION ALL

    SELECT 380,'2011-02-13' UNION ALL

    SELECT 380,'2011-03-04' UNION ALL

    SELECT 380,'2011-03-09' UNION ALL

    SELECT 380,'2011-03-11' UNION ALL

    SELECT 380,'2011-03-22' UNION ALL

    SELECT 380,'2011-03-25' UNION ALL

    SELECT 380,'2011-03-29' UNION ALL

    SELECT 380,'2011-04-01' UNION ALL

    SELECT 380,'2011-04-24' UNION ALL

    SELECT 380,'2011-04-25' UNION ALL

    SELECT 380,'2011-05-27' UNION ALL

    SELECT 380,'2011-06-02' UNION ALL

    SELECT 380,'2011-07-16' UNION ALL

    SELECT 380,'2011-07-17' UNION ALL

    SELECT 380,'2011-07-19' UNION ALL

    SELECT 380,'2011-07-29' UNION ALL

    SELECT 380,'2011-08-17' UNION ALL

    SELECT 380,'2011-08-27' UNION ALL

    SELECT 380,'2011-08-27' UNION ALL

    SELECT 380,'2011-09-10' UNION ALL

    SELECT 380,'2011-10-08' UNION ALL

    SELECT 380,'2011-11-18' UNION ALL

    SELECT 380,'2011-12-17' UNION ALL

    SELECT 380,'2011-12-19' UNION ALL

    SELECT 380,'2011-12-27' UNION ALL

    SELECT 380,'2011-12-27' UNION ALL

    SELECT 380,'2012-01-14' UNION ALL

    SELECT 380,'2012-01-29' UNION ALL

    SELECT 380,'2012-02-16' UNION ALL

    SELECT 380,'2012-02-20' UNION ALL

    SELECT 380,'2012-03-05' UNION ALL

    SELECT 380,'2012-03-08' UNION ALL

    SELECT 380,'2012-03-22' UNION ALL

    SELECT 380,'2012-04-12' UNION ALL

    SELECT 380,'2012-04-20' UNION ALL

    SELECT 380,'2012-04-30' UNION ALL

    SELECT 380,'2012-05-21' UNION ALL

    SELECT 380,'2012-05-24' UNION ALL

    SELECT 380,'2012-06-03' UNION ALL

    SELECT 380,'2012-07-14' UNION ALL

    SELECT 380,'2012-07-20' UNION ALL

    SELECT 380,'2012-07-31' UNION ALL

    SELECT 380,'2012-08-02' UNION ALL

    SELECT 380,'2012-08-06' UNION ALL

    SELECT 380,'2012-08-07' UNION ALL

    SELECT 380,'2012-08-19' UNION ALL

    SELECT 380,'2012-09-09' UNION ALL

    SELECT 380,'2012-09-15' UNION ALL

    SELECT 380,'2012-10-04' UNION ALL

    SELECT 380,'2012-10-08' UNION ALL

    SELECT 380,'2012-10-10' UNION ALL

    SELECT 380,'2012-10-23' UNION ALL

    SELECT 380,'2012-11-01' UNION ALL

    SELECT 380,'2012-11-03' UNION ALL

    SELECT 380,'2012-11-27' UNION ALL

    SELECT 380,'2012-12-03' UNION ALL

    SELECT 380,'2012-12-12' UNION ALL

    SELECT 380,'2012-12-18' UNION ALL

    SELECT 380,'2012-12-18'

    ;

    WITH PatientSchedule AS

    (SELECT PatientID, Min(DivDate) FirstVisit, Min(DivDate) + 30 SecondVisit, Min(DivDate) + 60 ThirdVisit, Min(DivDate) + 90 FourthVisit

    FROM @Table

    GROUP BY PatientID)

    SELECT T.PatientID, Min(T.DivDate)

    FROM @Table as t

    JOIN PatientSchedule as PS ON PS.PatientID = t.PatientID

    WHERE T.DivDate BETWEEN PS.FirstVisit AND PS.SecondVisit

    GROUP BY T.PatientID

    UNION

    SELECT T.PatientID, Min(T.DivDate)

    FROM @Table as t

    JOIN PatientSchedule as PS ON PS.PatientID = t.PatientID

    WHERE T.DivDate BETWEEN PS.SecondVisit AND PS.ThirdVisit

    GROUP BY T.PatientID

    UNION

    SELECT T.PatientID, Min(T.DivDate)

    FROM @Table as t

    JOIN PatientSchedule as PS ON PS.PatientID = t.PatientID

    WHERE T.DivDate BETWEEN PS.ThirdVisit AND PS.FourthVisit

    GROUP BY T.PatientID;

    ;

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • gah (8/5/2011)


    Another possibility using quirky update.......thoughts anyone please?

    I just checked through your QU code, Gah... VERY nicely done! It's not often I see someone follow all the rules especially with the new "safety check". 🙂

    ...and the addition of the number of visits is brilliant. :w00t:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • GilaMonster (8/7/2011)


    I had to do something like that for a client a couple years back. The eventual solution was partially set-based, with a while loop tossed in and came out at almost 2 pages of T-SQL by the end.

    It's not an easily-solved problem, especially if the data is 'interesting' ...

    It's a bit of a shame that you don't trust in the Quirky Update, Gail, especially when you write it to a temp table where you have absolute control and especially since the new "safety check" code is in place. The Quirky Update sure made this problem easy.

    If you still don't trust it (and I don't actually expect you ever will :-)), a simple cursor or While loop would have done it using just a bit more code than the QU did. Sure, it would have take a fair bit longer than the QU but it is documented and supported.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (8/7/2011)


    vermarajiv (8/2/2011)


    Thanks GSquared for confirming what I was thinking.

    I tried recursive CTE and you are right it does not allow aggregate functions

    Based on your tip, I found a semi set based solution in a loop instead of using the cursor.

    EccentricDBA - thanks for your effort to help me on this.

    Thanks

    RK

    Please... it's a 2 way street here. Please post the solution you found so that we may all learn. Thanks. 😉

    And do take a look at the solution that Gah posted... I could be wrong but I'm pretty sure that it'll blow the proverbial doors off of anything that uses a loop for this problem.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • 7 straights posts Jeff.

    The man's on fire :-D.

  • Ninja's_RGR'us (8/7/2011)


    7 straights posts Jeff.

    The man's on fire :-D.

    Heh... I can't wait for someone to pass me for the number of posts so you'll stop riding my donkey, Remi. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (8/7/2011)


    Ninja's_RGR'us (8/7/2011)


    7 straights posts Jeff.

    The man's on fire :-D.

    Heh... I can't wait for someone to pass me for the number of posts so you'll stop riding my donkey, Remi. 😉

    Go Girl.... ain't gonna be me.

    BTW I SIT on stanta's lap, I don't ride his back!

  • Jeff Moden (8/7/2011)


    GilaMonster (8/7/2011)


    I had to do something like that for a client a couple years back. The eventual solution was partially set-based, with a while loop tossed in and came out at almost 2 pages of T-SQL by the end.

    It's not an easily-solved problem, especially if the data is 'interesting' ...

    It's a bit of a shame that you don't trust in the Quirky Update, Gail, especially when you write it to a temp table where you have absolute control and especially since the new "safety check" code is in place. The Quirky Update sure made this problem easy.

    I'm glad you know more about the specific problem that I had than I did. 😉

    If you still don't trust it (and I don't actually expect you ever will :-)), a simple cursor or While loop would have done it using just a bit more code than the QU did. Sure, it would have take a fair bit longer than the QU but it is documented and supported.

    In the specific problem I had, it would have taken at least 3 passes with a cursor (as I said, the data was interesting, there were 4 datetime fields involved). The set-based portion (which was the majority of the code) took care of 99% of cases, the while loop was for the few exceptions that needed additional care.

    The simple case of this, yeah a straightforward quirky update or cursor can handle.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Sorry guys, did not realize that there were so many posts going back and forth on this thread. See below for the solution that I came up with. May not be the most elegant one, but it worked for me. Since I was using the temp tables at this stage of the procedure, deleting records was not a problem for me. Your situation may be different.

    Just to give you an overview, I am taking the min date from the base table, adding 30 days and storing that value in another column in a different table. Then I go back to the original table and delete all the records that fall within that 30 days. I repeat this process until the table is empty.

    -- Goal is to get min(Date) for 30 day period starting from the min date.

    -- Base data table

    Create table #patient_DivDates (patientId int, divDates datetime)

    INSERT INTO #patient_DivDates values (1, '2/1/2011')

    INSERT INTO #patient_DivDates values (1, '2/10/2011')

    INSERT INTO #patient_DivDates values (1, '6/5/2011')

    INSERT INTO #patient_DivDates values (1, '6/30/2011')

    INSERT INTO #patient_DivDates values (1, '7/8/2011')

    INSERT INTO #patient_DivDates values (2, '3/1/2011')

    INSERT INTO #patient_DivDates values (2, '5/3/2011')

    -- Final result in this table.

    CREATE TABLE #result (patientId int, DivDates datetime, ThirtyDaysOut datetime)

    -- ThirtyDaysOut is the min Date + 30 days

    while exists (select patientId from #patient_DivDates)

    BEGIN

    -- Get the min date record

    INSERT INTO #result (PatientId, DivDates, ThirtyDaysOut)

    select patientId, MIN(DivDates) ips_diversion_date,

    DATEADD(dd, 30, MIN(DivDates)) ThirtyDaysOut

    from #patient_DivDates

    group by patientId

    -- Delete the records that fall within 30 day range

    DELETE #patient_DivDates

    from #patient_DivDates

    inner join #result on #patient_DivDates.patientId = #result.patientId

    and #patient_DivDates.divDates between #result.DivDates and #result.ThirtyDaysOut

    END

    select * from #result order by patientId, DivDates

    Depending on how large your result set is and how many records will be deleted in each iteration, performance will vary.

  • vermarajiv (8/8/2011)

    Depending on how large your result set is and how many records will be deleted in each iteration, performance will vary.

    Hi RK...thanks for posting your solution...and as you say performance will vary dependent upon number of records

    did you have an opportunity to test my solution?

    regards gah

    ________________________________________________________________
    you can lead a user to data....but you cannot make them think
    and remember....every day is a school day

  • Hi Gah,

    I have not tested your solution yet, but I will as soon as I get a chance. I am working on tight deadlines so I am getting less time to play around. I will let you know as soon as I can.

    Thanks

    RK

  • GilaMonster (8/8/2011)


    Jeff Moden (8/7/2011)


    GilaMonster (8/7/2011)


    I had to do something like that for a client a couple years back. The eventual solution was partially set-based, with a while loop tossed in and came out at almost 2 pages of T-SQL by the end.

    It's not an easily-solved problem, especially if the data is 'interesting' ...

    It's a bit of a shame that you don't trust in the Quirky Update, Gail, especially when you write it to a temp table where you have absolute control and especially since the new "safety check" code is in place. The Quirky Update sure made this problem easy.

    I'm glad you know more about the specific problem that I had than I did. 😉

    If you still don't trust it (and I don't actually expect you ever will :-)), a simple cursor or While loop would have done it using just a bit more code than the QU did. Sure, it would have take a fair bit longer than the QU but it is documented and supported.

    In the specific problem I had, it would have taken at least 3 passes with a cursor (as I said, the data was interesting, there were 4 datetime fields involved). The set-based portion (which was the majority of the code) took care of 99% of cases, the while loop was for the few exceptions that needed additional care.

    The simple case of this, yeah a straightforward quirky update or cursor can handle.

    No need for the irony, Gail. :blush: It wasn't meant as a personal attack. Your original description fell short of what you actually had to go through and I know you don't care for the QU. Since I was operating under what you said ("had to do something like that") and the QU greatly simplified this problem, I was (still) trying to sway you into reconsidering it's use. 😉

    Shifting gears, do you remember how you solved the set-based portion of this type of problem? It would be interesting to see other methods other than the QU for this.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (8/8/2011)


    Since I was operating under what you said ("had to do something like that") and the QU greatly simplified this problem, I was (still) trying to sway you into reconsidering it's use. 😉

    I did offer the client the option - for the while loop portion - to use the quirky update. Told him it would be faster than the while (time was critical), but depended on an undocumented behaviour of the query execution engine and hence there's a small change that future versions could break the code.

    Client decided while loop, subject to a complete time-test. We had a 10 hour window to process something like 50 million rows (and this piece was just one smallish section). The whole thing, end to end, on a rather old server came out in 35 minutes (and the system that we were hoping to replace took 2 weeks)

    Shifting gears, do you remember how you solved the set-based portion of this type of problem? It would be interesting to see other methods other than the QU for this.

    Vaguely, but I can't post table defs or code (and anonymising it will take forever and a day) as it was custom, confidential work. Client owns code. Medical aid claims processing.

    In short, multiple row_numbers, one for each of the dates I was dealing with. Temp tables (because I was making multiple passes through the table = not fun when I only wanted a small portion of the table each time. Temp table self joins (on offsets of the row numbers and a couple other things that I can't recall now), three of these, with results used to update the temp tables, then an aggregate query to get the group information out. There were a couple more cleanup passes as well, to deal with some odder cases with the dates). The offset joins were the majority of the processing for this.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Yowch! 2 WEEKS for the original code? Heh... talk about your basic long running queries. That really does fall in the category of "What were they thinking"? And you got it down to 35 minutes on older hardware. Nicely done.

    You should write a bit of a case study on it. I'd be one to cite the study everytime someone makes the suggestion that the "best" way to solve performance problems is by buying "better" hardware.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 15 posts - 16 through 30 (of 32 total)

You must be logged in to reply to this topic. Login to reply