Cursors Be Gone!

  • RBarryYoung (1/3/2009)


    Jeff Moden (1/3/2009)


    Just a hint about speed... I've found that SQL Server and I both have a couple of things in common... we're both lazy and neither of us can remember more than a couple of things at a time. 😛 I knew the split code to dig the SessionID out of the cookie was going to require multiple CharIndexes (ie, more than 2) and a whole bunch of other hooie if I tried to do it all in the same Select...

    Thanks, that does make sense. It just didn't occur to me that that was the source of the slowness.

    Heh... Maybe I'm not as lazy as I said... I did about 40 hours of some really dedicated testing, a couple of years ago, on the impact on performance that the "Divide and Conquer" methods of using derived tables as intermim storage have. I don't have the code for the tests, anymore, but what I found was amazing... the sum of the parts is frequently less than [font="Arial Black"]any [/font]of the separate parts. Where you and I look at it and say "Yep, there's the derived table and there's the outer query", the optimizer says, "Hmmm... nice, fast, single query".

    I don't even really think about such things, anymore and was actually a bit surprised when you pointed out that it was the load speed that impressed you the most. I just do them. BWAAA-HAAA!!! It's like the analogy you've seen me whip on people, many times. It's like practicing the piano... and I never practice hitting the wrong notes like writing a cursor because of a time crunch. Some of those people should listen to me instead of justifying writing the "occasional cursor" where "speed doesn't matter". 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • I take is that @PrevKeep variable is redundant as it doesn't seem to be used?

    BWAA-HAAA!!! Well, at least I know someone read the code. Yes sir, I made a final change to the code and forgot to delete it from the declaration section of the code.

    It is still also in the code but not used after it has been set:

    SET @PrevKeep = KeepMe = CASE WHEN SessionID = @PrevSessionID

  • RBarryYoung (1/3/2009)


    You know, I'm not sure how much point there is to this particular exercise. 90% of the time is in the intiial scanning of the WebLogEvents table and as there is not clustered index and indeed, no relevant index at all, there is no way to avoid scanning the entire table every time.

    I was just going back though the thread to make sure I didn't miss anything... I didn't see this one before. I'm thinking you pretty much agree that the dark side of the Force took care of this one. :alien:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • jacroberts (1/3/2009)


    I take is that @PrevKeep variable is redundant as it doesn't seem to be used?

    BWAA-HAAA!!! Well, at least I know someone read the code. Yes sir, I made a final change to the code and forgot to delete it from the declaration section of the code.

    It is still also in the code but not used after it has been set:

    SET @PrevKeep = KeepMe = CASE WHEN SessionID = @PrevSessionID

    One of the tricks about doing this type of update, is having a bit of a "dummy anchor". I'd forgotten about that particular one... (heh, I can only remember 1 thing at a time, gettin' old) and it likely needs to stay as is. Might work without it, but might cause a hidden error in the data in the future if you take it out. One of the SETs must really has to have the form of SET @variable = column = expression for it to work realiably... might as well be this one.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (1/3/2009)


    jacroberts (1/3/2009)


    I take is that @PrevKeep variable is redundant as it doesn't seem to be used?

    BWAA-HAAA!!! Well, at least I know someone read the code. Yes sir, I made a final change to the code and forgot to delete it from the declaration section of the code.

    It is still also in the code but not used after it has been set:

    SET @PrevKeep = KeepMe = CASE WHEN SessionID = @PrevSessionID

    One of the tricks about doing this type of update, is having a bit of a "dummy anchor". I'd forgotten about that particular one... (heh, I can only remember 1 thing at a time, gettin' old) and it likely needs to stay as is. Might work without it, but might cause a hidden error in the data in the future if you take it out. One of the SETs must really has to have the form of SET @variable = column = expression for it to work realiably... might as well be this one.

    This method sounds like a bit of a dark art.

  • jacroberts (1/3/2009)


    This method sounds like a bit of a dark art.

    Oh, it absolutely is... one that is condemned by many "experts" who have yet to get it to break. But one extra pinch of "Sand People Toe Nail" or leave out even 1 hair of a Jedi eye-lid, and it will burn you. However, do it all right and absolutely nothing can touch it for performance and it'll never break in the face of changing data.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff

    You are intensely cool. 1.796. And I'll agree with what was said above - i am most impressed by the way you got the data out of that table. Ultimate respect.

    One thing though - being the sneaky chap that I am, I replaced the grab from the table with your code in my solution - and ended up with the following. This one I got to run in 1.046. Can you check it on your box and let me know?

    DECLARE @Date datetime

    SET @Date = '20081220'

    DECLARE @ASPString nvarchar(100)

    DECLARE @Len int

    DECLARE @DateMinus1 datetime

    DECLARE @ElevenThirty datetime

    SET @ElevenThirty = '1899-12-30 23:30:00.000'

    SET @ASPString = 'ASP.NET_SessionId='

    SET @Len = Len(@ASPString)

    SET @Date = Convert(varchar, @Date, 112)

    SET @DateMinus1 = DateAdd(dd, -1, @Date)

    CREATE TABLE #sessions(ID int IDENTITY(1,1) primary key clustered, EventMinuteIndex smallint, ASPSessionID char(40))

    INSERT INTO #sessions (EventMinuteIndex, ASPSessionID)

    SELECT EventMinuteIndex, ASPSessionID FROM

    (

    SELECT CAST(SUBSTRING(Cookie,1,CHARINDEX(';',Cookie)-1) AS VARCHAR(50)) AS ASPSessionID,

    datediff(minute, @date, TheDate+2.0) AS EventMinuteIndex

    FROM (--==== Subquery does some of the calcs and isolates the start of the SessionID

    SELECT Date+Time AS TheDate,

    SUBSTRING([cs(Cookie)],CHARINDEX('ASP.NET_SessionId=',[cs(Cookie)])+18,8000)+';' AS Cookie,

    Date

    FROM dbo.WebLogEvents

    WHERE Date = @Date

    OR (Date = @Date-1 AND Time >= '1899-12-30 23:30')

    )p1

    -- SELECT datediff(minute, '18991230', [Time]) as EventMinuteIndex, ASPSessionId

    -- FROM dbo.WebLogEventsNormalised WITH (NOLOCK)

    -- WHERE Date = @date

    -- UNION ALL

    -- SELECT datediff(minute, '18991230', [Time]) - 1440 as EventMinuteIndex, ASPSessionID

    -- FROM dbo.WebLogEventsNormalised WITH (NOLOCK)

    -- WHERE Date = @DateMinus1 AND [Time] >= @ElevenThirty

    ) itbl

    ORDER BY ASPSessionID, EventMinuteIndex

    SELECT @date as Date, hr, fm, count(*) FROM

    (

    select eventminuteindex / 60 as hr, (eventminuteindex / 5) % 12 as fm FROM

    (

    SELECT s1.eventminuteindex,

    CASE WHEN s1.eventminuteindex - s2.eventminuteindex > 30

    THEN 1

    ELSE 0

    END as enoughTimePassed,

    CASE WHEN s1.aspsessionid = s2.aspsessionid

    THEN 0

    ELSE 1

    END as SessionDifferent

    FROM #sessions s1 INNER JOIN

    #sessions s2

    on s1.ID = s2.ID + 1

    UNION ALL

    -- union the first row in, as the above join will never pick it up

    SELECT s1.eventminuteindex, 1 as enoughTimePassed, 1 as SessionDifferent

    FROM #sessions s1 where id = 1

    ) itbl

    WHERE EventMinuteIndex >= 0

    and (EnoughTimePassed + SessionDifferent) > 0

    ) otbl

    GROUP BY hr, fm

    Atlantis Interactive - SQL Server Tools
    My blog[/url]
    Why I wrote a sql query analyzer clone

  • Phil Factor (1/3/2009)


    The other thing I missed was the WITH FILLFACTOR = 100. Neat!

    Erm... I think i'm missing something. Isn't 100 the default fillfactor for any index where a fillfactor is not specified?

    Atlantis Interactive - SQL Server Tools
    My blog[/url]
    Why I wrote a sql query analyzer clone

  • Ok I guess you peeps have hit the hay 🙂

    @jacroberts - can you include that last one in your test runs monday? I know it's cheating, but I think coming up with the best performing solution is the most informative for all of us 🙂

    @jeff - hope you haven't taken offence - certainly none was intended!

    Atlantis Interactive - SQL Server Tools
    My blog[/url]
    Why I wrote a sql query analyzer clone

  • Matt Whitfield (1/3/2009)


    Phil Factor (1/3/2009)


    The other thing I missed was the WITH FILLFACTOR = 100. Neat!

    Erm... I think i'm missing something. Isn't 100 the default fillfactor for any index where a fillfactor is not specified?

    Nope... unless you changed it for your server, the default Fill Factor is 90.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Matt Whitfield (1/3/2009)


    Jeff

    You are intensely cool. 1.796. And I'll agree with what was said above - i am most impressed by the way you got the data out of that table. Ultimate respect.

    One thing though - being the sneaky chap that I am, I replaced the grab from the table with your code in my solution - and ended up with the following. This one I got to run in 1.046. Can you check it on your box and let me know?

    On my box, I got a little over 6 seconds with the clustered index on the event's table. So, on my box, your code took 2 seconds longer and on your box, my code took longer.

    Phil Factor and I have run into that a couple of times together. He and I wrote a split function that would split the novel Moby Dick at the word level. His code took relatively long on my box and my code absolutely smoked his. The exact opposite was true on his box. That's why "tuning" is such a "dark art"... there is no real black and white way of doing it. You have to "feel the machine" and "use the Force"... 😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Matt Whitfield (1/3/2009)


    @jacroberts - can you include that last one in your test runs monday? I know it's cheating, but I think coming up with the best performing solution is the most informative for all of us 🙂

    Yes of course. It's not cheating it's collaborating.

    It might be better if I do the test on Tuesday rather than Monday as the site isn't that busy on a Sunday but it is busy on a Monday so there will be better data to test them on.

  • Jeff Moden (1/3/2009)


    Nope... unless you changed it for your server, the default Fill Factor is 90.

    Are you sure? On all the servers i've found, i get this result set for sp_configure 'fill factor'

    name,minimum,maximum,config_value,run_value

    fill factor (%),0,100,0,0

    http://msdn.microsoft.com/en-us/library/ms177459.aspx says 0 and 100 are the same in all respects - and http://msdn.microsoft.com/en-us/library/ms191005(SQL.90).aspx says the default is 0. Interestingly though, http://msdn.microsoft.com/en-us/library/ms191005.aspx (the sql 2008 version) omits the part about the default text...

    Atlantis Interactive - SQL Server Tools
    My blog[/url]
    Why I wrote a sql query analyzer clone

  • Matt Whitfield (1/3/2009)


    Ok I guess you peeps have hit the hay 🙂

    @jacroberts - can you include that last one in your test runs monday? I know it's cheating, but I think coming up with the best performing solution is the most informative for all of us 🙂

    @jeff - hope you haven't taken offence - certainly none was intended!

    Nope... but getting ready to hit the hay early for once. Had to throw some groceries down my neck and do a couple of chores.

    Combining code to come up with the best solution possible is never cheating. And, heck no... I've certainly taken no offense... borrowing someones code is a high form of flattery. Thanks for the compliment.

    I'll be really interested to see how things work out... thanks for any testing you do, JacRobert's. And, thanks for the really interesting problem.

    What's really cool about this whole thing is that there have been several solutions offered up rather quickly that just absolutely blow the doors off the cursor and/or while loop methods. And it's not just the methods to solve the problem that have risen to the top... some of the support code that works "behind the scenes" has shown some new possibilities, as well.

    Well done to all who have participated, right or wrong, in this extraordinary thread.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (1/3/2009)


    On my box, I got a little over 6 seconds with the clustered index on the event's table. So, on my box, your code took 2 seconds longer and on your box, my code took longer.

    Phil Factor and I have run into that a couple of times together. He and I wrote a split function that would split the novel Moby Dick at the word level. His code took relatively long on my box and my code absolutely smoked his. The exact opposite was true on his box. That's why "tuning" is such a "dark art"... there is no real black and white way of doing it. You have to "feel the machine" and "use the Force"... 😛

    That pretty much equates to my experience (although I get the distinct feeling mine is limited compared to yours). Every time you think 'yes that is definitely the way to do it' you come across the exception which makes the rule immediately binnable.

    It's been a pleasure sir. Can't wait to see how these perform on jacroberts' DB 🙂 My gut feeling is that your one will probably win out - if the difference is to do with limited resources (you were saying your box was old) then i'd expect that a much enlarged data set would show up that as a factor... Who knows? 🙂

    Atlantis Interactive - SQL Server Tools
    My blog[/url]
    Why I wrote a sql query analyzer clone

Viewing 15 posts - 136 through 150 (of 272 total)

You must be logged in to reply to this topic. Login to reply