Cursors Be Gone!

  • Site Visits Results

    Summary

    Three tests were run with different methods of calculating site visits.

    1. Using a normal cursor - jacroberts

    This took 29:36 to complete

    2. Using a "pseudo" cursor - Jeff Moden

    This took 27:34 to complete

    3. Using a join table to itself - Matt Whitfield

    This took 22:10 to complete

    Source Code

    Regular Cursor

    -- **************************************************************

    -- PROCEDURE SiteVisits

    -- Description: This function creates a table with a row for each site visit

    -- for a day.

    -- *******************************************************************

    CREATE PROCEDURE [dbo].[AggregateSiteVisits1]

    (

    @Date datetime

    )

    AS

    BEGIN

    SET NOCOUNT ON

    DECLARE @StartTime datetime

    SET @StartTime = GetDate()

    PRINT 'Start AggregateSiteVisits1 - jacroberts '

    + Cast(DateDiff(ss,@StartTime, GetDate()) as varchar(10)) + ' secs'

    CREATE TABLE #Table

    (

    [Hour] tinyint, -- 0 to 23

    [FiveMinute] tinyint -- 0 to 11

    )

    DECLARE @ASPString nvarchar(100)

    DECLARE @Len int

    DECLARE @DateMinus1 datetime

    DECLARE @ElevenThirty datetime

    SET @ElevenThirty = '1899-12-30 23:30:00.000'

    SET @ASPString = 'ASP.NET_SessionId='

    SET @Len = Len(@ASPString)

    SET @Date = Convert(varchar, @Date, 112)

    SET @DateMinus1 = DateAdd(dd, -1, @Date)

    DECLARE @PrevSeconds int

    DECLARE @PrevSessionId nvarchar(50)

    SET @PrevSeconds = -100 --Initialise

    SET @PrevSessionId = 'xxxxxxxx' --Initialise

    DECLARE @CurrSeconds int

    DECLARE @CurrSessionId nvarchar(50)

    DECLARE @NewVisit bit

    DECLARE SessionCursor CURSOR LOCAL FORWARD_ONLY STATIC READ_ONLY FOR

    SELECT Cast(SubString(Cookie, 1, CharIndex(';',Cookie)-1) AS varchar(50)) AS ASPSessionID,

    DateDiff(ss, @date, TheDate+2.0) AS EventSecondIndex

    FROM (SELECT Date+Time AS TheDate,

    SubString([cs(Cookie)],CharIndex('ASP.NET_SessionId=',[cs(Cookie)])+18,8000)+';' AS Cookie

    FROM dbo.WebLogEvents WITH (NOLOCK)

    WHERE (Date = @Date

    OR (Date = @DateMinus1

    AND [Time] >= @ElevenThirty))

    AND [cs(Cookie)] LIKE '%ASP.NET_SessionId=%'

    ) AS X

    ORDER BY 1, 2

    PRINT 'Opening SessionCursor ' + Cast(DateDiff(ss,@StartTime, GetDate()) as varchar(10)) + ' secs'

    OPEN SessionCursor

    PRINT 'Cursor parsing starting ' + Cast(DateDiff(ss,@StartTime, GetDate()) as varchar(10)) + ' secs'

    FETCH NEXT -- Get the first values into the @Curr variables

    FROM SessionCursor

    INTO @CurrSessionId,

    @CurrSeconds

    DECLARE @RowCount int

    SET @RowCount = 0

    WHILE (@@Fetch_Status = 0)

    BEGIN

    SET @RowCount = @RowCount + 1

    SET @NewVisit = 0 --Initialise

    IF @CurrSeconds >= 0 -- It's for the day we are looking at date

    IF @CurrSessionId <> @PrevSessionId -- New @CurrSessionId

    BEGIN

    SET @NewVisit = 1

    END

    ELSE --It is the same session Id so test at least 30 mins since last logged

    BEGIN

    IF @CurrSeconds - @PrevSeconds >= 1800

    BEGIN

    SET @NewVisit = 1

    END

    --END IF

    END

    --END IF

    --END IF

    IF (@NewVisit = 1) --Insert a row into the table

    BEGIN

    INSERT INTO #Table

    (

    [Hour],

    [FiveMinute]

    )

    VALUES

    (

    Cast(@CurrSeconds / 3600 as tinyint),

    Cast((@CurrSeconds / 300) % 12 as tinyint)

    )

    END

    --END IF

    SET @PrevSessionId = @CurrSessionId

    SET @PrevSeconds = @CurrSeconds

    FETCH NEXT -- Get the first values into the @prev variables

    FROM SessionCursor

    INTO @CurrSessionId,

    @CurrSeconds

    END

    --END WHILE

    PRINT 'Cursor parsing finished ' + Cast(DateDiff(ss,@StartTime, GetDate()) as varchar(10)) + ' secs, '

    + Cast(@RowCount as varchar(10)) + ' rows parsed'

    CLOSE SessionCursor

    DEALLOCATE SessionCursor

    INSERT INTO dbo.[SiteVisits]

    (

    Date,

    [Hour],

    Fiveminute,

    [Visits],

    [Method]

    )

    SELECT @Date,

    [Hour],

    Fiveminute,

    Count(*),

    'JR'

    FROM #Table

    GROUP BY [Hour], Fiveminute

    PRINT 'SiteVisits Inserted ' + Cast(DateDiff(ss,@StartTime, GetDate()) as varchar(10)) + ' secs'

    DROP TABLE #Table

    PRINT 'End ' + Cast(DateDiff(ss,@StartTime, GetDate()) as varchar(10)) + ' secs'

    RETURN

    END

    Detailed Results

    Opening SessionCursor 0 secs

    Cursor parsing starting 1255 secs

    Cursor parsing finished 1775 secs, 17381996 rows parsed

    SiteVisits Inserted 1776 secs

    End 1776 secs

    What I notice here is that the cursor took longer to open than it did to parse through all the 17 million rows.

    Pseudo Cursor

    CREATE PROCEDURE [dbo].[AggregateSiteVisits2]

    /******************************************************************************

    Purpose:

    Returns the visit count for each 5 minute period of each hour for the given day.

    Programmer notes and references:

    1. A "visit" is defined as a SessionId that hasn't been seen for over 30 minutes on the web site. So the data is

    processed sorted by SessionId and Datetime and the time between each session visit is measured to see if it is a

    new session or the same one.

    2. This sproc uses a "pseudo cursor update" which is discussed in detail in the following article:

    Solving the "Running Total" & "Ordinal Rank" Problems in SS 2k/2k5

    By Jeff Moden, 2008/01/31

    http://qa.sqlservercentral.com/articles/Advanced+Querying/61716/

    Usage:

    EXEC dbo.GetDailyVisitCount @Date

    ... where @Date is anything that converts to a DATETIME data type.

    Revision History:

    Rev 00 - 03 Jan 2009 - Jeff Moden - Initial creation and test

    Reference: http://qa.sqlservercentral.com/Forums/Topic625172-1373-2.aspx

    *********************************************************************************/

    --===== Declare procedure I/O

    @pDate DATETIME

    AS

    --==================================================================================

    -- Test Harness... uncomment this section and run from here down for on screen testing.

    -- See the final Select in this proc for addition field outputs

    --==================================================================================

    --DECLARE @pDate DATETIME

    -- SET @pDate = '12/20/2008' --Must be a date that's actually in the dbo.WebLogEvents table at the time

    --==================================================================================

    -- Presets

    --==================================================================================

    --===== Declare and preset local variables

    -- Note... all of these variables are used with the "pseudo cursor update"

    -- to keep track of values from previous rows

    DECLARE @StartTime datetime

    SET @StartTime = GetDate()

    PRINT 'Start AggregateSiteVisits2 - Jeff Moden ' + Cast(DateDiff(ss,@StartTime, GetDate()) as varchar(10)) + ' secs'

    DECLARE @PrevSessionID VARCHAR(50),

    @PrevKeepDate DATETIME,

    @PrevKeep TINYINT

    SELECT @PrevSessionID = '',

    @PrevKeepDate = 0,

    @PrevKeep = 0

    --===== Ensure the date parameter is a whole date

    SELECT @pDate = DATEADD(dd,DATEDIFF(dd,0,@pDate),0)

    --===== Supress the autodisplay of rowcounts so only the result set will be returned

    SET NOCOUNT ON

    --========================================================================================

    -- Copy data from the event table to a working table where we can work on it using a special index

    --=========================================================================================

    --===== Create and populate the working table on the fly

    SELECT CAST(SUBSTRING(Cookie,1,CHARINDEX(';',Cookie)-1) AS VARCHAR(50)) AS SessionID,

    TheDate+2.0 AS TheDate,

    CAST(0 AS TINYINT) AS KeepMe,

    Date AS DateOnly

    INTO #MyHead

    FROM (--==== Subquery does some of the calcs and isolates the start of the SessionID

    SELECT Date+Time AS TheDate,

    SUBSTRING([cs(Cookie)],CHARINDEX('ASP.NET_SessionId=',[cs(Cookie)])+18,8000)+';' AS Cookie,

    Date

    FROM dbo.WebLogEvents

    WHERE (Date = @pDate

    OR (Date = @pDate-1 AND Time >= '1899-12-30 23:30'))

    AND [cs(Cookie)] LIKE '%ASP.NET_SessionId=%'

    )p1

    PRINT '#MyHead Created ' + Cast(DateDiff(ss,@StartTime, GetDate()) as varchar(10)) + + ' secs, '

    + Cast(@@RowCount as varchar(10)) + ' rows inserted'

    --===== Add the necessary index to support the "pseudo cursor" update

    CREATE CLUSTERED INDEX IXC_#MyHead_SessionID_TheDate

    ON #MyHead (SessionID,TheDate) WITH FILLFACTOR = 100

    PRINT 'Index Created ' + Cast(DateDiff(ss,@StartTime, GetDate()) as varchar(10)) + ' secs'

    --====================================================================================================

    -- Mark "visits" according to the rules in "Programmer's notes #1" above

    --====================================================================================================

    --===== Do the "pseudo cursor" update to determine which rows to keep

    UPDATE #MyHead

    SET @PrevKeep = KeepMe = CASE WHEN SessionID = @PrevSessionID

    AND DATEDIFF(mi,@PrevKeepDate,TheDate)>30

    THEN 1

    WHEN SessionID <> @PrevSessionID

    THEN 1

    ELSE 0

    END,

    @PrevKeepDate = TheDate,

    @PrevSessionID = SessionID

    FROM #MyHead WITH(INDEX(0)) --Forces a clustered index scan to maintain the update order

    PRINT 'Update Completed ' + Cast(DateDiff(ss,@StartTime, GetDate()) as varchar(10))

    + ' secs, ' + Cast(@@RowCount as varchar(10)) + ' rows updated'

    --======================================================================================

    -- Return the required output as a single result set

    --======================================================================================

    --===== Return the results

    INSERT INTO dbo.[SiteVisits]

    (

    Date,

    [Hour],

    Fiveminute,

    Visits,

    Method

    )

    SELECT @pDate AS Date,

    DATEPART(hh,TheDate) AS [Hour],

    DATEPART(mi,TheDate)/5 AS [FiveMinute],

    COUNT(*) AS TheCount,

    'JM'

    FROM #MyHead

    WHERE KeepMe = 1

    AND DateOnly = @pDate

    GROUP BY DATEPART(hh,TheDate),

    DATEPART(mi,TheDate)/5

    PRINT 'Results returned ' + Cast(DateDiff(ss,@StartTime, GetDate()) as varchar(10)) + ' secs'

    --==============================================================================

    -- Test Harness... uncomment this section for on screen testing.

    --==============================================================================

    -- SELECT SessionID,

    -- TheDate,

    -- DateOnly,

    -- KeepMe,

    -- DATEPART(hh,TheDate) AS [Hour],

    -- DATEPART(mi,TheDate)/5 AS [FiveMinute]

    -- FROM #MyHead

    -- WHERE KeepMe = 1

    -- ORDER BY SessionID, TheDate

    Detailed Results

    Start AggregateSiteVisits2 - Jeff Moden 0 secs

    #MyHead Created 1113 secs, 17381996 rows inserted

    Index Created 1237 secs

    Update Completed 1641 secs, 17381996 rows updated

    Results returned 1653 secs

    Join table to itself

    CREATE PROCEDURE [dbo].[AggregateSiteVisits3]

    @Date datetime

    AS

    SET NOCOUNT ON

    DECLARE @StartTime datetime

    SET @StartTime = GetDate()

    PRINT 'Start AggregateSiteVisits3 - Matt Whitfield '

    + Cast(DateDiff(ss,@StartTime, GetDate()) as varchar(10)) + ' secs'

    DECLARE @ASPString nvarchar(100)

    DECLARE @Len int

    DECLARE @DateMinus1 datetime

    DECLARE @ElevenThirty datetime

    SET @ElevenThirty = '1899-12-30 23:30:00.000'

    SET @ASPString = 'ASP.NET_SessionId='

    SET @Len = Len(@ASPString)

    SET @Date = Convert(varchar, @Date, 112)

    SET @DateMinus1 = DateAdd(dd, -1, @Date)

    CREATE TABLE #sessions(ID int IDENTITY(1,1) primary key clustered, EventMinuteIndex smallint, ASPSessionID varchar(50))

    PRINT 'Created Table #sessions ' + Cast(DateDiff(ss,@StartTime, GetDate()) as varchar(10)) + ' secs'

    INSERT INTO #sessions (EventMinuteIndex, ASPSessionID)

    SELECT EventMinuteIndex, ASPSessionID FROM

    (

    SELECT CAST(SUBSTRING(Cookie,1,CHARINDEX(';',Cookie)-1) AS VARCHAR(50)) AS ASPSessionID,

    datediff(minute, @date, TheDate+2.0) AS EventMinuteIndex

    FROM (--==== Subquery does some of the calcs and isolates the start of the SessionID

    SELECT Date+Time AS TheDate,

    SUBSTRING([cs(Cookie)],CHARINDEX('ASP.NET_SessionId=',[cs(Cookie)])+18,8000)+';' AS Cookie,

    Date

    FROM dbo.WebLogEvents WITH (NOLOCK)

    WHERE (Date = @Date

    OR (Date = @Date-1 AND Time >= '1899-12-30 23:30'))

    AND [cs(Cookie)] LIKE '%ASP.NET_SessionId=%'

    )p1

    ) itbl

    ORDER BY ASPSessionID, EventMinuteIndex

    PRINT 'Inserted #sessions ' + Cast(DateDiff(ss,@StartTime, GetDate()) as varchar(10)) + ' secs, '

    + Cast(@@RowCount as varchar(10)) + ' rows inserted'

    INSERT INTO dbo.[SiteVisits]

    (

    Date,

    [Hour],

    Fiveminute,

    Visits,

    Method

    )

    SELECT @date as Date, hr, fm, count(*), 'MW' FROM

    (

    select eventminuteindex / 60 as hr, (eventminuteindex / 5) % 12 as fm FROM

    (

    SELECT s1.eventminuteindex,

    CASE WHEN s1.eventminuteindex - s2.eventminuteindex > 30

    THEN 1

    ELSE 0

    END as enoughTimePassed,

    CASE WHEN s1.aspsessionid = s2.aspsessionid

    THEN 0

    ELSE 1

    END as SessionDifferent

    FROM #sessions s1 INNER JOIN

    #sessions s2

    on s1.ID = s2.ID + 1

    UNION ALL

    -- union the first row in, as the above join will never pick it up

    SELECT s1.eventminuteindex, 1 as enoughTimePassed, 1 as SessionDifferent

    FROM #sessions s1 where id = 1

    ) itbl

    WHERE EventMinuteIndex >= 0

    and (EnoughTimePassed + SessionDifferent) > 0

    ) otbl

    GROUP BY hr, fm

    PRINT 'Inserted SiteVisits ' + Cast(DateDiff(ss,@StartTime, GetDate()) as varchar(10)) + ' secs, '

    + Cast(@@RowCount as varchar(10)) + ' rows inserted'

    Detailed Results

    Start AggregateSiteVisits3 - Matt Whitfield 0 secs

    Created Table #sessions 0 secs

    Inserted #sessions 1305 secs, 17381996 rows inserted

    Inserted SiteVisits 1330 secs, 288 rows inserted

  • So, collectively we fixed a logic funny for you, and made it quicker to boot.

    Go team! 😛

    Ultimate props to Jeff on this one - I think the main key here was the 'divide and conquer' method of extracting the rows in the first place!

    Atlantis Interactive - SQL Server Tools
    My blog[/url]
    Why I wrote a sql query analyzer clone

  • Matt Whitfield (1/9/2009)


    So, collectively we fixed a logic funny for you, and made it quicker to boot.

    Go team! 😛

    Ultimate props to Jeff on this one - I think the main key here was the 'divide and conquer' method of extracting the rows in the first place!

    The main effect of splitting the ASPSessionId extraction into two parts ('divide and conquer') is a reduction in CPU usage.

    I'm not sure of the effect on IO as it might have to create an internal temporary table of the inner select and insert that the main temporary table. On a machine with a faster CPU and slower disk the complicated expression might be faster and on a machine with a slower CPU and faster disk the 'divide and conquer' might be quicker.

  • Matt Whitfield (1/9/2009)


    So, collectively we fixed a logic funny for you, and made it quicker to boot.

    Go team! 😛

    "Go Team" is right!

    If I may, please consider what just happened here... we went from a "can't possibly be done without a cursor because..." problem, to a couple of different set based solutions, one of which was about a 30% improvement in performance... and those were done with a relatively small number of test rows compared to the larger problem and no huge effort towards index tuning was done. Was the cursor a bad thing for this solution? Eh, not really... it actually did pretty well, all things considered. The key is that, especially in SQL Server 2k5 and above, there is nothing that really needs a cursor and a 30% improvement is probably worth going for. Certainly, we learned more of what is possible and a couple of other tricks just by taking the time to science out the set base solutions.

    Well done one and all! JacRoberts, thank you for your testing and for letting us have at it with your data. 🙂

    Remember... if you're gonna practice the piano, no sense in intentionally hitting the wrong keys. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (1/9/2009)


    Remember... if you're gonna practice the piano, no sense in intentionally hitting the wrong keys. 😀

    Unless you are one of these:

    http://www.youtube.com/watch?v=9nNGlaiVypU

    http://www.youtube.com/watch?v=XooTrS440Us

  • jacroberts (1/9/2009)


    Jeff Moden (1/9/2009)


    Remember... if you're gonna practice the piano, no sense in intentionally hitting the wrong keys. 😀

    Unless you are one of these:

    http://www.youtube.com/watch?v=9nNGlaiVypU

    http://www.youtube.com/watch?v=XooTrS440Us

    LMAO!!! Oh My! That second one reminds me of the following sequence...

    1. Managers get together with business analysts to figure out what they're going to do. Of course, they take the wrong steps and they take too long to do it.

    2. Project starts, but someone forgot to tell the developer they started.

    3. Developer gets ready to start but has no visibility with management so doesn't start on time.

    4. Developer finally gets started, writes a cursor, and then fights with management to justify it.

    5. DBA try's to show how to write it. Developer fights with DBA to justify the cursor.

    6. Management acquiesces to keep the show rolling.

    7. Everybody on the management team applauds because they don't know what the hell is going on… they're just happy that something happened.

    😛

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (1/3/2009)


    OK... here we go. --===== Do the "pseudo cursor" update to determine which rows to keep

    =====================================================================================================================

    UPDATE #MyHead

    SET @PrevKeep = KeepMe = CASE WHEN SessionID = @PrevSessionID

    AND DATEDIFF(mi,@PrevKeepDate,TheDate)>30

    THEN 1

    WHEN SessionID <> @PrevSessionID

    THEN 1

    ELSE 0

    END,

    @PrevKeepDate = TheDate,

    @PrevSessionID = SessionID

    FROM #MyHead WITH(INDEX(0)) --Forces a clustered index scan to maintain the update order

    =====================================================================================================================

    Jeff,

    First of all, I am VERY impressed with how fast you actually created the result (not how fast it run... how fast you did the coding). Of course, as you mentioned in a later post, most of this procedure is remarks...

    Secondly, can you please explain WTH is going on with this "pseudo cursor" / "quirky update"??? It's not until we can understand it that we can truly learn how to utilize it ourself.

    Thanks,

    Wayne

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • Hi Wayne,

    I'm not sure what you mean by "how fast you did the coding". Lots of folks posted and I was just one of them... In fact, I thought I posted later than most of the folks. But, thank you for the nice compliment.

    For the "quirky update"... I'm actually in the process (or at least, trying to get there) or rewritting the article on it. It's a pretty long article explaining a lot of the "gazintas" to use it, but here's the super short version.

    Imagine a spreadsheet for a checkbook and it's sorted in the correct order. In each row, you have a formula that takes the previous row's running balance, add or subtracts the current transaction and puts the answer on the current transaction row. You can also probably imagine that as a file that you're looping through. Read a row, process the row, "remember" a couple of key things about that row, move to read the next row, repeat until done.

    The "quirky update" (called a "pseudo-cursor" because it processes one row at a time without an explicit cursor) works exactly the same way. In order to "remember" stuff from the previous row, it uses the form of UPDATE table SET @var=col=expression. That evaluates the expression, updates the column, and remembers the result of that expression in one tight little line. Behind the scenes closer to the machine language level, UPDATE is looping through the rows just like an explicit cursor but much, much faster.

    The obvious problem with all of that is the order of the data. There's no ORDER BY that you can use with an UPDATE and even if you use a derived table (subquery that's used like a table) that has a TOP 100 PERCENT and an ORDER BY, it won't work. Both the "problem" and the solution is that if the clustered index goes through a scan (and it usually does, but I force it with an index hint), the order of the rows processed by the update will be in the same order as that of the clustered index even if there are thousands of page splits.

    The reason why I'm rewritting the article is because I made a mistake... I said that same technique would work on SELECTs and, patently, it doesn't. At least in SQL Server 2000 it doesn't. It does seem to work in 2k5, but I'm not ready to trust it.

    By the way... if you ask around, the other MVP's want to kill Phil and I for this method. They insist there is no guarantee that it works but, as Phil Factor (who's also written about the "quirky update") will attest, it's worked since T-SQL was known as "Sybase" and, when some simple rules are followed, no one has been able to break it yet. And yes, the "UPDATE table SET @var=col=expression" is documented in Books Online under UPDATE... just not the "quirky update". Even the folks at Microsoft say "no guarantee"... but, again, no one has ever shown me (or Phil, I believe) any code where it breaks.

    If someone is worried about it breaking and they don't want to put any checks in place, then they should use a cursor or while loop to do it... it's stable and, although it takes a lot longer, it is the second fastest method and it will accept an ORDER BY.

    By the way... on my almost 7 year old desktop box, the "quirky update" will do a million row running balance in less than 7 seconds. Try that with a cursor or while loop. And, it has a ton of other uses like doing grouped ranking (like ROW_NUMBER() OVER in 2k5) in SQL Server 2000, data "smears" (carrying data forward from the previous row), and a bunch more.

    Of course, the final caveat is that people claim that a CLR will do the trick even faster than that. I've not used VB in about 7 years and I've never used "C", so I'll just take their word for it.

    I sure hope that answers your question... I didn't want to write the whole article here. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Would like to replace cursor with while loop in the following stored procedure or at least compare

    performance with while loop. Can someone supply me with while loop non-cursor T-SQL statement ?

    /*

    In table dbo.tmpTable1, LocationDateTime is polulated with data and some fields have

    EventType IN (1) and some fields have EventType IN (0).

    The purpose of this stored procedure is to

    polulate LocationDateTime column with data (date) where the EventType IN (1),

    polulate EndLocationDateTime column with data (date) where the EventType IN (0),

    from table dbo.tmpTable1.

    */

    CREATE PROCEDURE [dbo].[proc_xx]

    @iSubsID int,

    @dStartDate datetime = Null,

    @dEndDate datetime = Null,

    @iTimeOffset int = 120,

    @iUserID int

    AS

    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

    SET NOCOUNT ON

    DECLARE @tmpTable TABLE

    (

    ItemId VARCHAR(20),

    LocationDateTime DATETIME,

    EndLocationDateTime DATETIME

    )

    DECLARE @lUserID INT,

    @ldtStartDate DATETIME,

    @ldtEndDate DATETIME,

    @liSubsID INT,

    @liTimeOffset INT,

    @ItemId VARCHAR(20),

    @LocationDateTime DATETIME,

    @EventType INT,

    @lvcItemId VARCHAR(9)

    SET @lUserID = @iUserID

    SET @ldtStartDate = ISNULL(@dStartDate,DATEADD(month,-1,DATEADD(mm, DATEDIFF(mm,0,GETDATE()), 0)))

    SET @ldtEndDate = ISNULL(@dEndDate,DATEADD(ms, -3, DATEADD(mm, DATEDIFF(mm, 0, GETDATE()), 0)))

    SET @liTimeOffset = @iTimeOffset

    IF @iSubsID = -1

    BEGIN

    SET @liSubsID = NULL

    END

    ELSE

    BEGIN

    SET @liSubsID = @iSubsID

    END

    DECLARE Travel_cursor CURSOR

    FOR

    SELECT

    ItemId,

    LocationDatetime,

    EventType

    FROM

    dbo.tmpTable1

    ORDER BY

    LocationDatetime

    OPEN Travel_cursor

    FETCH NEXT FROM Travel_cursor INTO

    @ItemId,

    @LocationDateTime,

    @EventType

    WHILE @@FETCH_STATUS = 0

    BEGIN

    IF @EventType IN (1)

    BEGIN

    INSERT INTO @tmpTable

    (

    ItemId,

    LocationDateTime,

    EndLocationDateTime

    )

    VALUES

    (

    @ItemId,

    @LocationDateTime,

    NULL

    )

    END

    ELSE

    BEGIN -- This part updates EndLocationDateTime

    UPDATE

    @tmpTable

    SET

    EndLocationDateTime = @LocationDateTime

    WHERE

    ItemId = @ItemId

    AND EndLocationDateTime IS NULL

    AND LocationDateTime < @LocationDateTime

    END

    FETCH NEXT FROM Travel_cursor INTO -- This part supplies value for @LocationDateTime

    @ItemId,

    @LocationDateTime,

    @EventType

    END

    SELECT * FROM @tmpTable

    CLOSE Travel_cursor

    DEALLOCATE Travel_cursor

    /*

    exec proc_xx 47579,'2009/11/01','2009/11/30 23:59:59',120,16899

    */

    GO

  • I have two recommendations on that, Clive... first, post your question as a new post on the T-SQL forum because it'll will like get a better response there instead of as a single post on an article thread (like this one) might get.

    Second, if all you want to do is replace a Cursor with a While Loop, then don't bother because you won't gain a thing in perfermance compared to a Read Only Forward Only Cursor.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • 1) while loop is not a replacement for cursor. both are row-by-row processing.

    2) if you ARE going to use a cursor, please declare it either FAST_FORWARD or READ_ONLY STATIC LOCAL FORWARD_ONLY (although some of those are redundant they don't urt). Hugo Kornelius did a wonderful set of benchmarks to determine various cursor declaration performance.

    3) neither RBAR method is required for what you seek to do. I am almost 100% certain that this can be backed up to the insert that populates your temp table, in which case this entire sproc as well as the temp table would be unnecessary. however, here is some code to get you started on refactoring your sproc. you may desire a min for start. also, the max stuff can be avoided entirely if you can guarantee only one record of each type for each itemid.

    create table #temp (itemid smallint, locationtime datetime, eventtype bit)

    insert #temp values (1, '2/2/2009', 1)

    insert #temp values (1, '2/4/2009', 0)

    insert #temp values (1, '2/5/2009', 0)

    insert #temp values (2, '3/3/2009', 1)

    insert #temp values (2, '4/4/2009', 0)

    select itemid

    ,max(case eventtype when 1 then locationtime else null end) as starttime

    ,max(case eventtype when 0 then locationtime else null end) as endtime

    from #temp

    group by itemid

    I would imagine this method will be 1 to 2 orders of magnitude more efficient than cursor/while loop stuff. 😉

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Jeff Moden (12/8/2009)


    I have two recommendations on that, Clive... first, post your question as a new post on the T-SQL forum because it'll will like get a better response there instead of as a single post on an article thread (like this one) might get.

    Second, if all you want to do is replace a Cursor with a While Loop, then don't bother because you won't gain a thing in perfermance compared to a Read Only Forward Only Cursor.

    I think my post is all the OP should need. 🙂

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Heh... I agree, Kevin... about cursors and while that's a very good thing, the op learns nothing about the other faults in the code especially the 23:59:59 thing. 😉

    Clive, I still recommend that you post this where more people can see it... That's just one excellent post out of a dozen that you'll likely get if you post it on the T-SQL forum. Make sure you let folks know which version of SQL Server you're using, was well (normally done by posting to the correct forum).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Hi Jeff, posted as suggested.

    Post #831649.

    Topic: Replace cursor with while loop / compare cursor with while loop / optimize...

  • clive-421796 (12/9/2009)


    Hi Jeff, posted as suggested.

    Post #831649.

    Topic: Replace cursor with while loop / compare cursor with while loop / optimize...

    I seem to be unable to locate said post. Anyone put a direct link here? not sure of the mechanism to use that post number.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 15 posts - 211 through 225 (of 272 total)

You must be logged in to reply to this topic. Login to reply