Eliminating Cursors

  • samir (6/16/2008)


    Using this method, the SELECT query is executed 'n' number of times, where n is total number of records in the table. I dont think this is optimized solution. Instead we can use co-related queries or construct dynamic strings using single SELECT statement or a CROSS APPLY to avoid the use of cursors.

    My first reaction was - what "this method"? In a thread this big - it's hard to know what you're commenting on.

    That being said - correlated subs and/or CROSS APPLY (or OUTER APPLY) with a correlated TVF force the same row by row execution.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • I agree with Bob. Why write some unreadable code in order to avoid cursors? To my opinion, cursors are easy to implement, very straightforward and the extra cost of performance is nothing compared to the whole batchprocess in which it's being used.

    Tip: If you just use a cursor for looping through a table use this:

    declare c_cursorname cursor local fast_forward for

    select ...

    from ...

    where ...

    This will speed-up your cursor

    Wilfred
    The best things in life are the simple things

  • Wilfred van Dijk (6/16/2008)


    Why write some unreadable code in order to avoid cursors?

    Why do you think it must be unreadable?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Wilfred van Dijk (6/16/2008)


    I agree with Bob. Why write some unreadable code in order to avoid cursors? To my opinion, cursors are easy to implement, very straightforward and the extra cost of performance is nothing compared to the whole batchprocess in which it's being used.

    Tip: If you just use a cursor for looping through a table use this:

    declare c_cursorname cursor local fast_forward for

    select ...

    from ...

    where ...

    This will speed-up your cursor

    Wilfred, if you go back through the thread I think you will find a comment from me (maybe 2 of them) that this is NOT the fastest cursor. FORWARD_ONLY READ_ONLY STATIC LOCAL is the most performant. See here: http://sqlblog.com/blogs/hugo_kornelis/archive/2007/11/21/curious-cursor-optimization-options.aspx

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (6/16/2008)


    Wilfred van Dijk (6/16/2008)


    I agree with Bob. Why write some unreadable code in order to avoid cursors? To my opinion, cursors are easy to implement, very straightforward and the extra cost of performance is nothing compared to the whole batchprocess in which it's being used.

    Tip: If you just use a cursor for looping through a table use this:

    declare c_cursorname cursor local fast_forward for

    select ...

    from ...

    where ...

    This will speed-up your cursor

    Wilfred, if you go back through the thread I think you will find a comment from me (maybe 2 of them) that this is NOT the fastest cursor. FORWARD_ONLY READ_ONLY STATIC LOCAL is the most performant. See here: http://sqlblog.com/blogs/hugo_kornelis/archive/2007/11/21/curious-cursor-optimization-options.aspx%5B/quote%5D

    As the testing I did on Saturday also confirms. Testing script and results listed a page or two back.

    Keep in mind that from my experience - this kind of operation happens to be one of the LEAST "bad" times to use a cursor, since it simply involves reading data. Anything involving data to be posted back into a table, and the serial options will fall flat and start choking on even rather small sets.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Just to make sure everyone understands here, LEAST bad in this case still means greater than 1 ORDER OF MAGNITUDE greater runtime! The fastest TSQL cursor took 13.917 seconds on average while the set-based solution was .693 seconds. Yeah, that isn't the end of the world (assuming you don't need < 1 sec response time), but the aggregate overhead from the pervasive use of cursors can really crush a server's overall performance.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Here is yet another reason to avoid cursors altogether. I just got a link to this knowledge base article (thank you KBALERTZ).

    ATBCharles Kincaid

  • Not sure you can use that as a basis for avoiding cursors though. There have been many more bugs fixed (and likely many more exist) where set-based queries are processed slower in 2005 than they were in 2000. It is incredibly difficult (actually I think impossible) to do complete regression testing for something as complex as a query optimizer rewrite. 🙂

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Just did some testing with a 300,000+ record table and following some advice from past post concerning cursor optimization, I was able to achieve the following stats:

    CURSOR (LOCAL FAST_FORWARD) : 18sec.

    TABLE (with PRIMARY KEY set for identity field) 36sec.

    The tests were performed on SQL 2000.

    Besides the fact that the cursor solution is faster for the tests I performed, I do not see any advantage or disadvantage in using one solution or the other. In both cases, TSQL code were as readable as the other. I am just curious how the memory of the server is doing under the table solution with regards to a 300,000+ records load.

    Pierre Boucher

  • Besides the fact that the cursor solution is faster for the tests I performed, I do not see any advantage or disadvantage in using one solution or the other. In both cases, TSQL code were as readable as the other. I am just curious how the memory of the server is doing under the table solution with regards to a 300,000+ records load.

    I am going to play a bit of devils advocate from all sides. I may get some evil glares when I say this.

    Code Readability vs Performance is a situational thing. I am going to be short and blunt as I don't have a lot of time today.

    Some here are arguing on the side of readability and because of that they feel the cursor is better. I will very very hesitantly agree on code readability in the case of one offs where you are running a process just once or once in a while and performance is a non issue. We can let that one go.

    On the performance side, in a few cases, the code can get a little more complicated to read. If performance is going to be the issue I have to say so what and who cares how readable it is. 1 second here and there can easily compound itself fast with even as few as 10 concurrent users. Now introduce 1000s of customers an hour and you will end up with a site that fails in usability.

    All the tests we have shown here are single user tests. What happens when you introduce multiple concurrent users will be significantly worse.

    Pierre, I know I quoted your comment, I just want you to know this next comment is not directed at you. It is an overall generalization.

    To use code readability as an excuse over improved performance where it is needed to me is frankly a cop out. Properly document what you have done and you should have little to no problem maintaining your code. I know this may offend some and I am sorry if you take offense.

  • I totally agree with Kevin on the "Code Readability vs Performance" issue.

    Readability should not be the sole selection criteria for a solution vs another. Performance of code that runs live is much more important. At equal performance, I would pick the easiest to maintain. In any case, I would document it.

    My comment about the readability of both solutions was intended for pure code comparison. In this specific case, if performance were equal, I really do not know which one I would implement. I believe I wouldn't have any trouble maintaining either solutions.

    Hopefully, no one has been offended by Kevin's comment about documenting code properly. I was not. In fact, its good to be reminded once in a while. Thanks Kevin! 😉

    Pierre Boucher

  • Pierre Boucher (6/16/2008)


    Just did some testing with a 300,000+ record table and following some advice from past post concerning cursor optimization, I was able to achieve the following stats:

    CURSOR (LOCAL FAST_FORWARD) : 18sec.

    TABLE (with PRIMARY KEY set for identity field) 36sec.

    The tests were performed on SQL 2000.

    Pierre... this is a pretty long thread... what were the tests you did so we can try on our own?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Those tests consists of calculating the running total for one field in the table. Both versions does exactly the same thing.

    This is the cursor test ...

    [font="Courier New"]DECLARE @StartTime datetime

    DECLARE @StopTime datetime

    DECLARE @tworkhrs decimal(16,2),

    @total_tworkhrs decimal(16,2)

    DECLARE item_cursor CURSOR LOCAL FAST_FORWARD FOR

    SELECT tworkhrs

    FROM dbo.timecard

    SELECT @StartTime = GETDATE()

    OPEN item_cursor

    FETCH NEXT FROM item_cursor INTO @tworkhrs

    WHILE @@FETCH_STATUS = 0

    BEGIN

    SELECT @total_tworkhrs = @total_tworkhrs + ISNULL(@tworkhrs, 0)

    FETCH NEXT FROM item_cursor INTO @tworkhrs

    END

    SELECT @StopTime = GETDATE()

    SELECT DATEDIFF(second, @StartTime, @StopTime)

    [/font]

    ... and the table test

    [font="Courier New"]DECLARE @item_table TABLE (primary_key INT IDENTITY(1,1) NOT NULL PRIMARY KEY, tworkhrs decimal(16,2))

    DECLARE @item_category_counter INT

    DECLARE @loop_counter INT

    SELECT @StartTime = GETDATE()

    INSERT INTO @item_table (tworkhrs)

    SELECT tworkhrs

    FROM dbo.timecard

    SET @loop_counter = @@ROWCOUNT

    SET @item_category_counter = 1

    WHILE @loop_counter > 0 AND @item_category_counter <= @loop_counter

    BEGIN

    SELECT @tworkhrs = tworkhrs

    FROM @item_table

    WHERE primary_key = @item_category_counter

    SELECT @total_tworkhrs = @total_tworkhrs + ISNULL(@tworkhrs, 0)

    SET @item_category_counter = @item_category_counter + 1

    END

    SELECT @StopTime = GETDATE()

    SELECT DATEDIFF(second, @StartTime, @StopTime)

    [/font]

    I selected one of the tables we have in our development environment because of the number of records it contains (309,658). I marked in bold the differences with the original post, except for my specific code.

    Have fun!

    Pierre Boucher

  • Just a quick comment about the table based.

    Move @loop_counter out of the while into an IF statement outside the loop. Though its minimal, there is no point in running the @loop_counter > 0 comparison on every record when the variables value never changes and only needs to be checked once.

    Should look like this:

    IF @loop_counter > 0

    BEGIN

    WHILE @item_category_counter <= @loop_counter

    BEGIN

  • Pierre Boucher (6/16/2008)


    Those tests consists of calculating the running total for one field in the table. Both versions does exactly the same thing.

    Right up my alley... no need for any kind of RBAR for running totals... see the following...

    http://qa.sqlservercentral.com/articles/Advanced+Querying/61716/

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 15 posts - 196 through 210 (of 296 total)

You must be logged in to reply to this topic. Login to reply