There Must Be 15 Ways To Lose Your Cursors... part 1, Introduction

  • Wow, just catching up on the thread.....surprisingly I have a "real world job to do" 🙂

    Thanks Matt for your posts and GSquared....what can I say....that's some pretty harsh things you're saying about people and situations you know nothing about....maybe some anger management classes or something.....

    I'm all for using set based solutions. I'm no fan of cursors. I posted a problem "I inherited" that uses a cursor and I can think of no way to change it to set based and was hoping others might have an idea. The product is installed in over 100 sites and a major schema change to eliminate this one particular cursor solution just isn't in the cards at the moment. And gee guess what, I've been doing this for literally "decades" also, 26 years to be exact and I have a pretty good handle on SQL at this point, though still learning every day.;-)

    Here's the point I was trying to make. Some people work in an environment where there are time constraints and other issues completely out of their control, so wringing every last millisecond out of every last delete/update/insert/select would take more time than they have. So they optimize the critical functionality in order to deliver on time, that's all. The non-critical stuff is written with good SQL just not tweaked to the nth degree.

    I think everyone can agree there is more than one way to code a solution! Heck just look at the number of different solutions folks have come up with for getting rid of the simple cursor put forth in the article! How many of you have written the perfect, fastest version of a solution on the first try..." other than GSquared of course who gets it right the first time all the time!" Sorry couldn't resist, my bad! How many of you have revisited a solution several times and tweaked it here and there maybe even rewritten it...hmmm I think that's called refactoring! The point is they're not bad developers, ignorant of proper SQL or lazy, they just have to prioritize what needs to be optimized.

  • I cannot speak for the others, but I'm not talking about code optimized to the nth degree... I'm talking about code that has been intentionally made slow because the developer actually was ignorant of proper SQL and that normally occurs because they're either brand spanking new, or because they never studied SQL enough to find out the proper SQL. The latter part of that has all of the implications that the others are saying out loud instead of just implying.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • bruce.trimpop (4/24/2009)


    Here's the point I was trying to make. Some people work in an environment where there are time constraints and other issues completely out of their control, so wringing every last millisecond out of every last delete/update/insert/select would take more time than they have. So they optimize the critical functionality in order to deliver on time, that's all. The non-critical stuff is written with good SQL just not tweaked to the nth degree.

    ...they just have to prioritize what needs to be optimized.

    Everybody has to work within time constraints and choose their priorities.

    Many people don't optimize their queries because they are in a rush to get a project out and "it's just one stored procedure," or "it won't get hit too often."

    Well, guess what, all those "one stored procedure"s add up, your volumes increase so that "won't get hit" procedures are being run all the time -- and then you have massive performance problems. And let me tell you, it's not pretty. Now there are so many stored procedures that need to be optimized and it is practically impossible to fix all of them.

    Sometimes taking out the extra time taken now saves you in the long run.

    I'm not talking about optimizing to the "nth degree" as you say, I'm talking about making a reasonable effort.

    I know many people who don't even bother at all.

  • Jeff Moden (4/24/2009)


    ..... I'm talking about code that has been intentionally made slow....

    That would seem to imply something malicious on the developers part "intentionally made slow" ?

    That aside, as others have stated, if you are new to SQL and your company has you coding complicated tasks in critical applications then ya shame on that company and they probably won't be in business very long. If someone knows SQL reasonably well and they are "intentionally" writing bad code then they need to be fired. So bottom line, we're saying the same thing I think, just from a different angle.

  • Goldie Graber (4/24/2009)


    bruce.trimpop (4/24/2009)


    Here's the point I was trying to make. Some people work in an environment where there are time constraints and other issues completely out of their control, so wringing every last millisecond out of every last delete/update/insert/select would take more time than they have. So they optimize the critical functionality in order to deliver on time, that's all. The non-critical stuff is written with good SQL just not tweaked to the nth degree.

    ...they just have to prioritize what needs to be optimized.

    Everybody has to work within time constraints and choose their priorities.

    Many people don't optimize their queries because they are in a rush to get a project out and "it's just one stored procedure," or "it won't get hit too often."

    Well, guess what, all those "one stored procedure"s add up, your volumes increase so that "won't get hit" procedures are being run all the time -- and then you have massive performance problems. And let me tell you, it's not pretty. Now there are so many stored procedures that need to be optimized and it is practically impossible to fix all of them.

    Sometimes taking out the extra time taken now saves you in the long run.

    I'm not talking about optimizing to the "nth degree" as you say, I'm talking about making a reasonable effort.

    I know many people who don't even bother at all.

    I'm sorry you know many people who don't even bother at all. Fortunately, I don't. Sounds like they need to be fired. And I agree with you. If you have people like that then yes you can end up with a real mess down the road. As far as taking the time...I sometimes don't have that luxury. I work under contracted legal obligations and deadlines that have to be met if I want to get paid so prioritizing functionality is an important issue with me. (oh, and they are not my contracts. They are contracts negotiated with the customer by the company I work for...just to forestall the argument that I should build more time into the contracts!:-))

    By the way can't wait for part 2 !!! 😉

  • bruce.trimpop (4/24/2009)


    Jeff Moden (4/24/2009)


    ..... I'm talking about code that has been intentionally made slow....

    That would seem to imply something malicious on the developers part "intentionally made slow" ?

    That aside, as others have stated, if you are new to SQL and your company has you coding complicated tasks in critical applications then ya shame on that company and they probably won't be in business very long. If someone knows SQL reasonably well and they are "intentionally" writing bad code then they need to be fired. So bottom line, we're saying the same thing I think, just from a different angle.

    Heh... correct... you and I actually agree, just from different angles. If you write a cursor/While Loop knowing full well that there's a better way, then (IMHO) it is "malicious" (or, maybe, lazy) and, I agree, they need to be fired (or, better yet, educated). And I absolutely agree... shame on the company that gives a newbie a critcal task to write code for without the benefit of a mentor.

    I realized a long time ago, that a developer under schedule pressure will usually do anything to get the job done quickly and that is sometimes detrimental to data safety, performance, and/or scalability. They key is "education" because, first, you have to be taught or realize that there is a better way. If developers can be shown how to quickly think of methods to avoid cursors for given tasks, the time spent doing that will be well worth it to the company over the long haul.

    Part of that education comes from articles like this one.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (4/24/2009)


    bruce.trimpop (4/24/2009)


    Jeff Moden (4/24/2009)


    ..... I'm talking about code that has been intentionally made slow....

    That would seem to imply something malicious on the developers part "intentionally made slow" ?

    That aside, as others have stated, if you are new to SQL and your company has you coding complicated tasks in critical applications then ya shame on that company and they probably won't be in business very long. If someone knows SQL reasonably well and they are "intentionally" writing bad code then they need to be fired. So bottom line, we're saying the same thing I think, just from a different angle.

    Part of that education comes from articles like this one.

    Amen! Oops, need to leave religion out of "public" education !:-D

  • gautamsheth2000 (4/13/2009)


    Should use this code 🙂

    Select count(*) From master.sys.columns C1 CROSS JOIN master.sys.columns C2

    That's far too slow 😉

    Try instead:

    select power(count(*),2) from master.sys.columns

    Comparing equivalent code on SQL Server 2000 indicates that this takes about 25% of the time your cross join method takes, if there are about 5000 rows in the columns table.

    Tom

  • dbishop (4/14/2009)


    That is the point I was making. The author assumed that by changing his code from the individual steps (procedures) to Cookies, with chocolate chips, all of a sudden it was no longer procedural. I was trying to point out the falisy of that statement.

    The trouble with this statement is that Barry made it absolutely clear which sense of the word "declarative" he was using. There are numerous different definitions of "declarative", but Barry was very clear: declarative in his sense means specifying the result you want (chocolate chip cookies) rather than specifying a method of achieving it (make the mix, bake it). You've fallen into the trap of saying that "the desired result is cookies with chocolate chips" means the same as "make a cookie mix and bake it to get cookies with chocolate chips" but they are clearly different: the former leaves you the choice between buying a bag of cookies, cooking your own cookies, or commissioning someone else to cook them for you (and a few other methods, of course); the latter requires you to cook them, not buy them or get someone else to make them. So the former is not at all procedural - how can it be, when it doesn't constrain the procedure for achieving the end result, it only constrains the end result?

    However, I don't completely agree that SQL is actually declarative in Barry's sense, although a very large subset of it is. There are things that I can only express by an ordered sequence of statements - for example if I want to update several tables (unless there are some new language extensions I haven't heard about yet) - particularly if later updates depend on some characteristic of the earlier ones (total number of rows affected so far, for example). I do agree that cursors have only two functions: coping with the fact the 8000 is just too small (but that is fixed in SQL 2005) when I don't have a suitable a clustered unique index to avoid the cursor with, and chopping a job into smaller chunks (to allow other things a look-in) when again I haven't got a suitable index to do it with (doesn't need to be unique in this case, of course).

    Something that SQL could usefully acquire from other declarative languages is a MAP operator (to map a stored procedure or an extended stored procedure over a table where each row represents a set of parameters for it, with no defined order in the absence of an order by clause). This would be much cleaner than the current game of building a string of commands and using exec SQL on it, and on cases where the order clause was absent it would allow parallelism that isn't achievable with exec(SQL) (since teh execution of a series of SQL statements is serial, although parallelism can happen within an individual statement). It might also be good to pick up the REDUCE operator from the same school - seems rather better than the current "select @v = expr involving @v from ..." way of doing reduce, and may add extra flexibility (and why does @v in that have to be a scalar variable? perhaps it's because tables are no "first class objects" in SQL?).

    Tom

  • Girish Bhat (4/15/2009)


    Hi Thomas,

    Your point is well taken. For the specific example you have mentioned, you could use:

    sp_MSforeachtable

    @command1 = "Alter Table ? Alter Column Foo nvarchar(10)",

    @whereand = "and OBJECT_ID in (select object_ID from sys.columns where name = 'Foo')"

    sp_MSforeachtable is pretty awful, it calls sp_MSforeach_worker internally to use the cursor it declares; generally you will get far more efficient code by declaring your own cursor and using it instead of letting MS do it for you. Probably you will be more efficient still using dynamic SQL to avoid a cursor altogether.

    Tom

  • Tom.Thomson (4/29/2009)


    gautamsheth2000 (4/13/2009)


    Should use this code 🙂

    Select count(*) From master.sys.columns C1 CROSS JOIN master.sys.columns C2

    That's far too slow 😉

    Try instead:

    select power(count(*),2) from master.sys.columns

    Comparing equivalent code on SQL Server 2000 indicates that this takes about 25% of the time your cross join method takes, if there are about 5000 rows in the columns table.

    Nice one !

  • gserdijn (4/14/2009)


    select square(count(1)) From master.sys.columns C1

    Oops! I had missed that one.

    Nice !

  • RBarryYoung (4/15/2009)


    Just because pure set-based SQL may not be sufficient for a particular task does not mean that that justifies Cursors or loops in SQL. There are still a lot of miles between the two and a lot of other options that are not even on the same scale.

    Often the best solution is a loop - in fact I can't see a way of doing batching/chunking with delays between batches without using a loop. I usually express my loop as a schedule in MSDB, and I usualy use SQL to create that schedule rather than going through the job creation wizard, since the job may have to be created at every customer site and running some SQL is a lot less time consuming than using the wizard. Would it be reasonable to claim that I'm not writing loops in SQL? Maybe, but a loop is what I want for my solution and I'm telling the SQL system what loop I want, so if SQL as a declarative language that's writing a loop in SQL.

    I'm being a bit pedantic here, I know, but I,m doing it because think your over-strong denial of the usefulness of loops detracts from your message.

    Now if instead of loops you had said "while loops" that would be a different story, of course (I use those too sometimes, in SQL 2000, but only because 8000 isn't big enough no need in SQL 2005 and onwards).

    Tom

  • WayneS (4/18/2009)


    I just thought of something that I wish I had included with my original post.

    Programming languages have evolved. 15 years ago, you had plain old programming languages, complete with procedural ways of doing things. Then came object-oriented programming languages. To be a good OOP programmer, one had to start thinking of things as objects, and you had to deal with OOP principals, such as events, properties, encapusaltion and polymorphism. This thought shift did not occur overnight - you had to immerse yourself in it and work at it. Not all programmers could make this shift, and for others it took a long time before the light clicked on behind their eyes and they understood it.

    Off topic, bu IO can't let such basic nonsense pass.

    15 years ago is 1994. I could just mention that that's 10 years after Cardelli's paper on "The Semantics of Multiple Inheritance" but that doesn't really bring out the enormity of the ignorance of computer languages dispayed in the quoted text.

    Object Oriented languages are a bit older than 15 years : Simula(1962), Simula I (1965), Smalltalk(1971) are obvious examples. Languages allowing no visibility of the internal implementation of a type (ABstract Data Type languages, Algebraic Datatype languages) abounded in the 1960s (and this characteristic - often expressed as communication between objects only bt message passing - is the basis of the OO paradigm). Even the language most often (mis-)called Object Oriented today (C++) dates from 1982, a dozen tears before 1994, and Soundstrup's "C with Classes" upon which it was based was 1979. Besides, are functional languages (Lisp, 1958), logic languages (Prolog,1972), Actor languages (1970s), Process languages (CCS, 1980 - unless you count the 1978 paper-only version of CSP), Temporal Logic languages (before 1987 - i was using one then), the pi calculus (1989: at least half a dozen successor languages exist and are in use today, and I think this is on a rather different level from proces languages bilke CCS and CSP), mark-up languages (HTML -1991 - was based on SQML which in turn had it's roots in GML - early 60s).

    So how many of all these object oriented languages and declarative languages and so on do you want to say are "plain old programming languages, complete with procedural ways of doing things"? It's true of "C with classes" and C++ of course, but I don't think it's true of any of the others.

    Tom

  • Well stated... but doesn't that show that embracing OOP took even longer for most folks? 😉 I believe that was the real point trying to be made. :hehe:

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 15 posts - 316 through 330 (of 380 total)

You must be logged in to reply to this topic. Login to reply