T-Sql rant

  • Bob Hovious (3/18/2009)


    Jeff, I warned Fox he would hear about triangular joins one day.

    Let us get the man weaned off cursors first !! 😉

    The method I've been suggesting to generate sequence numbers is a triangular join and above a few thousand records (on my machine) performance falls off badly. Below 1000 records, it easily beats a cursor (but doesn't come close to using a temp table with IDENTITY or row_number() which are unbeleivably fast [less than half a second for 50000 records and only a couple of millisecond of cpu time).

    Since I use SS2k5, my preferred solution, if I had to generate sequence numbers, would be to use row_number() as it's faster and doesn't have the io overhead. On SS2K, where that's not available, I'd probably use a temp table with an IDENTITY column as the performance is only slightly worse.

    Of course, I'm with Lynn in wanting to know the business reason why you need sequence numbers and why you explicitly said you didn't want to use IDENTITY.

    Derek

  • Fox, let us in on something. Are you having to do a conversion of some existing code? Was all of this pregenerated for you, so that now you are trying to improve it's performance?

    The reason everyone is asking you about the sequence numbers is that most of us just don't use them. Your questions indicate that you are trying to solve a problem by forcing records to be in a particular sequence. In real life, that sequence is usually dictated by other items in your database, so we never worry about sequencing until time for output (if then).

    The closest real life example I can think of where a user could assign an arbitrary number at their whim was a situation where the user could set a priority. The priorities were 1, 2, or 3. But there was/is always a tiebreaker. If not, who cares if there are multiple rows with a 2 in that column?

    In real life we usually put a dateCreated (and often a dateLastUpdated) column in our major tables. In my example, when we pulled rows for processing, it was always by priority, and dateLastCreated within priority. We just as easily could have allowed 5 decimals so that a priority of 1.5 could be assigned between 1 and 2, 1.55 between 1 and 1.60, then 1.57, 1.575,1.576, 1.5755 etc. That was finer control than anyone ever needed.

    We're still trying to get you out of the old cursor mindset to see how we actually tackle problems.

    We're thinking "dig a big hole" and you still seem to be thinking "dig a big hole with a spoon" 😉

    Don't think this is just picking on you. LOTS of questions are asked here every day and the first response is "Why do you want to do that?" If someone tells us they want to jump off a cliff, we hesitate to give them directions.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Jeff Moden (3/18/2009)


    JacekO (3/18/2009)


    I would keep it simple. Deal with one record at a time. If your Stored Procedure is designed to work fine with one record you can call it sequentially with all the other records. The user interface's job should be to feed you the data the way you want it. If you design you database interface to accept one or two or more orders at the same time you are creating more problems than you probably are willing to solve.

    (And then you will have a reason to rant about SQL...)

    Although not simpler and only if you need to insert more than just a couple of rows, a sequence table with an increment as a parameter in the feed would probably work better. In SQL Server 2000, it would also require the use of a temp table, but it's much faster than RBAR methods. It also prevents hot-spotting, duplicated numbering, and deadlocks if the proc that does the work is written correctly.

    I might be wrong because I did not read carefully all the replies on this thread, but I think this is a typical one at a time input database application with a C# frontend. So introducing concepts similar to what you mention in your post will drown this guy before he has a chance to jump into the water. He should do as much as work as possible in C# and use the database to store data and do some basic manipulation like resorting his sequence numbers.

    I do agree with Sergiy that we bombard this poor soul with so many new ideas that it does not make sense anymore. Let him take baby steps now and then maybe one day he is going to run.

    I am a strong proponent of using the right tools to do the right job but here there is also the issue of the master.

    ---------------------------------------------
    [font="Verdana"]Nothing is impossible.
    It is just a matter of time and money.[/font]

  • STUFF

    There is a lot of stuff I have read, and want to address them for those that are curious.

    I have had extensive experience with set based queries in the past, and currently. Having experience with sql server since 1995, table locking and such things and multi-user server is not new to me at all. Although sometimes I struggle with certain concepts that I have infrequently used, (sub queries) for one. I have rarely had the need to use them.

    Business case for identity insert not being in a couple of my important tables. The reason is that I have used bands of identities, and certain types of data sources that will have identeties between 1 and 10million , 10 million and 20 million, 20 and 30. This is just one scenario.

    Another scenario is where my primary key is computername+'rowcount'. The reason I use this scenario is that I have 300 users that sync data back to the server (up and down). And so I have a function that figures out what the next rowcount that the next ID will be as they insert records into the database.

    I will address the rest in a bit.

    I finally have my computer back, and have been installing stuff all day. When I come accross the problem with updating multiple items (where sql was complaining) I will examine it and send it to the forum for those that are interested.

  • You obviously don't need my help then.

    I'd still advise you to just read, and follow, the Forum Etiquette[/url] article so that you completely lay out your questions in the future.

    What say we ask Steve to close this thread, and you can just post individual questions in separate threads like everyone else?

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • I have to concur with Bob, but I would still like to know if you still think that cursors are the only way to solve problems or if we have at least convinced you that set based solutions are better and do exist if you take the time to look for them. it would also be nice to see you complete that assignment Bob set out for you earlier, even if in another thread.

  • Steve Jones - Editor (3/17/2009)


    Steve Dassin,

    I haven't complained about the advertising of your product because it's free, and because you've only mentioned it in places where you appear to believe it fits the conversation. you're welcome to start a thread somewhere and debate it.

    Thank you. Your courtesy and judgment are appreciated.

    Steve Jones - Editor (3/17/2009)


    Language matters. The language you used offended people (not me), so I censored it for that reason. It's not a question of maturity to be offended by your choice of words. It's simply a question of attitude and feelings. Please choose different metaphors.

    Here's right back at you:-) I'm reassured that my judgment about you was correct. As someone who appreciates the written word and appreciates the effort that some put into it, I abhor censorship. But I do now realize that you take on the responsibility for the sensibilities of your readers. While it's not easy to paint a picture in words that will grab the reader when you're swimming upstream, I will certainly choose my metaphors with more discretion.

    best,

    steve

    www.beyondsql.blogspot.com

  • Lynn, you have completely and utterly convinced me to look for a set based solution to the problems that come as part of the job.

    Cudos for that!

  • As far as the sequence number being a requirement in a table of a db we are in the process of re-designing. The business case is:

    We have applications that go on vehicles (hoses, for instance).

    These hoses are applied to the vehicle in a certain order. Which is describe by a routing system like:

    pipe 1 to pipe 2, pipe 2 to alternator.

    These routed applications we publish in the Gates catalog (napa catalog, carquest catalog).

    You can probably find one of these at your part store.

    When our users populate this data, they need to put it in a sequence. We use the sequence to publish the data.

  • foxjazz (3/19/2009)


    Business case for identity insert not being in a couple of my important tables. The reason is that I have used bands of identities, and certain types of data sources that will have identeties between 1 and 10million , 10 million and 20 million, 20 and 30. This is just one scenario.

    It's not a busyness case. It's your implementation.

    Based on human approach to deal with data.

    For computer there is no difference between 5 mil and 15 mil. The difference is in your head, and I bet you hardcoded the logic in procedural code, just to make sure that your application will fail if number of items for one of the types exceeds 10 mil.

    Clustered primary key on (TypeID, IdentityCol) would do exactly the same, but with no need of procedural approach with hardcoded values.

    Another scenario is where my primary key is computername+'rowcount'. The reason I use this scenario is that I have 300 users that sync data back to the server (up and down). And so I have a function that figures out what the next rowcount that the next ID will be as they insert records into the database.

    You're probably using SQL for decades but you still did not learn basics about data storage.

    For example, 1st Normalization Rule.

    Even if somebody told you about it you don't use it anyway.

    The key like yours does not make any sense to computer system. It operates with binary values anyway, and 4 bytes of integer aren't any different than 4 bytes from computer name. Except varchar as PK adds extreme overhead for data processing.

    So, again, there is no business case for no using IDENTITY. It's only your choice, your preferences, based on procedural, human-like approach to deal with data.

    Relational databases are based on very good math model, you need to study math to some extent before you may be allowed to make design decisions in relational databases.

    Read normalization rules and try to ask yourself "why?" What are the grounds for each of them?

    It will probably direct you to what's missing in your education.

    _____________
    Code for TallyGenerator

  • foxjazz (3/19/2009)


    Lynn, you have completely and utterly convinced me to look for a set based solution to the problems that come as part of the job.

    Cudos for that!

    Thank you for the feedback. It is much appreciated. :blush:

  • Sergiy, the quote button is taking me to the wrong post, but you said

    Except varchar as PK adds extreme overhead for data processing.

    in your last post. Do you know of any good references to learn more about what you are saying? I have a VARCHAR(20) column as the PK for many of my tables, and I never thought twice about it.

    Greg
    _________________________________________________________________________________________________
    The glass is at one half capacity: nothing more, nothing less.

  • foxjazz (3/19/2009)


    When our users populate this data, they need to put it in a sequence.

    Why?

    We use the sequence to publish the data.

    I believe

    pipe1 1

    pipe1 2

    pipe1 3

    will be published in he same order as

    pipe1 1234

    pipe1 54321

    pipe1 4543345

    Number 4543345 is probably harder for you to figure out then number 3 (and me, of course, we both are humans) but for a computer they are both 4-byte binary values.

    _____________
    Code for TallyGenerator

  • [font="Verdana"]One common business case for generating sequences is for invoice numbers. For some reason, customers like to see their invoice numbers being counted just for them. So invoice numbers are often in the form: [XXXX]-[YYYYY] where X is related to the customer number and Y is the number of the invoice counting from one for that customer.

    That's one example. There are usually a myriad of reasons for generating sequence numbers, and they are almost always related to being "people friendly" numbers.

    Identities are not good choices for that. Of the other hand, that doesn't mean that they require a cursor.

    [/font]

  • Steve D, thanks for the reply

Viewing 15 posts - 286 through 300 (of 465 total)

You must be logged in to reply to this topic. Login to reply