Extended challenge for everyone - Answer

  • This is the answer to the challenge I posted HERE

    For those of you who didn't see the post, or actually didn't run the code (as I asked), here's the kicker.

    While I could probabely write an article on this, I will simply post a short answer to this question.

    So what does this do (slightly modified)?

    if object_id('Test') > 0

    DROP TABLE Test

    GO

    CREATE TABLE [Test] (

    [id] [int] NOT NULL ,

    [Name] [sysname] NOT NULL ,

    [Colid] [smallint] NOT NULL ,

    [Colid2] [int] NULL ,

    [Colid3] [int] NULL ,

    [LastId] [int] NULL ,

    CONSTRAINT [PK_Test] UNIQUE CLUSTERED

    (

    [id],

    [Colid],

    [Name]

    ) ON [PRIMARY]

    ) ON [PRIMARY]

    GO

    Insert into Test (id, name, Colid) Select Distinct C.id, C.name, C.Colid from dbo.SysColumns C inner join dbo.SysObjects O on C.id = O.id where O.XType not in ('FN', 'IF', 'TF')

    GO

    Declare @Int as int

    set @Int = 0

    declare @LastId as int

    set @LastId = -1

    Declare @WasParam as bit

    set @WasParam = 0

    update dbo.Test set @Int = Colid2 = case When @LastId <> id THEN 0 ELSE @Int END + 1, LastId = @LastId, @LastId = id

    GO

    Update T set T.Colid3 = dtColid3.Colid3 from dbo.Test T inner join

    (Select id, name, (Select count(*) as Total from dbo.Test T2 where T1.id = T2.id and T2.colid <= T1.colid) as Colid3 from dbo.Test T1) dtColid3
    on T.id = dtColid3.id and T.Name = dtColid3.Name
    GO
    Select * from Test where Colid2 <> Colid3

    --0 row(s) affected

    --proves it gives the same results

    Select * from Test

    DROP Table Test

    GO

    While almost everyone can figure out that I'm creating a table, shipping data from syscolumns, Declaring/setting variables, updating the table and then dropping it after showing the update, I haven't had anybody explicitly say what the update really did. A few did say that I was incrementing colid2 which is true, but incomplete.

    From BOL, you can learn that you can use update to not only set column values, but also variables (@something = SomeCol). While this doesn't seem really usefull, it gives an opportunity to create a query that can replace a task that would require 2 while loops on the client, or a fairly complexe Tsql query.

    Here's a sample of the final select :

    id lastid colid coli2 name

    6 6 9 9 compressed

    6 6 10 10 text

    8 8 1 1 status

    8 8 2 2 fileid

    8 8 3 3 name

    8 8 4 4 filename

    9 9 1 1 id

    9 9 2 2 grantee

    As you can see colid2 has the same value as colid (except for some tables that have some deleted columns in them). So even though @LastId is set to the current id, the case statement is evaluated before that set operation is done. This allows for the counter to be reset to 1 (0+1), and therefore giving a new identity seed for each group of columns.

    While this is certainly not something I would use in production because the "default" order by could be changed if the clustered index is changed, or a new index is added coupled with the fact that I can't order by the update without a join to get a garanteed order. This is something I would defenitly use if I had to add something like an item number in a OrdersDetails table (to correct, or create a crappy design I guess ).

    This is just something I wanted to show to demonstrate how much a set based approach can be immensly more powerfull than a procedural approach.

    If you want to compare, run these 2 queries (2nd is what I normally use). The first update statement runs 3 times faster, with 100 times less reads, taking only 18% of the batch!!!!!

    set statistics io on

    dbcc dropcleanbuffers

    go

    Declare @Int as int

    set @Int = 0

    declare @LastId as int

    set @LastId = -1

    Declare @WasParam as bit

    set @WasParam = 0

    update dbo.Test set @Int = Colid2 = case When @LastId <> id THEN 0 ELSE @Int END + 1, LastId = @LastId, @LastId = id

    go

    dbcc dropcleanbuffers

    go

    --close to a procedural approach

    Update T set T.Colid3 = dtColid3.Colid3 from dbo.Test T inner join

    (Select id, name, (Select count(*) as Total from dbo.Test T2 where T1.id = T2.id and T2.colid <= T1.colid) as Colid3 from dbo.Test T1) dtColid3
    on T.id = dtColid3.id and T.Name = dtColid3.Name
    go
    set statistics io off

    Your 2 cents??

  • well you lost me pretty early on there!

     

    i guess that's why i always have the problems, and you always have the answers haha!

  • Where did I lose you?

  • Btw this is the simple exemple... I was successfull at avoiding the problems with udfs. Long story short is that tables functions have 2 sets of columns in SysColumns. One of 'em is for the input parameters, and the other one is for the output (selected) columns. They each get a seed at 1 then increment as you add columns. Not to mention that scalar functions have a return parameter that starts the colid at 0 instead of 1. Needless to say that I had a few case statements in there... so I just decided to remove those lines from the select, keeping the exemple easier to understand (so to speak )

  • You lost me at hello

    I'm following you a little bit now that you explained it. I'm just trying to imagine when I could use it in a real world situation.

  • This is just to expand your mind to ultra set based approach. This is no novice query .

  • You callin me a novice?

    This is a very different approach that the way I usually think about these things. I like thought expanding questions. Keep it up!

  • I said NOT FOR NOVICES.... it took me a fews hours of playing with many different variations to make it work (with the udf problem). It's really not the same way of thinking we're used to .

  • This is one of those scenarios where I think I prefer the definite correctness. And for which I'm very glad about the 'row_number()' function in SQL2005.

    Rob Farley
    LobsterPot Solutions & Adelaide SQL Server User Group
    Company: http://www.lobsterpot.com.au
    Blog: http://blogs.lobsterpot.com.au

  • Like I said, but it's nice to know it's out there...

  • great way to proove there is always more than one way to skin a kat

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Yup... but this sure kicks but compared to all the other ways we have in sql server 2000. Yukon will handle this even easier.

  • Hi Remi,

    This has been great reading. I didn't execute the first code snippet as you said (though I did paste it into QA, seperate and comment it all so I could see what it was doing.....)

    I have always struggled to get my head properly around the set based approach that everyone keeps trapping on about on here and this has been a very good example for me, any more wouldn't go astray

    Keep up all the good work everyone and lets keep this the definitive web site for Sequel Server.

    Enjoy the weekend and most importantly:-

    Have fun

    Steve

    We need men who can dream of things that never were.

  • I had my fun imagining some of you guys scratching your head on this one like I did.

    Still it's nice to see how 2 different set based approach can have such a drastic speed difference... I would like to see the difference if someone was executing this client side with 2 nested loops and doing one update for each line .

    On seconds thaughts... I'd rather not think about it .

  • Nice avatar btw.. did you make it yourself?

Viewing 15 posts - 1 through 15 (of 31 total)

You must be logged in to reply to this topic. Login to reply