Performance Tuning: Concatenation Functions and Some Tuning Myths

  • Ian Crandell (1/9/2008)


    Jeff Moden (1/8/2008)


    We've actually put some "documentation standards" into play at work. They're not so limiting as to suppress innovation, but they do state what must be documented within the code. Basically, it's the way I've always written code because I forget stuff (done... next problem please!). 😉

    I know I have seen some of your commenting in your posts, but could you provide some of the "documentation standards" you use. After working with some of our legacy stored procedures, I would like to try to implement something like that at my office.

    Also, in the article there was a code block that didn't show up properly. It was after

    The next thing folks might try, is a full "covering" index...

    Could you provide what was put there?

    Thanks,

    Ian, my apologies for the delay in this matter... please see the attached. Also, keep in mind that "XXX" throughout the document is where the company name went.

    We were also going to release another minor update, but other priorities got in the way.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (2/12/2008)


    Ian, my apologies for the delay in this matter... please see the attached. Also, keep in mind that "XXX" throughout the document is where the company name went.

    We were also going to release another minor update, but other priorities got in the way.

    What, you mean you aren't just sitting around with nothing to do like I am... (sorry, I just can't say that with a straight face:hehe: ).

    Seriously though, thanks for the document (besides a 1 month turn around for something in IT is really good ;)). I will have to see how much of this I can get implemented at my office. If anything, I will keep it in my toolbox. I see some things that are going to force me to do some learning, which is cool.

    Thanks,

    Ian.

    "If you are going through hell, keep going."
    -- Winston Churchill

  • Cool... thanks for the feedback, Ian.

    The way the standards were created is kind of interesting... after a year at my current company, I saw enough crap and ran into enough problems to literally write a book about it. Every single item in the standards is from an actual problem in someone's code at work. For example, you'll run into a standard about how updates with a join should be written (object of the update must be included in the FROM clause)... that standard came about when an update that should have taken 20 seconds, slammed 4 CPU's into the wall for 2 hours.

    Some standards like the one about not updating an "alias" is meant to do two things... 1 is to force the developers to write code that will never force self recompilation for every row updated and to make it easier to search for tables in SysComments that are being updated because of all the cruddy RBAR everyone had in their code. That's one of the changes we were going to make... we were going to start allowing the object of an Update to be an alias just to simplify some code.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Hello Everyone,

    I'm getting results back in my examples that the function query is actually faster than the XML query.

    The following function query returns in a round trip of 9 seconds.

    SELECT SomeID,

    dbo.fnConcatTest(SomeID) AS CSVString

    FROM dbo.TestData

    GROUP BY SomeID

    The following XML and STUFF() query returns in a round trip of 10 seconds.

    SELECT t1.SomeID,

    STUFF(( SELECT ',' + t2.SomeCode

    FROM dbo.TestData t2

    WHERE t1.SomeID = t2.SomeID

    FOR XML PATH('')),1,1,'') AS CSVString

    FROM dbo.TestData t1

    GROUP BY t1.SomeID

    The following XML and data() query returns in a round trip of 12 seconds.

    SELECT t1.SomeID,

    REPLACE (( SELECT t2.SomeCode AS 'data()'

    FROM dbo.TestData t2

    WHERE t1.SomeID = t2.SomeID

    FOR XML PATH('')), ' ', ',') AS CSVString

    FROM dbo.TestData t1

    GROUP BY t1.SomeID

    Are these queries' times in the same order with your results?

    Regards,

    Ryan

  • What are you using to measure the times with? Typically, I'll use SET STATISTICS TIME ON... the CPU time is the actual time it took sans any display. The Duration includes the time to display.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • On the 1,000,000 row example with clustered and nonclustered index I get 16.7 seconds with this one:

    DBCC FREEPROCCACHE

    DBCC DROPCLEANBUFFERS

    GO

    SET STATISTICS TIME ON

    GO

    SELECT d.SomeID, dbo.fnConcatTest(d.SomeID) AS CSVString

    FROM (

    --==== Derived table "d" finds the unique SomeID's first

    SELECT DISTINCT SomeID

    FROM dbo.TestData

    ) d

    This drops to 15.8 seconds if I replace the DISTINCT with GROUP BY SomeID in the subquery. I get 10.6 seconds with this one:

    DBCC FREEPROCCACHE

    DBCC DROPCLEANBUFFERS

    GO

    SET STATISTICS TIME ON

    GO

    SELECT t1.SomeID,

    REPLACE ((

    SELECT t2.SomeCode AS 'data()'

    FROM dbo.TestData t2

    WHERE t1.SomeID = t2.SomeID

    FOR XML PATH('')), ' ', ',') AS CSVString

    FROM dbo.TestData t1

    GROUP BY t1.SomeID

    And 9.3 seconds with this one:

    DBCC FREEPROCCACHE

    DBCC DROPCLEANBUFFERS

    GO

    SET STATISTICS TIME ON

    GO

    SELECT t1.SomeID,

    STUFF((SELECT ',' + t2.SomeCode

    FROM dbo.TestData t2

    WHERE t1.SomeID = t2.SomeID

    FOR XML PATH('')),1,1,'') AS CSVString

    FROM dbo.TestData t1

    GROUP BY t1.SomeID

    Of course YMMV.

  • I just knew this article would come in handy.

    And you see, it only took 52 days before I got to it. :w00t:

    I used the stuff example, because there was a requirement of

    a devteam to generate dynamic routing-numbers based on the

    contatenated MachineIds of ceveral rows for a certain orderid,

    ordered by a sequence number.

    Thanks again for sharing the knowledge :smooooth:

    Johan

    Learn to play, play to learn !

    Dont drive faster than your guardian angel can fly ...
    but keeping both feet on the ground wont get you anywhere :w00t:

    - How to post Performance Problems
    - How to post data/code to get the best help[/url]

    - How to prevent a sore throat after hours of presenting ppt

    press F1 for solution, press shift+F1 for urgent solution 😀

    Need a bit of Powershell? How about this

    Who am I ? Sometimes this is me but most of the time this is me

  • Thanks for the feedback guys...

    Ryan and Mike C... thanks for the performance comparisons. That's good stuff. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (2/22/2008)


    Thanks for the feedback guys...

    Ryan and Mike C... thanks for the performance comparisons. That's good stuff. 😉

    Sorry to resurrect this - but I came across a caveat the other day with the FOR XML method listed above...

    It tends to "HTMLize" certain characters.

    For example, if your character contains "&" as a character... it gets turned into &amp;. Makes a little sense, since regular XML would do that, but can make for some nasty surprises if you don't expect it. I didn't get a change to try, but I'd imagine it may have trouble with < and > as well.

    Now - you CAN run replace on the result - but that's going to slow stuff down.....

    Caveat Emptor.....

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • That shouldn't be a problem with comma-separated lists of numbers, but for character data it could present a problem. You might be able to use the TYPE directive to avoid this, but I don't have a chance to test it fully right now.

  • Matt: I've run into that very problem on more than one occasion. Sometimes I do a replace() to fix the problem. Sometimes I can change the text to avoid the encoding (?not sure if that's the right term) text.

    Here's another trick. What if you want a line return between each row instead of a comma? You might be tempted to use the following to create a new row:

    SELECT Char(13) + Char(10) + MyColumn ...

    But if you do that, you get this garbeldyguk at the end of each line: "&#x0D;" Ugh. The trick is to do the Char(10) by itself without the Char(13). Then you get line returns without the added text. This may be obvious to people, but it wasn't to me. It took some playing to figure it out. So, I thought I would share.

  • Mike: I'd be interested to know if you ever get the TYPE option to work. I had read about it and tried it, but it never worked for me. Then again, I may have been doing something wrong.

  • JJ B (3/19/2008)


    Mike: I'd be interested to know if you ever get the TYPE option to work. I had read about it and tried it, but it never worked for me. Then again, I may have been doing something wrong.

    I just tried it and it didn't change the outcome 🙁 I'll play around with it later and see if I can get it working. It may end up being a case where REPLACE is required on the 5 entities:

    &lt;

    &gt;

    &quot;

    &apos;

    &amp;

  • Mike: Thanks for looking into it. Glad to know I didn't miss something terribly obvious.

    re: "replace required on certain characters. "

    I would add: & # x 0 D ;

    as characters that may need replacing.

    (without the spaces. I can't figure out how to get this to display as-is. What did you do to get the above encodings to appear as-is?)

  • JJ B (3/20/2008)


    Mike: Thanks for looking into it. Glad to know I didn't miss something terribly obvious.

    re: "replace required on certain characters. "

    I would add: & # x 0 D ;

    as characters that may need replacing.

    (without the spaces. I can't figure out how to get this to display as-is. What did you do to get the above encodings to appear as-is?)

    I usually "con" it by "bolding" one of the characters in the string. by having the tag in the middle of the string - HTML doesn't "kick in " and encode it.

    As in - (x is bolded)

    &#x0D;

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Viewing 15 posts - 61 through 75 (of 82 total)

You must be logged in to reply to this topic. Login to reply