Comparing Table Variables with Temporary Tables

  • RBarryYoung (6/11/2009)


    What #temp table overhead are you talking about? One of the things that Wayne's article reveals is that temp tables and table variables have *almost* identical overhead.

    The real question isn't "at what point does the cost of the recompilations outweigh the overhead of #temp tables"; rather, it's "to what extent does using inaccurate statistics instead of no statistics compensate for the cost of recompiling?"

    [font="Verdana"]I was wondering that too. I can understand the overhead of table variables (no statistics, no parallelisation.)

    By the way, a very nice little article, Wayne. Well done. I'd love to see some detailed discussion of how to use the table type in SQL Server 2008, and of passing table variables around as parameters versus (say) passing the same data row by row or as an XML string, along with the performance implications. That's not a criticism or limitation of your article; it's more that your article whetted my appetite. 🙂

  • Tim Walker (6/10/2009)


    ...what is the point in defining a unique constraint on a table variable? Won't the 'optimized assuming one row in a table variable' render this unnecessary because the optimizer will consider the value in the column is going to be unique anyway?

    Without a unique index the optimizer estimates one row. With a unique index, the query optimizer (QO) will still estimate one row, but it now knows that any rows produced will be unique. This may be useful when optimizing the remainder of the plan.

    The considerations for using a PRIMARY KEY or UNIQUE constraint on a table variable are therefore broadly similar to those used with 'real' tables. Use a UNIQUE constraint if (for example) a column or combination of columns is required to be unique, or the fact that it is unique will help the optimizer pick a good plan, or the enforcing index is useful as an access path later on.
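    A minimal sketch of what such a declaration might look like (the @Orders table and its columns are invented for illustration, not taken from the article):

    ```sql
    DECLARE @Orders TABLE
    (
        OrderID  int          NOT NULL PRIMARY KEY,  -- enforcing index doubles as an access path
        OrderRef varchar(20)  NOT NULL UNIQUE,       -- QO now knows OrderRef values are unique
        Amount   decimal(9,2) NOT NULL
    );
    ```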

    Tim Walker (6/10/2009)


    I've noticed that when more than one multi-row table variable is involved in a query the optimizer tends to table scan one and repeatedly table scan the other. This can get really expensive in CPU and is best avoided.

    It depends. Yes, the QO will almost always choose nested loops as the join operator when a table variable is involved. This is even more likely if the join is between two or more table variables.

    In most circumstances this is a good choice: when the number of rows is small, or when the number of rows on the outer side of the join is relatively small and there is a useful index on the inner side. In the indexed case, performance can still be good for larger sets if the index is covering, if the key/RID order correlates with the driving index order, or if prefetching is used.

    If it is known beforehand that the characteristics of the table variables would likely suit a hash or merge join better, this can be forced with a join hint (e.g. INNER HASH JOIN) or query hint (e.g. OPTION (MERGE JOIN)), though in many cases switching to temporary tables will be the better option.
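    For illustration, the two hint forms mentioned above might look like this (the names @t1 and @t2 are invented; note that a join hint also enforces the join order for the query):

    ```sql
    DECLARE @t1 TABLE (id int NOT NULL PRIMARY KEY, val int NOT NULL);
    DECLARE @t2 TABLE (id int NOT NULL PRIMARY KEY, val int NOT NULL);

    -- Join hint: forces a hash join here (and fixes the join order too).
    SELECT t1.id, t2.val
    FROM @t1 AS t1
    INNER HASH JOIN @t2 AS t2
        ON t2.id = t1.id;

    -- Query hint: applies the requested join type to every join in the statement.
    SELECT t1.id, t2.val
    FROM @t1 AS t1
    JOIN @t2 AS t2
        ON t2.id = t1.id
    OPTION (MERGE JOIN);
    ```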

    There are scenarios where using a table variable with a larger number of rows can be useful. One example would be in a complex query where the statistics available on a temporary table would lead the QO to produce a sub-optimal plan, maybe because of some characteristic of the data which is obvious to the DBA but not clear from the statistics. Sometimes, using a table variable here can give the QO the 'hint' that few rows are expected, leading to a better plan.

    This last example can be a way to simulate a local 'row goal', such as one might have attempted in the past with the FASTFIRSTROW table hint. That hint is now deprecated and is documented as equivalent to OPTION (FAST 1), which is not the same thing at all: the query hint sets a row goal for the whole statement. Using a table variable can be a cool way to help the QO understand that part of a plan is expected to produce few rows, or should be treated as such for optimization purposes.
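    A self-contained sketch of the contrast (all names invented): the table variable gives a local 'few rows expected' signal, while OPTION (FAST 1) sets a row goal for the whole statement:

    ```sql
    DECLARE @few TABLE (id int NOT NULL PRIMARY KEY);

    INSERT @few (id)
    SELECT object_id FROM sys.objects;  -- stand-in for the 'few rows' part of a plan

    -- Optimized as if @few held one row, however many rows it actually holds:
    SELECT o.name
    FROM @few AS f
    JOIN sys.objects AS o
        ON o.object_id = f.id;

    -- The query-wide alternative: the whole plan is costed for fetching one row quickly.
    SELECT o.name
    FROM sys.objects AS o
    OPTION (FAST 1);
    ```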

    Cheers

    Paul

  • sscddr (6/11/2009)


    The article should discuss the wonders of "option (recompile)".

    It would be a useful addition, yes. Of course, OPTION (RECOMPILE) is not a panacea: recompiling on every execution may not be desirable. Often, a well-placed hint will do the job just as well.
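    For example, something along these lines (names invented): the statement-level recompile lets the optimizer see the table variable's actual row count at execution time, though still without column statistics:

    ```sql
    DECLARE @ids TABLE (id int NOT NULL PRIMARY KEY);

    INSERT @ids (id)
    SELECT object_id FROM sys.objects;

    SELECT o.name
    FROM @ids AS i
    JOIN sys.objects AS o
        ON o.object_id = i.id
    OPTION (RECOMPILE);  -- recompiled on every execution: the cost mentioned above
    ```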

    sscddr (6/11/2009)


    It fixes almost all query plans with large table variables without having to use "force order". We use table variables for nearly everything.

    Really? Everything in moderation, I say! Sure, there are times when table variables are a good choice; other times a temporary table will be better. It depends. Consider the potential benefits of statistics, behaviour under rollback, and parallelism, for example. If used in a procedure, traditional temporary tables can also be cached with the plan (one data page and one IAM page), which means that they are not actually 'created' as such after the first call (terms and conditions apply).
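    A minimal sketch of the caching point (the procedure and its contents are invented; among the 'terms and conditions' are that the temp table is not altered after creation and has no named constraints):

    ```sql
    CREATE PROCEDURE dbo.DemoTempCache
    AS
    BEGIN
        -- Eligible for caching: created once, no DDL afterwards, unnamed constraints only.
        CREATE TABLE #work
        (
            id  int NOT NULL PRIMARY KEY,
            val int NULL
        );

        INSERT #work (id, val) VALUES (1, 10), (2, 20);

        SELECT id, val FROM #work;
    END;
    GO

    EXEC dbo.DemoTempCache;  -- first call creates the temp object
    EXEC dbo.DemoTempCache;  -- later calls can reuse the cached object instead of recreating it
    ```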

    sscddr (6/11/2009)


    We now only use "force order" to stop the occasional "lazy spool" (wish I could disable that "feature" globally).

    You wouldn't want to do that. If you think lazy spools are always bad, you are missing the point.

  • Paul White (6/11/2009)


    ... Everything in moderation I say! ...

    Everything except moderation, I presume? πŸ˜€

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung (6/11/2009)


    Paul White (6/11/2009)


    ... Everything in moderation I say! ...

    Everything except moderation, I presume? πŸ˜€

    Gah! πŸ˜€

  • I mean, I'm all for moderation, but let's not get carried away! πŸ™‚

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • [font="Verdana"]So you're saying everything in moderation, including moderation?[/font]

  • Bruce W Cassidy (6/11/2009)


    [font="Verdana"]So you're saying everything in moderation, including moderation?[/font]

    Gah! πŸ˜€

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • Amazing, Outstanding, Awesome

    What else is there to say? Maybe one thing… a trick you can use to add a unique index on fields that are not unique (sketched in code below).

    1. Add an identity column

    2. Add a unique constraint on the fields you want to index, with the identity column as the last field.

    Forgive me if it was already in the article; it was, after all, a very thorough one. 🙂
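    A minimal sketch of the trick (all names invented): NonUniqueCol alone allows duplicates, but (NonUniqueCol, RowID) is always unique because RowID is an identity column, so a UNIQUE constraint and its enforcing index can be declared on the combination:

    ```sql
    DECLARE @t TABLE
    (
        RowID        int NOT NULL IDENTITY(1, 1),  -- step 1: add an identity column
        NonUniqueCol int NOT NULL,
        UNIQUE (NonUniqueCol, RowID)               -- step 2: identity as the last field
    );

    INSERT @t (NonUniqueCol) VALUES (1), (1), (2); -- duplicate values are accepted
    ```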

  • actually...

    "Moderation in all things, especially moderation."

    β€” Ralph Waldo Emerson

  • Andy Lennon (6/12/2009)


    actually...

    "Moderation in all things, especially moderation."

    β€” Ralph Waldo Emerson

    Yeah, but Emerson was a radical moderate.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

  • RBarryYoung (6/12/2009)


    Andy Lennon (6/12/2009)


    actually...

    "Moderation in all things, especially moderation."

    β€” Ralph Waldo Emerson

    Yeah, but Emerson was a radical moderate.

    this is getting surreal.

    i'd always considered him more of a moderate radical. :hehe:

  • Andy Lennon (6/12/2009)


    this is getting surreal.

    Fish.

  • Paul White (6/12/2009)


    Andy Lennon (6/12/2009)


    this is getting surreal.

    Fish.

    Mmm, not really my cup of fur.

    [font="Times New Roman"]-- RBarryYoung[/font], [font="Times New Roman"] (302)375-0451[/font] blog: MovingSQL.com, Twitter: @RBarryYoung[font="Arial Black"]
    Proactive Performance Solutions, Inc.
    [/font]
    [font="Verdana"] "Performance is our middle name."[/font]

    I'm trying to get some clarification as to where that row estimate comes from. It's not from statistics: no auto-stats events are fired, and no statistics are visible in tempdb.

    --------------------------------------------------------------------------------

    Gail Shaw

    Gail, this is a shot in the dark, but have a look at the relationship between the actual row count (17,134) and the estimate (5,623). The ratio (5,623 / 17,134 ≈ 0.33) is roughly 30%, which corresponds to the selectivity guess for non-equijoins when there are no statistics. My guess is that the normal 10% guess for equijoins has been modified for the table variable case; if so, it would be fascinating to see the test cases that led to that decision. Maybe the modified magic number could tip the optimiser away from necessarily choosing a loop join and towards a hash join when the sheer number of rows justifies it.

    -- Ewald Cress
