Should I use # tables?

  • This is more a theoretical debate than an actual problem but I have an SP where I reference the same set of records from a table 4 or 5 times during the proc e.g.

    select *

    from table t1

    inner join account acc

    on t1.account_id = acc.account_id

    and acc.column = @parameter

    select *

    from table2 t2

    left outer join account

    on t2.account_id = acc.account_id

    and acc.column = @parameter

    where acc.account_id is null

    (there are other sql statements using the account table)

    The question is whether I should put the set of account records into a temporary table and use the temp table in the SQL statements or not? If I put the rows into a temp table there is one hit on the big accout table and several trawls of the # table (with an appropriate index) or should I hit the big account table (with approriate index) every time?

    Depending on the process I am running there might be 5,000 account records or there might be 500,000 account records. There are about 2 million rows in the account table.

    Jez

  • From your description, the temp table sounds better, providing there are appropriate indexes on it. Depends on tempdb space, activity, memory size, ...

    My normal answer to questions like this is 'Try it both ways, and see which works better' 😀

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Shouldn't you use table variables rather than temp tables? Or are there disadvantages to using table variables?

    From BOL it says the following:-

    table variables used in stored procedures result in fewer recompilations of the stored procedures than when temporary tables are used.

    Transactions involving table variables last only for the duration of an update on the table variable. Thus, table variables require less locking and logging resources

    Pete

    Peter Gadsby
    Business Intelligence Consultant
    www.hgconsult.co.uk

  • the biggest downside to table variables is that indexes cannot be aded to them (apart from a primary key) and that they don't keep data distribution statistics.

    The second is often the bigger downside. The optimiser has no way of knowing how many rows are in a table variable. It estimates 1. If the number of rows is low, that's fine, if the number of rows is high, it leads to very inefficient plans and very poor query performance.

    My normal recomendation is to keep table variables below 100 rows.

    Table variables, like temp tables, are kept in memory where possible, but if they get large or the SQL server is experiencing meory pressure, they can both be flushed to disk. They both have entries in the tempdb system tables and are allocated space within that DB.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • table variable vs #table- possible but you cannot create an index on a table variable.

    But the question remains - should I extract the records I need from a large table into a short lived alternative storage mechanism and then use that in joins rather than going back to main table.

    My testing suggests that it is beneficial but I was trying to canvass others for ideas.

    Jez

    Gail beat me with the table index bit.

  • Would indexed views help?

    I'm more used to creating Data Warehouses, and I would always create a 'real' table during the staging process rather than a temporary table as the volumes are usually quite large.

    Pete

    Peter Gadsby
    Business Intelligence Consultant
    www.hgconsult.co.uk

  • In the past I have used temp tables or table variables in this situation reasoning that it is usually going to be better to hit the big table once. As I have worked longer with SQL Server I would more likely now to try to find a way to write the stored procedure so I only have to hit the account table once without creating a temp table or table variable. Of course this may not be possible. I would think that your later queries of the accounts table would be accessing cached data so it may perform better than the table variable or temp table, if you have to write these to disk.

    Basically I agree with Gail when she said:

    My normal answer to questions like this is 'Try it both ways, and see which works better'

    Jack Corbett
    Consultant - Straight Path Solutions
    Check out these links on how to get faster and more accurate answers:
    Forum Etiquette: How to post data/code on a forum to get the best help
    Need an Answer? Actually, No ... You Need a Question

  • Jez (11/19/2007)


    table variable vs #table- possible but you cannot create an index on a table variable.

    But the question remains - should I extract the records I need from a large table into a short lived alternative storage mechanism and then use that in joins rather than going back to main table.

    It depends. If the effort to recreate this set of data is high enough (or would push one of the queries using said data past the resource point), then yes - it can be beneficial. Of course - the thing to remember is that once you extract them, all of the benefits of the "permanent table" (statistics, PK/FK. indexes) are history unless you build them, so you'd be building queries against heaps, and/or evaluating the cost of having to rebuild those things you need. So the question then becomes whether the overal process is more efficient being run against a "small heap" (extract the data to something temporary and don't index it), a small "indexed set" (meaning - you've now indexed the extract, and all of the effort you put into doing that will be repeated each time), or direct data access in the "big table".

    There's also a 4th option. if this SP won't be run in multiple concurrent threads - you might consider a (permanent) "working table", i.e. a permanent table designed specifically to hold your data temporarily (so the indexing is already done, etc...).

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply