SQL - Derived Tables

  • Thanks for this article. I learned something new and I will be using derived tables in the future. I would not consider myself an expert but I am wondering if you could handle this scenario using a case statement as follows:

    SELECT C.CustomerID,

    C.CompanyName,

    Sum(Case when YEAR(O.OrderDate) = 1996 then 1 else 0 end) AS TotalOrders

    FROM Customers C

    LEFT OUTER JOIN Orders O ON

    C.CustomerID = O.CustomerID

    My questions are:

    1. would this produce the correct results?

    2. are there efficiency/index issues with this approach?

    Thanks for your help

  • Microsoft documentation suggest that both derived tables and temporary table

    will first use memory cache if available.

    It then will use the temp database as needed in both cases depending on available cache.

    I think the advantages and disadvantages are more dependant on other issues

    such as indexing etc than the fact of using the temp database.

    This is a confusing and often misrepresented area of discussion.

    Jim

  • jldsql (1/16/2008)


    Microsoft documentation suggest that both derived tables and temporary table will first use memory cache if available.

    Jim

    Jim... First, I absolutely agree with what you said... But, I also believe that when you're going to blow up someone's article by dropping the "Microsoft documentation" bomb on them, the courteous thing to do is to provide at least one reference URL (or BOL hint) that supports your claim that "Microsoft said"... like I did in my first post on this thread.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Thanks for the article. I know I don't use derived tables enough and now I feel armed to do so.

  • Derived tables are cool as far as being able to write elegant code and reduce the quantity of code. I started using them for these reasons.

    I have been removing them since I realized that they are performance dogs for decent-sized tables. Table variables or temporary tables are much better from a performance aspect.

  • Good article. I use derived table a lot in my query.

  • Interesting and informative.

    However, I was very curious about the last section of the article where an update statement is executed on a derived table. Where are the results stored? How do you get to the data after the update is completed?

    Good articles stir up good discussions as this article confirm. Keep these articles coming!

  • I'm also confused about the update statement...what does it have to do with the rest of the article?

  • Reginald J Ray Jr (1/16/2008)


    Derived tables are cool as far as being able to write elegant code and reduce the quantity of code. I started using them for these reasons.

    I have been removing them since I realized that they are performance dogs for decent-sized tables. Table variables or temporary tables are much better from a performance aspect.

    Have you tested them fully, or just gone by Estimated Cost? Table variables have a problem with the estimated cost in execution plans, and often give a lower number than what they would have if the server knew how many rows they would have before-hand (table variables don't have statistics in them, so they are estimated in cost as if they were 1 row).

    The reason I ask is because in a large number of cases, using Profiler and other tracking and testing tools, I've found that derived tables are sometimes faster and less resource intensive than table variables or temp tables, and sometimes are slower and more resource intensive. Your statement implies that they are always worse. Actual performance seems to depend mostly on indexing and the number of joins being used.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • I am going by how long the procedure or function takes to run, with resources available to the server being basically the same for the 'derived table' run and the 'table variable or temp table' run.

    Replacing derived table(s) with table variables or temp tables has cured lengthy run time problems for me. These problems occurr when the tables accessed have large numbers of records.

    I do add primary keys and / or indexes to the table variable or temp table when possible. This may be a significant part of the performance advantage I have observed.

  • Be careful in your reasoning: clearly in some cases you can replace a temporary table with a CTE and boost performance. This is what you are describing and quite well. However do not jump to blanket conclusions just yet - in many other cases you will dramatically hurt performance when you replace a temporary table with a CTE, and you fail to mention such cases.

    Cheers,

    Alex Kuznetsov, SQL Server MVP.

  • This thread illustrates that there are many ways of writing exactly the same query.

    I find that derived tables are fine up to a point, beyond which temporary tables perform better.

    It isn't an exact science but I find that table variables work well with very small volumes of data, derived tables have similar performance characteristics and tempdb tables have efficiency benefits on large volumes of data.

    The point about the YEAR() function is that a function on a field in the WHERE clause means that SQL cannot use any index on the OrderDate. The query will fall back to row-by-row processing.

    If there is a clustered index on OrderDate then BETWEEN '2006-01-01T00:00:00.000' AND '2006-01-01T23:59:59.997' is very efficient. The funny date format is ISO8601 format and SQL seems to cope with it much more consistently across different installations than other formats.

  • derived table concept is very usefull from seeing performance issues

    it is very good article

    i was using derived table , but my current company not advicing me to use, since here using temp tables

  • Hi,

    thanks for the article, and some comments here were also useful.

    (frankly I was using the derived table in some case in order make sql code a bit clearer, but I didn't know it was a called a "derived" table 😉 )

    for the question or remarks about the update part, some of us still use sql 2k, so knowing that an update is impossible is important to know.

    The sample in the article is simply that: a sample. It has no other value than to test the function and prove the point. (no need to propose a different way to do it :P)

  • Christophe, an update to what? Perhaps a subset of a table where the subset is defined thru a derived table rather than a where clause? Honestly, I'm just trying to understand and move on.

Viewing 15 posts - 16 through 30 (of 34 total)

You must be logged in to reply to this topic. Login to reply