Sub query expressions vs. Joins.

  • Jeff Moden (7/22/2010)


    Maybe that's true in Oracle but it's not true in SQL Server. Oracle is mostly rule based and SQL Server is cost based.

    I'm glad you commented this article, Jeff. Great discussion.

    A minor note on the Oracle thing: it used to be completely rule-based, but now it's entirely cost-based. Starting from version 10g the rule-based optimizer is no longer available.

    -- Gianluca Sartori

  • cmapowers (7/25/2010)


    This seems to be a correlated sub-query. Does this apply to non correlated sub-query?

    All non correlated sub-queries break down to being an inline view whether they're in the form of a derived table in the FROM clause or a CTE reference in the FROM clause) and are typically joined as if a table. So the link to Gail's blog still applies.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Gianluca Sartori (7/26/2010)


    A minor note on the Oracle thing: it used to be completely rule-based, but now it's entirely cost-based. Starting from version 10g the rule-based optimizer is no longer available.

    Thanks for the update on Oracle, Gianluca. It HAS been a while since I've had to work with it at such a level.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden (7/26/2010)


    Gianluca Sartori (7/26/2010)


    A minor note on the Oracle thing: it used to be completely rule-based, but now it's entirely cost-based. Starting from version 10g the rule-based optimizer is no longer available.

    Thanks for the update on Oracle, Gianluca. It HAS been a while since I've had to work with it at such a level.

    :crying: You don't know how lucky you are...

    -- Gianluca Sartori

  • Gianluca Sartori (7/26/2010)


    Jeff Moden (7/26/2010)


    Gianluca Sartori (7/26/2010)


    A minor note on the Oracle thing: it used to be completely rule-based, but now it's entirely cost-based. Starting from version 10g the rule-based optimizer is no longer available.

    Thanks for the update on Oracle, Gianluca. It HAS been a while since I've had to work with it at such a level.

    :crying: You don't know how lucky you are...

    Heh... actually, I do. Better yet, one of the things I get to do at my "new" job is migrate some Oracle databases and related code to SQL Server. 😀

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Thank you.

  • Why not the following?

    SELECT a.EmployeeID,a.Salary a.RActive

    b.ManagerName

    FROM Employees a (nolock)

    ,Managers b (nolock)

    WHERE b.ManagerID = 1

    AND a.ManagerID = b.ManagerID;

  • IG (1/20/2012)


    Why not the following?

    SELECT a.EmployeeID,a.Salary a.RActive

    b.ManagerName

    FROM Employees a (nolock)

    ,Managers b (nolock)

    WHERE b.ManagerID = 1

    AND a.ManagerID = b.ManagerID;

    NOLOCK can return inconsistent data (dirty reads, non repeatable reads and the like), but can also return duplicate data or no data at all for some rows. NOLOCK means that data is not locked at all, even from the query itself, that could end up reading the same row twice in case of a page split.

    If you are ok with inconsistent data, then go on and use NOLOCK, just make sure your users are ok with inconsistent data as well.

    -- Gianluca Sartori

  • I pity any already confused developer who reads this article. As, despite the title, it isn't about sub-queries. What the author refers to are called Derived Tables, Table Expressions (which is where Common Table Expressions get their name from) or sometimes (almost equally confusingly) Inline Views. Any clues - the execution plan has no SEMI-JOIN in it. Refer to Chapter 6 "Subqueries, Expressions and Ranking Functions" of the IBG SS2008 T-SQL Querying book for reference.

    Sorry to be negative, but one of Paul Randall's top myths busted is "sub-queries are always slower". And, this kind of confusing article just adds to the myth.

    But, if it was titled Table Expressions vs Joins :exclamation:

    This DBA says - "It depends".

  • I can honestly say that I have never seen subqueries used as such, these queries have always been of the inner join variety. And judging by the discussion from months ago, this will be something I remember in the future.

    Anyways, I have seen a variation of the "standard" inner join

    SELECT EmployeeID, Salary, RActive, ManagerName

    FROM Employees

    INNER JOIN Managers

    ON Manangers.ManagersID = Employees.ManagerID

    WHERE Employees.ManagerID = 3

    In which the WHERE is eliminated and is replaced with using an AND on the JOIN portion

    SELECT EmployeeID, Salary, RActive, ManagerName

    FROM Employees

    INNER JOIN Managers

    ON Manangers.ManagersID = Employees.ManagerID

    AND Managers.ManagerID = 3

    On larger datasets I have seen better performance with the second version, but on smaller I see no performance advantage.

    I did do a test on a Products:Categories relationship on one of the DBs I have mounted, where there most populated category had about 50 items. I used the subset and the 2 variations of the join on both an '08 box and a 2K instance.

    The results were fairly equal, with the subset at 34% and the joins at 33% apiece. This was consistent between both server instances

    Director of Transmogrification Services
  • After some years of observation I found if 2 queries are similar the optimizer ill process both in a similar way.

    It means its likely there's no difference in performance for most joins x sub-queries scenarios.

    And also it's mean

    If you found differente query plans it's likely the queries are different and that's the case here.

    Put yourself in the compiler place.

    You can read the second query as:

    fetch manager-1 (a single clustered index seek)

    now fetch me all employers managed by manager-1 (40k index seek)

    First query reads as:

    fetch me all employers managed by manager-1 (40k index seek)--same cost as the other way

    and for each fetched employer seek if there's a manager for him and fetch it as well.

    Since its not a strict inner join its possible there are rows where the id is null.

    The compiler still try (futile) to optimize by using a lazy spool.

    I cannot test by now but I guess a FK can improve the subquery.

  • I had to use subqueries for 1-to-many tables where I needed to only return only 1 row for each record (Select top 1 date from InnerTable joined to OuterTable order by date desc) as [LastDate], thereby the results become 1-to-1.

    This was noticable slow; but I don't believe there is an alternative (*??*) with JOINs without either missing out records from the primary table where the is no join or returning too many rows (LEFT JOIN) with a lot of null values from forgein table.

    Might CTE's be the answer instead? I keep meaning to learn about them..

    - Gav.

Viewing 12 posts - 46 through 56 (of 56 total)

You must be logged in to reply to this topic. Login to reply