JOIN question

  • Hi there,

    I have a question about join statement. The following are the different sql statements which I modified based on my text book for my experiments.

    -- old style OUTER JOIN

    SELECT o.name 'table', ISNULL(i.name, '-') 'clustered index'

    FROM sysobjects o, sysindexes i

    WHERE i.id =* o.id

    AND i.indid = 1

    AND o.type = 'U'

    -- subquery style

    SELECT o.name 'table', ISNULL((SELECT name FROM sysindexes i WHERE i.id = o.id AND i.indid = 1), '-') 'clustered index'

    FROM sysobjects o--, sysindexes i

    WHERE o.type = 'U'

    -- ANSI style (Recommeded style from my book)

    SELECT o.name 'table', ISNULL(i.name, '-') 'clustered index'

    FROM sysobjects o

    RIGHT JOIN sysindexes i ON o.id = i.id

    WHERE i.indid = 1

    AND o.type = 'U'

    The problem is that the ANSI style statement doesn't return the same result as the top 2 statements. What's wrong with the 3rd statement? How do I modify it to return same as the top 2? Anyone can help?

    Thanks,

    AC

  • I think you want you last query to look something like this. Basically you want to return all sysobjects of type = 'U' (left join with where o.type = 'U') and if they have a clustered index return the index name (i.id = o.id and i.indid = '1'):

    SELECT o.name , 'table', ISNULL(i.name, '-') 'clustered index'

    FROM sysobjects o

    left JOIN sysindexes i ON i.id = o.id and i.indid = '1'

    WHERE

    o.type = 'U'

    Gregory Larsen, DBA

    If you looking for SQL Server Examples check out my website at http://www.geocities.com/sqlserverexamples

    Gregory A. Larsen, MVP

  • My Question is why it is not showing the tables which does not have indexes, if i.indid = '1' is in the where clause.

    I might be wrong but to my understanding if i.indid = '1' is in the Join condition or in the where clause it should give the same results.

  • No, it will give different results. This is something simonsabin commented on in another thread. When you put a condition in the WHERE clause that references a column in the outer table of an outer join, then the query engine will convert the query plan to an inner join, not an outer join. Instead, you should place conditions that refer to the outer table into the JOIN clause.

    The reasoning behind this is simple. Consider the basic process of executing a query - conceptually (ignore the fact that physically, SQL Server might follow a slightly different process). First, your data is selected from the tables specified in the from clause. Then, the data is joined, based on whatever conditions you specify. Then, the data is filtered, based on whatever you specify in the WHERE clause. Then - ignoring cases where you have aggregate clauses - the data is returned (projected) according to your SELECT clause. Now, let's say you perform an OUTER JOIN, and in your WHERE clause you specify a condition that references the outer table. So first, the query engine gets all of the data from the source tables. Then it performs your outer join; if any rows in the inner table have no matching rows in the outer table, based on the join conditions, then the columns of the outer table are all filled with the value "NULL". Next, your WHERE condition is applied. Here's the part where the query engine is being smart. Since your WHERE condition applies to a column of the outer table, none of those rows that represent non-matched inner rows, i.e. the rows that contain all NULLs, which were generated when the join could find no row in the outer table that matched the row in the inner table, none of those rows will be in the result set, because none of them will match the conditions of your WHERE clause. Consequently, the work that was done to create those rows was superfluous - it didn't need to happen. Thus, this query is identical to what you would get had you used an INNER JOIN; there was no reason to use the OUTER JOIN statement. The query optimizer recognizes queries like these, and rather than perform an OUTER JOIN, followed by an expensive FILTER operator that's filtering a bunch of extra rows, the optimizer replaces the OUTER JOIN with an INNER JOIN operation, since the results will be the same.

    Now clearly, this isn't what you intended, but conceptually, this is what you programmed when you placed a reference to the outer table in your WHERE clause. To properly express what you intended to do, you have to place the condition into your JOIN clause. There are exceptions to this rule. For example, one popular method of finding rows that don't have matching rows in some other table is to perform an OUTER JOIN on the two tables and then, in the WHERE clause, look for rows that have a value of NULL in the outer table, such as:

    
    
    SELECT a.*
    FROM a LEFT OUTER JOIN b
    ON a.id = b.id
    WHERE b.id IS NULL

    In this case, the optimizer will continue to use an outer join; it recognizes what you are trying to accomplish with such a statement, and it is possible for rows that were generated with null values as a result of the outer join could exist in the result set. This also illustrates the fact that -conceptually - we recognize that the conditions of the WHERE clause are met after the join has been performed. If you keep this rule in mind, it should help you to determine where to place a condition: in the JOIN or in the WHERE clause. Place it in the JOIN clause if you expect it to be met before the tables are joined, i.e. if you expect it to be a condition of the rows that are returned from the source table. Place it in the WHERE clause if you expect it to be met after the join has occurred, i.e. if you expect it to be true of the results that appear in the result set. Obviously, under most situations these two situations overlap, since the majority of queries are inner joins and in those cases it doesn't matter too much where you place the condition, but in outer joins, as has been seen, it CAN make a difference.

    Matthew Burr

    Matthew Burr

  • Thankyou Matthew for your input.It is very clear.

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply