the opposite of join?

  • If I use an INNER JOIN to gather records from table A that have corresponding records in table B, what would I use to get records in table A that DON'T have corresponding records in table B?

    --Paul

  • Have you tried using a LEFT or RIGHT JOIN with a WHERE eliminating all the values where the two table join.  Your resultset would be something like this

    TABLEA TABLEB
    Name Name
    FOO NULL
    BOO NULL
    ABC NULL

    You ANSI SQL would be something like the following:

    SELECT TABLEA.name,TABLEBName

    FROM tablea LEFT JOIN tableb

     ON tablea.name = tableb.name

    WHERE tableb IS NULL

    You'll need to play around with the SQL

    Thanks,

  • For the same result you could try this:

    select *

    from tableA

    where not exists (Select Field1

                           from  TableB

                           where Field1=TableA.Field1)

  • You could use "not exists", until you compare query plans and execution time, and then you'd go back to using an OUTER JOIN...

  • Do you have anything specific to back that up with? For instance, take a look at this old thread where the same thing was discussed:

    Minus in SQL Server

  • From personal experience, outer joins mostly give better performance.

    I can't recall where it has been worse - the same, yes, but not worse.

    Indexing, output columns (from one table or both), overall and relative row counts, other filters in the where clause, and other factors all affect the query plan and hence performance.

    SQL Server 2000 unleashed, page 626 for an alternative view.

     

  • Paul,

    An INNER JOIN is always left to right. So the opposite of:

    FROM TableA INNER JOIN TableB

    is:

    FROM TableB INNER JOIN TableA.

    -SQLBill

  • I am still not sure. But anyway, the only correct thing to do is to try a few different alternatives and choose the one that performs the best.

    I do not have SQL Server 2000 Unleashed so I cannot check what it says there.

  • Uhm, I would say those two are equivalent. Are you saying they would produce different results?

  • I have never observed a LEFT JOIN checking for NULL behave any differently than a NOT EXISTS in terms of query plans.

    What I have observed are some nasty insidious bugs caused by using LEFT JOIN in code that is later modified.

    All it takes is someone to modify the code at a later date to add an additional filter to the LEFT JOIN'ed table and all of a sudden you have SQL server doing an implicit INNER JOIN behind your back and changing the resultset.

    eg:

    SELECT a.name, b.name

    FROM tablea  As a

    LEFT JOIN tableb As b

       ON a.name = b.name

    WHERE b.name IS NULL

    -- Joe the new hire who doesn't know SQL joins very well is asked

    -- to modify the code to add an additional check on tableb

    AND     b.SomeOtherColumn = 'SomeFilter' -- Bug!!

    If the requirement is to Select some rows where they don't exist somewhere else, using the NOT EXISTS makes the code more readable, makes the code's intention more easily understood and prevents bugs like above from creeping in.

     

     

     

  • Not if it's written as:

    SELECT a.name, b.name

    FROM tablea As a

    LEFT JOIN tableb As b

    ON a.name = b.name

    WHERE b.name IS NULL

    -- Joe the new hire who doesn't know SQL joins very well is asked

    -- to modify the code to add an additional check on tableb

    AND ( b.SomeOtherColumn = 'SomeFilter' or b.SomeOthercolumn is null)

    You wrote:

    If the requirement is to Select some rows where they don't exist somewhere else, using the NOT EXISTS makes the code more readable, makes the code's intention more easily understood and prevents bugs like above from creeping in.

    It *can* make the query more readable, but not if it's buried within a bunch of other stuff like other subqueries, if the correlated subquery is a PITA to write, any parts of either the outer or subquery change dramatically, etc...

    I did have a query in Oracle that used Outer Joins that was a dog (took a couple of minutes to run), that when the outer join was replaced with a NOT EXISTS, ran in about 2 seconds. The app's developers were happy...

  • Thanks for proving my point. In the example below, you get the wrong resultset using a LEFT JOIN and a filter in the WHERE, even if you do phrase it as "AND ( b.SomeOtherColumn = 'SomeFilter' or b.SomeOthercolumn is null)" :

    Declare @TableA Table

    (

      KeyColumn int not null,

      SomeData varchar(50)

    )

    Declare @TableB Table

    (

      KeyColumn int not null,

      SomeCode char(1) not null

    )

    Insert into @TableA

    Select 1, 'row 1' union

    Select 2, 'row 2' union

    Select 3, 'row 3' union

    Select 4, 'row 4'

    Insert into @TableB

    Select 1, 'A' union

    Select 1, 'B' union

    Select 2, 'A' union

    Select 2, 'C' union

    Select 3, 'B' union

    Select 4, 'D'

    -- Requirements: Show me all rows in TableA, where there is

    -- no row in TableB with a code of B

    --

    -- Expected result: Rows 2 and 4

    SELECT a.KeyColumn

    FROM @tablea As a

    LEFT JOIN @tableb As b

      ON a.KeyColumn = b.KeyColumn

    WHERE b.KeyColumn IS NULL

    AND ( b.SomeCode = 'B'  or b.SomeCode is null ) -- BUG!!

    SELECT a.KeyColumn

    FROM @tablea As a

    WHERE NOT EXISTS (

      SELECT * FROM @TableB As b

      WHERE a.KeyColumn = b.KeyColumn

      AND b.SomeCode = 'B'

    )

  • SELECT a.KeyColumn

    FROM @tablea As a

    LEFT JOIN (SELECT * FROM @tableb WHERE b.SomeColumn = 'B') b

    ON a.KeyColumn = b.KeyColumn OR b.KeyColumn IS NULL

    Use this (I have no SQL box to test on).

    This is a known feature of outer joins.

    http://support.microsoft.com/kb/176480/en-us

    Using the derived table reduces the scope for error.

    IIRC (from an Itzak Ben Gan presentation and elsewhere), MSSQL may apply the WHERE or JOIN in the order it thinks best which is why this kind of thing happens..

    Also, see http://www.sql-server-performance.com/tuning_joins.asp and search for "If you have the choice of using a JOIN or a subquery"

    PW said "I have never observed a LEFT JOIN checking for NULL behave any differently than a NOT EXISTS in terms of query plans" - I have. What I haven't seen is it where it is *worse*... a few rows in the subquery = OK, but with 1000s on both sides, it can change.

    Try the previous post code with 1000s of rows in each and test the NOT EXISTS and my re-written code above. Is it worse? Is it better?

  • >>SELECT a.KeyColumn

    >>FROM @tablea As a

    >>LEFT JOIN (SELECT * FROM @tableb WHERE b.SomeColumn = 'B') b

    >>ON a.KeyColumn = b.KeyColumn OR b.KeyColumn IS NULL

    That returns all four rows. Which is incorrect.

     

  • Good point. I also agree that using NOT EXISTS often makes the queries more readable. Furthermore, I think the most readable query in this example is one using the new EXCEPT keyword from SQL Server 2005. And as I wrote in the other thread I referred to, that query uses the exact same execution plan as the one using NOT EXISTS.

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply