Where Clause

Question

Post reply

Where Clause

a1d0_d3v111a

Newbie

Points: 9
More actions
October 25, 2002 at 3:30 am

#101285

I have two tables in my database. The first table has rows and the second

table is empty.

I want to return all rows from t1 where the t1.id is not existing from t2,

when I try this statement

SELECT t1.* FROM t1, t2 Where t1.id <> t2.id no result was returned to me. I

know that I could use

SELECT t1.* FROM t1 WHERE t1.id NOT EXIST (SELECT t2.id FROM t2) but i don't

like to use

the NOT EXIST clause. Is there any other way in doing this?

aldo

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

paulhumphris Hall of Fame Points: 3496 More actions · Answer 1

paulhumphris

Hall of Fame

Points: 3496

October 25, 2002 at 3:37 am

#439434

Instead of the EXISTS clause, have you considered the IN clause.

SELECT t1.*

FROM t1

WHERE t1.id not in(SELECT t2.id FROM t2)

Antares686 SSC Guru Points: 125444 More actions · Answer 2

Another way which is set based is to use a LEFT JOIN

SELECT t1.*

FROM

t1

LEFT JOIN

t2

WHERE t1.id = t2.id

WHERE

t2.id IS NULL

With the left join when there is no match the output will generate a NULL.

This also has a better performance since only one data read occurrs to t2 as opposed to an NOT IN or NOT EXISTS subquery which does for each row output by T1.

"Don't roll your eyes at me. I will tape them in place." (Teacher on Boston Public)

Nicholas Wang Default port Points: 1463 More actions · Answer 3

This method is great! LEFT OUTER JOIN, when there's no join on the right side, NULL is returned... no need to use <> or NOT.

Slight mistake on the Left Join though...

SELECT t1.*

FROM t1

LEFT OUTER JOIN t2

ON t1.id = t2.id -- ON instead of WHERE

WHERE t2.id IS NULL

BTW aldo, this is the new syntax, u can also write it the old way:

SELECT t1.*

FROM t1,t2

WHERE t1.id *= t2.id

AND t2.id IS NULL

mdburr Default port Points: 1483 More actions · Answer 4

Just a comment on the NOT EXISTS clause. You might want to reevaluate how SQL 2000 optimizes this clause; it is much more intelligent and can understand what you are attempting to do with this clause. Often times, it will generate query plans that are identical to the plans for other approaches that are usually considered more economical.

Additionally, you need to be careful with the LEFT OUTER JOIN approach. Although it is a very popular approach, you need to make sure that you check for NULL values in the correct columns, which the examples in this thread illustrate; namely, you need to check for a NULL value in one of the JOIN columns of the outer table. If you check for a NULL value in a column that is not one of the columns used for joining the tables, you may experience errant results; so, be cautious on that matter.

Finally, you may experience better results using the NOT EXISTS statement. Consider this:

With NOT EXISTS, the query optimizer can throw out records that don't meet the specified criteria as it's doing a join between the two tables. For example, the query optimizer will likely use a NESTED LOOP JOIN operation, first selecting from the inner table, then stepping through each row and using (hopefully - if you've done your indexing correctly) an INDEX SEEK to try and select matching rows from the OUTER TABLE. If it finds a matching row, then it knows that the row from the INNER TABLE does not meet the NOT EXISTS criteria and it can immediately remove that row from the result set. Once it has gone through this join, it has found all of the correct results and can return them immediately (unless you've added additional conditions that need to be filtered, aggregations, ordering, or other query operations).

Now consider the LEFT OUTER JOIN approach. It too will select data from the INNER TABLE, and then it too could very likely use a NESTED LOOP JOIN, just like the query using NOT EXISTS. In this case, though, it will keep all of the data from the INNER TABLE, and join all matching rows of the OUTER TABLE. Then, after it has completed the JOIN, it will go through an additional step, working on that result set, and filtering the results based on the criteria that a specific column is NULL. So, there is an additional step involved. Consequently, this may yield slower results than using NOT EXISTS.

Here's an example that - at least on my SQL 2000 server - illustrates this. Look at the query plans for the two SELECT queries at the end:



CREATE TABLE Test1

(a int not null)



CREATE TABLE Test2

(a int not null)



INSERT INTO Test1 VALUES (1)

INSERT INTO Test1 VALUES (2)

INSERT INTO Test1 VALUES (3)

INSERT INTO Test1 VALUES (4)



INSERT INTO Test2 VALUES (1)

INSERT INTO Test2 VALUES (2)

INSERT INTO Test2 VALUES (3)



CREATE INDEX Test1_idx ON Test1 (a)

CREATE INDEX Test2_idx ON Test2 (a)



SELECT *

  FROM Test1

 WHERE NOT EXISTS (SELECT *

			FROM Test2

		      WHERE Test1.a = Test2.a)



SELECT Test1.*

  FROM Test1

  LEFT OUTER JOIN

	Test2

    ON Test1.a = Test2.a

 WHERE Test2.a IS NULL

Although, with this small amount of data, the performance difference is negligible (a side-by-side evaluation of the query plans shows that they both take the same amount of time), with a large amount of data, that additional FILTER operation could contribute substantially to execution time and CPU utilization.

Matthew Burr

Simon Sabin SSCrazy Eights Points: 8142 More actions · Answer 5

Further more if the t2 can have more than one row for the an id value then you have to use the not exists method. This is an area where you can't say do it one way or another, because it hugely depends on the data involved, the indexing.

Simon Sabin

Co-author of SQL Server 2000 XML Distilled

http://www.amazon.co.uk/exec/obidos/ASIN/1904347088

Simon Sabin
SQL Server MVP

http://sqlblogcasts.com/blogs/simons