NOT IN condition

  • Hi all,

    We have a query that does the following:-

    INSERT INTO DC_CLAIMS

    (CLAIM_SK)

    SELECT

    DISTINCT CLAIM_SK

    FROM

    DIM_CLAIMS A2

    WHERE

    CLAIM_SK not in (SELECT CLAIM_SK from GL_CLAIMS A1 )

    This query does NOT return any rows to the DC_CLAIMS table. I expected some data to be inserted.

    When I modified the query as follows:-

    INSERT INTO DC_CLAIMS

    (CLAIM_SK)

    SELECT

    DISTINCT CLAIM_SK

    FROM

    DIM_CLAIMS A2

    WHERE

    not exists(SELECT 1 from GL_CLAIMS A1 WHERE A1.CLAIM_SK = A2.CLAIM_SK)

    I get rows inserted... Does anyone know why this is happening?

    Thanks

    Pete

    Peter Gadsby
    Business Intelligence Consultant
    www.hgconsult.co.uk

  • I'm not sure, but I'll bet it's obvious. 😀

    Anywho, I believe this would be more efficient.

    INSERT INTO DC_CLAIMS (CLAIM_SK)

    SELECT DISTINCT

    CLAIM_SK

    FROM

    DIM_CLAIMS A2

    LEFT JOIN GL_CLAIMS A1

    ON A1.CLAIM_SK = A2.CLAIM_SK

    WHERE

    A1.CLAIM_SK IS NULL

    ______________________________________________________________________

    Personal Motto: Why push the envelope when you can just open it?

    If you follow the direction given HERE[/url] you'll likely increase the number and quality of responses you get to your question.

    Jason L. Selburg
  • Thanks for that Jason... But the query that runs doesn't take that long to run. What I was worried about was that the insert with the NOT IN clause was not returning any rows... The subquery should return 500,000 ish rows so I'm assuming that for some reason SQL Server has failed to store the subquery data in memory. The weird thing is it didn't return any errors when the query is run.

    Anyone else have any ideas why this is not running correctly?

    Pete

    Peter Gadsby
    Business Intelligence Consultant
    www.hgconsult.co.uk

  • Your queries are running correctly, NOT IN and NOT EXISTS are not the same when there are NULLs involved and ANSI_NULLs are on. (Your data must have NULLs in the GL_CLAIMS.CLAIM_SK column.)

    Here is an article that tries to explain the three-value logic involved.

    http://weblogs.sqlteam.com/mladenp/archive/2007/05/18/60210.aspx

  • Thanks Ken, that explains it... Although after reading the article I still don't understand why one row in the table has nulls then the whole set of data is ignored.... Does Oracle behave in the same way?

    Peter Gadsby
    Business Intelligence Consultant
    www.hgconsult.co.uk

  • Let's see if we can define it a little differently:

    Using an example syntax of ....

    X in (list)

    ...., the IN syntax compares X to each value and returns TRUE if any of the value equals. Now - if X is in fact in the list, then all is good, since there is at least one item in the list where X=item. If it's NOT there, and the list has no NULL values, then the IN returns FALSE. If the list has a NULL value, then the IN syntax returns a value of UNKNOWN (neither TRUE nor FALSE).

    Now - when you throw in the NOT, then your scenario returns NOT UNKNOWN (which interestingly is also UNKNOWN). For better or for worse, NOT UNKNOWN is not the same as TRUE, so the criteria fails.

    If you want the NOT IN syntax to work, simply make sure that the select doesn't return any NULL values, such as:

    INSERT INTO DC_CLAIMS

    (CLAIM_SK)

    SELECT

    DISTINCT CLAIM_SK

    FROM

    DIM_CLAIMS A2

    WHERE

    CLAIM_SK not in

    (SELECT CLAIM_SK from GL_CLAIMS A1

    where CLAIM_SK is not null --here's the update

    )

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • Thanks Matt.. I had a little google moment to see if anyone else had the issue in Oracle world, and came up with this web page which really does explain to me how it works...

    http://jonathanlewis.wordpress.com/2007/02/25/not-in/

    Thanks for your help, basically I'm going to avoid NOT in and use NOT EXISTS instead where possible.

    Cheers

    Pete

    Peter Gadsby
    Business Intelligence Consultant
    www.hgconsult.co.uk

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply