SQL2000 - Very long query using IN does not generate the same query plan.

  • Hello,

    I have a query using the IN clause with more than 100 items. If this query contain more than 140 items the query plan generated by SQL Server 2000 SP3 is different.

    Another server using the same DB, I can run a query with 250 items in the clause IN and the query plan is the same.

    Does anybody know why?

    Regards,

      Richard.

  • These are hard to track down. The query plan is built based on conditions on the server at the time it's compiled. Have you checked to be sure the indexes are the same. Is the hardware different (substantially).

    What are the loads like on the server?

  • Hi Steve,

    I don't know about the hardware, but I'm really pretty convinced that it is different. Could it be a potential reason?

    As soon as I received the DB Backup, I will be able to give you more details.

    Another strange behavior, the customer told me if ANSI NULL is ON (the default on this DB was OFF) seems to solve this issue.

    I don't see any correlation between different query plan and ANSI NULL settings?

     

    Thanks for your help.

  • NULLS in the IN list can have strange effects.  If you're using "WHERE a IN (SELECT x FROM ...)", and [x] is a nullable field, its a good idea to include "WHERE x IS NOT NULL" in the subquery.

    "WHERE a IN (b,c,d)" is equivalent to "WHERE a=b OR a=c OR a=d".  If b is null, and a is not c or d, "WHERE a IN (b,c,d)" should be false.  But "WHERE a NOT IN (b,c,d)" will also be false.

  • I remember reading somewhere that long IN lists are not efficient, they should be changed to joins.  I don't remember a specific definition of "long", but you might consider it.  I think the idea was that a join can be done in parallel more efficiently than an IN filter.

  • Long INs will tend to build a temporary table in memory and perform a join themselves behind the scenes but that threashold of exactly when it will happen can vary from run to run and machine to machine. If it doesn't then each is treated as a constant value and compared with the column. It sometimes helps to purposely put into a temp table with a specific index and join to that. Might even be able to do better by putting thru a UDF to return a table, but you wouldn't want to do with relatively small lists (which again can vary in size based on hardware) as they wouldn't see a performance increase and quite possibly will see a decrease.

  • The discrepancy could also be caused by differing index statistics between the servers.

    So long, and thanks for all the fish,

    Russell Shilling, MCDBA, MCSA 2K3, MCSE 2K3

Viewing 7 posts - 1 through 6 (of 6 total)

You must be logged in to reply to this topic. Login to reply