Confusing query plan

  • Hi all,

    This is a cross-post of sorts. I've posted this to microsoft.public.sqlserver.server, but I suspect I have a different audience here, so posting here as well.

    I have an optimizer question. Ever since upgrading to SP3 (SQL 2000 Ent Ed), it seems like we've had to write more and more query hints into our stored procs so that SQL Server uses the "appropriate" indexes on our tables. After debugging several performance problems, I've noticed a pattern: many of them are caused by the query doing a table scan (or clustered index scan) of some large table.

    I know the problem could result from all kinds of things, so I spent some time yesterday eliminating as much as I could, ending up with the simple script below. It creates an indexed table with 10000 rows, then queries against the data. The first set of queries uses variables (in our real code these would likely be parameters passed to the proc), and the second set uses hardcoded values. In both cases the values are the same, and in all cases the queries return no rows. What I don't understand is why SQL Server does a table scan (28 reads) for the queries using variables, but an index seek (2 reads) for the hardcoded values.

    The real-world query I started with yesterday was searching a large table by date (indexed on the date column), but the same thing happens even with a simple int column. I haven't tried it on a char column. Even though the date range in our live query was giving <2% selectivity, it still did a table scan. In my script below, I purposely tried 0% selectivity, but no luck there either.

    One thing I found that makes a difference is table size -- if the table contains only 1000 rows instead of 10000, all the queries do an index lookup. (Seems kind of counterintuitive as well; queries on larger tables should be more likely to use the indexes, not less likely <g>.) In fact the cutoff is 7340 rows ... index lookup with 7340, table scan with 7341. Bizarre.

    Our workarounds so far have been to either add a query hint to force usage of the index (which I generally hate to do), or turn the query into dynamic SQL so that the values are "hardcoded", e.g. exec('select * from #Testing where RowID > ' + cast(@RowID as varchar(5))) << I know this doesn't work as-is, but it gets the point across 🙂
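    For concreteness, here's a rough sketch of both workarounds against the #Testing table from the script below (same index name). Note that exec() doesn't allow function calls like cast() inside the parentheses, so the dynamic string has to be built in a variable first:

    -- workaround 1: hint the index explicitly
    declare @RowID int
    set @RowID = 99999
    select * from #Testing with (index(IX_Testing_RowID)) where RowID > @RowID

    -- workaround 2: dynamic SQL, so the value is a literal when the plan is built
    declare @sql varchar(200)
    set @sql = 'select * from #Testing where RowID > ' + cast(@RowID as varchar(10))
    exec(@sql)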

    If anyone out there can offer suggestions, please do!

    Thanks,
    Randy Rabin
    ChannelAdvisor Corp.

    --------------------------------------------------------------------------------

    set nocount on
    create table #Testing (RowID int identity(1,1), CurrentTime datetime)
    go

    declare @loop int
    set @loop = 10000
    while @loop > 0 begin
        insert #Testing (CurrentTime) values (getdate())
        set @loop = @loop - 1
    end
    go

    create index IX_Testing_RowID on #Testing (RowID)
    create index IX_Testing_CurTime on #Testing (CurrentTime)
    go

    declare @t datetime, @rid int
    select @t = '1/1/2004', @rid = 99999

    select * from #Testing where CurrentTime > @t          -- does a table scan
    select * from #Testing where RowID > @rid              -- likewise

    select * from #Testing where CurrentTime > '1/1/2004'  -- uses the index
    select * from #Testing where RowID > 99999              -- likewise
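    For anyone wanting to reproduce the read counts mentioned above, one way is to wrap the test queries with SET STATISTICS IO, e.g.:

    set statistics io on
    declare @t datetime
    set @t = '1/1/2004'
    select * from #Testing where CurrentTime > @t           -- ~28 logical reads (scan)
    select * from #Testing where CurrentTime > '1/1/2004'   -- ~2 logical reads (seek)
    set statistics io off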

  • Without researching it too much... I can tell that SQL2K loves having a clustered index. On my test machine I got the same table-scan results as you, but creating either of the indexes as clustered turned both table scans into index seeks, with no hints required.

    Cheers,
    - Mark
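    A minimal sketch of that change, reusing the script above -- drop one of the nonclustered indexes and recreate it as clustered:

    drop index #Testing.IX_Testing_RowID
    create clustered index IX_Testing_RowID on #Testing (RowID)
    go
    declare @rid int
    set @rid = 99999
    select * from #Testing where RowID > @rid  -- now an index seek, no hint needed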

  • If the table is going to take up more than a couple of pages, use a clustered index. Otherwise you get a 'heap', where the query hops all over the place and can end up with as many I/O operations as there are rows.
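    One way to check how many pages the table actually occupies on SQL 2000 (assuming the #Testing temp table from the script above) is to look at sysindexes, where indid 0 is the heap and indid 1 the clustered index:

    select indid, dpages
    from tempdb.dbo.sysindexes
    where id = object_id('tempdb..#Testing')
      and indid in (0, 1)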
