Query Ignores Clustered Index

  • I have a table, let's call it tblData, with some 100M records spread over 52 calendar weeks. The table has ~60 columns of many different types (not my design so... yeah).

    The PK (and clustered index) on tblData is [CalendarWeek smalldatetime, AKey bigint], where AKey is an identity field and is unique for a given CalendarWeek.

    I'm trying to delete the oldest CalendarWeek's data from tblData. Normally this table has ~5 nonclustered indexes, and I DROP all of them except for one (IX_tblData_CustomerKey_CalendarWeek) before issuing the DELETE because, without that one index still in place, the table is basically useless to our users during the upcoming delete and index rebuild. Here's what this looks like:

    ---------------------

    DECLARE @OldestCalendarWeek smalldatetime

    SET @OldestCalendarWeek = ... -- quickly find the oldest CalendarWeek in tblData to delete

    -- Drop all the nonclustered indexes except for IX_tblData_CustomerKey_CalendarWeek

    DELETE tblData
    WHERE CalendarWeek <= @OldestCalendarWeek

    -- Rebuild the dropped nonclustered indexes

    ---------------------
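
    (The first step is nothing exotic; it amounts to something like the following, and a MIN() on the leading CI column resolves as a single ordered seek:)

    SELECT @OldestCalendarWeek = MIN(CalendarWeek) FROM tblData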

    Now, since the CI is on [CalendarWeek, AKey] and the DELETE filters only on CalendarWeek, I expect it to seek straight to that week on the clustered index and blow right through the one week of data.

    But in fact, what it does is a full scan on the one nonclustered index I left in place, IX_tblData_CustomerKey_CalendarWeek. I don't understand why it would choose that plan; it takes a whole lot longer that way.

    Rewriting the statement as follows resolves this:

    DELETE tblData
    FROM tblData WITH (INDEX (PK_tblData))
    WHERE CalendarWeek <= @OldestCalendarWeek

    My question is, simply: why? Why on earth would SQL Server decide to scan the nonclustered index when my delete is filtered on the first field of the CI and the CI seek is many times faster? And is there a way to solve this without resorting to the WITH INDEX hint?

    It does seem to be related to the "<=" thing, even though it only ever deletes one week of data. I could code it as simply "=" (and in that case it does choose the CI on its own), but for safety I prefer "<=".

    Thanks!

  • Have you tried the same query with the value hard-coded instead of using a variable? Does it still do a nonclustered index scan in that case? Also, are the statistics up to date?

  • I updated the statistics just to be sure, and it didn't change anything.

    It does use the CI if I hardcode the date instead of using a variable. But, in practice, I can't do this (unless I create the statement dynamically, which I'm not interested in).

    I guess if there are no other suggestions I'll just use the index hint.

  • Steve F. (5/26/2010)


    I updated the statistics just to be sure, and it didn't change anything.

    It does use the CI if I hardcode the date instead of using a variable. But, in practice, I can't do this (unless I create the statement dynamically, which I'm not interested in).

    I guess if there are no other suggestions I'll just use the index hint.

    If you look at the estimated query plan for the variable-based query, you will probably see a badly skewed estimated row count (the optimizer can't sniff a variable's value, so for an inequality it falls back on a fixed 30% guess), and that is what pushes it to the NC index scan plan.

    You really do need to use dynamic SQL here to get the proper row count estimate and the optimal plan. You could also try forcing the clustered index in the plan.
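
    A minimal sketch of the dynamic SQL route (embedding the value as a literal, rather than as a parameter, lets the optimizer estimate against the actual date; table and variable names are taken from your post):

    DECLARE @sql nvarchar(max)
    SET @sql = N'DELETE tblData WHERE CalendarWeek <= '''
             + CONVERT(nvarchar(20), @OldestCalendarWeek, 120) + N''''
    EXEC sp_executesql @sql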

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Well, even if you only delete 1 week of data, that equates to roughly 2% of the data in the table. With that much data, a scan will always be faster than seek + lookups.

    BTW the tipping point is under 1%. I'm sure it always depends on the data in the table but that gives you a ballpark figure.

    Now, one thing I recommend you test is simply deleting the data without dropping and rebuilding the indexes.

    After the delete, check how much fragmentation it caused and then decide whether it's worth a rebuild or just a reorganize... maybe you can even do all that off-hours and the users won't know a thing.

    In either case, if you don't drop all the indexes, the table can remain online while you delete. You can also delete in batches of 250K - 500K rows so that you don't lock the table for too long (you need to figure out the tipping point in # of rows between locking the table for a few seconds versus a few minutes or even hours).
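
    One way to do the batching (a sketch assuming SQL Server 2005 or later, where DELETE TOP (n) is available; pick the batch size from your own testing):

    WHILE 1 = 1
    BEGIN
        DELETE TOP (250000) FROM tblData
        WHERE CalendarWeek <= @OldestCalendarWeek

        IF @@ROWCOUNT = 0 BREAK -- nothing left in the range
    END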

  • Ninja's_RGR'us (5/27/2010)


    Well, even if you only delete 1 week of data, that equates to roughly 2% of the data in the table. With that much data, a scan will always be faster than seek + lookups.

    BTW the tipping point is under 1%. I'm sure it always depends on the data in the table but that gives you a ballpark figure.

    Now, one thing I recommend you test is simply deleting the data without dropping and rebuilding the indexes.

    After the delete, check how much fragmentation it caused and then decide whether it's worth a rebuild or just a reorganize... maybe you can even do all that off-hours and the users won't know a thing.

    In either case, if you don't drop all the indexes, the table can remain online while you delete. You can also delete in batches of 250K - 500K rows so that you don't lock the table for too long (you need to figure out the tipping point in # of rows between locking the table for a few seconds versus a few minutes or even hours).

    Yeah but I'm talking about a clustered index seek vs. a nonclustered index scan. With the CI seek there is no lookup.

    Ultimately, here's the problem. I'm trying to avoid filling up the log and uber-locking the table, so I'm running the DELETE inside a WHILE loop and restricting each batch with SET ROWCOUNT. SQL Server doesn't seem to recognize that the batches are small enough that a clustered index seek would be MUCH more efficient than a nonclustered index scan. I've done enough of this kind of work on this table to have a rough feel for how long these operations should take, and it's taking long enough that it appears to be doing the full NC index scan for every batch it deletes, which is absurd, instead of the CI seek, which should be nearly instant.
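
    (For reference, the loop is basically this; the batch size is just illustrative:)

    SET ROWCOUNT 10000 -- cap each DELETE at one batch

    WHILE 1 = 1
    BEGIN
        DELETE tblData WHERE CalendarWeek <= @OldestCalendarWeek
        IF @@ROWCOUNT = 0 BREAK
    END

    SET ROWCOUNT 0 -- back to normal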

    I guess another thing I could try (that we'll eventually have to go to anyway) is "DELETE ... WHERE Id IN (SELECT TOP 10000 ... WHERE CalendarWeek = @OldestCalendarWeek)". Maybe it would do better with that instead of SET ROWCOUNT.
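
    (Spelled out, and remembering that AKey is only unique within a week, that would be something like:)

    DELETE tblData
    WHERE CalendarWeek = @OldestCalendarWeek
      AND AKey IN (SELECT TOP 10000 AKey
                   FROM tblData
                   WHERE CalendarWeek = @OldestCalendarWeek)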

    I guess for now I'm going to use the index hint.

    Thanks!

  • Have you tried something along these lines?

    DECLARE @Loop int

    SELECT IDENTITY(int, 1, 1) AS DeleteID, WeekNo, Id
    INTO #tmp
    FROM DeletingTbl
    WHERE WeekNo <= @Whatever

    SET ROWCOUNT 50000

    SET @Loop = 0 -- a simple assignment also primes @@ROWCOUNT to 1

    WHILE @@ROWCOUNT <> 0
    BEGIN
        SET @Loop = @Loop + 1

        -- join #tmp back on the clustered index columns of DeletingTbl
        DELETE DeletingTbl
        FROM DeletingTbl
            INNER JOIN #tmp ON #tmp.WeekNo = DeletingTbl.WeekNo
                           AND #tmp.Id = DeletingTbl.Id
        WHERE #tmp.DeleteID <= @Loop * 50000

        DELETE FROM #tmp WHERE DeleteID <= @Loop * 50000
    END

    Also, if the row ID for the week is sequential you could use that column for the delete instead of adding a new one. That way you could skip the SELECT INTO entirely. You could also find the tipping point for the batch size and always get clustered seeks for the deletes, as in the sketch below.
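
    A sketch of that variation against the original table, assuming AKey is the sequential per-week ID (the batch size is again just a placeholder):

    DECLARE @MinKey bigint, @MaxKey bigint

    SELECT @MinKey = MIN(AKey), @MaxKey = MAX(AKey)
    FROM tblData
    WHERE CalendarWeek = @OldestCalendarWeek

    WHILE @MinKey <= @MaxKey
    BEGIN
        -- each batch is a tight range seek on the clustered index
        DELETE tblData
        WHERE CalendarWeek = @OldestCalendarWeek
          AND AKey >= @MinKey AND AKey < @MinKey + 50000

        SET @MinKey = @MinKey + 50000
    END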


  • Yes, the row ID is sequential within the week, so I could definitely try that. I'd like to think it would then have no choice other than to use the CI seek, but I think the problem is that it's not optimizing this correctly because of the variable @OldestCalendarWeek. The query plan does not seem to take the value of that variable into account (and this is apparently by design; variable values aren't sniffed at compile time), so the plan assumes my DELETE might be hitting multiple CalendarWeeks, which, in practice, it never does. I've only written the query using "<=" because it seems safer than simply "=". Using "=" is another variation that solves this but makes me a little less comfortable in terms of code robustness.

    The other idea with the ID table is also a good one. I'll check it out.

    Thanks again for your help!

  • If this is in a stored procedure, then you can definitely try this:

    DECLARE @LocalWeek smalldatetime -- same type as the CalendarWeek column

    SET @LocalWeek = @ParamValue

    Then use @LocalWeek in the query; that will help the optimiser.

    Also add the WITH RECOMPILE option to the SP.

    Those 2 things should take care of the parameter sniffing and safe-plan issues (the optimiser has to assume the value may change, hence a scan is the safer plan, but not in your case).

    One last thing that could force the CI seek is to add this "dummy" condition:

    WHERE ... AND WeekUniqueID < 3999999999 -- make sure this number is high enough that you never get any problems with the code
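
    Mapped onto the original table (with AKey standing in for WeekUniqueID, and swapping the proc-level WITH RECOMPILE for a statement-level OPTION (RECOMPILE), which on SQL 2005+ lets the optimizer see the variable's actual value), the combination might look like:

    DELETE tblData
    WHERE CalendarWeek <= @OldestCalendarWeek
      AND AKey < 3999999999 -- "dummy" upper bound; high enough to never exclude real rows
    OPTION (RECOMPILE) -- compile against the actual value of the variable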
