Optimizer not using our indexes

  • Description of problem.

    We have in the Database table ConsumerListing with a number of

    searchable

    columns. We can not predict the combination of search conditions.

    Because of

    that we chose to create a number of one-column non-clustered indexes.

     

    Two types of searches are being performed on the table.

    Type 1: we need to access all records which satisfy certain conditions.

    In

    our example people who Dogs and interested in Gardening in the household.

    Type 2 we need to access records which satisfy certain conditions and

    have

    certain phone numbers. To achieve that we decided in addition to

    one-column

    non-clustered indexes to create indexes with keys AreaCode, Phone and

    add

    all the other search conditions on the leaf level of non clustered index

    (included columns new for SQL Server 2005).

     

    We decided to test query

    select count(*) from ConsumerListing

    where

    Dogs='Y' AND Gardner='Y'

    It took long. Query optimizer issued following execution plan:

    StmtText

    ------------------------------------------------------------------------

    ----

    ------------------------------------------------------------------------

    ----

    -----------------------------------------------

      |--Compute

    Scalar(DEFINE[Expr1004]=CONVERT_IMPLICIT(int,[globalagg1008],0)))

           |--Stream

    Aggregate(DEFINE[globalagg1008]=SUM([partialagg1007])))

                |--Parallelism(Gather Streams)

                     |--Stream Aggregate(DEFINE[partialagg1007]=Count(*)))

                          |--Index

    Scan(OBJECT[DD_Data].[dbo].[ConsumerListing].[Ind_Phone]),

    WHERE[DD_Data].[dbo].[ConsumerListing].[Dogs]='Y' AND

    [DD_Data].[dbo].[ConsumerListing].[Gardner]='Y'))

    Meaning instead of seeking indexes on Dogs and Cats and merging results

    (Behavior we expected) optimaizer choose to scan

    Ind_Phone.

    We decided to overwrite this with index hints

    select count(*) from ConsumerListing

    with (index (IND_Dogs, Ind_Gardner))

    where

    Dogs='Y' AND Gardner='Y'

    We get expected Behavior

    StmtText

    ------------------------------------------------------------------------

    ----

    ------------------------------------------------------------------------

    ----

    ------------------------------------------------------------------------

    ----

    ------------------------------------------------------------------------

    ----

    ------------------------------------------------------------------------

    ----

    ------------------------------------------------------------------------

    ----

    -----------------------------------------------

      |--Compute

    Scalar(DEFINE[Expr1004]=CONVERT_IMPLICIT(int,[Expr1007],0)))

           |--Stream Aggregate(DEFINE[Expr1007]=Count(*)))

                |--Merge Join(Inner Join,

    MERGE[DD_Data].[dbo].[ConsumerListing].[AddressID],

    [DD_Data].[dbo].[ConsumerListing].[IndividualId],

    [Uniq1002])=([DD_Data].[dbo].[ConsumerListing].[AddressID],

    [DD_Data].[dbo].[ConsumerListing].[IndividualId], [Uniq1002]),

    RESIDUAL[DD_Data].[dbo].[ConsumerListing].[AddressID] =

    [DD_Data].[dbo].[ConsumerListing].[AddressID] AND

    [DD_Data].[dbo].[ConsumerListing].[IndividualId] =

    [DD_Data].[dbo].[ConsumerListing].[IndividualId] AND [Uniq1002] =

    [Uniq1002]))

                     |--Index

    Seek(OBJECT[DD_Data].[dbo].[ConsumerListing].[Ind_Dogs]),

    SEEK[DD_Data].[dbo].[ConsumerListing].[Dogs]='Y') ORDERED FORWARD)

                     |--Index

    Seek(OBJECT[DD_Data].[dbo].[ConsumerListing].[Ind_Gardner]),

    SEEK[DD_Data].[dbo].[ConsumerListing].[Gardner]='Y') ORDERED FORWARD)

    We decided to test both situations with

     

    DBCC DROPCLEANBUFFERS

    DBCC FREEPROCCACHE

    SET STATISTICS IO ON

    SET STATISTICS TIME ON

    Resulsts for SET STATISTICS TIME ON SQL Server Execution Times:

    CPU time = 36266 ms,  elapsed time = 396475 ms.--default

    CPU time = 19032 ms,  elapsed time = 19245 ms.--with index hints

    In my oppinion wiht index hints times match better

    Results from STATISTICS TIME ON

    Table 'ConsumerListing'. Scan count 5, logical reads 3948831, physical

    reads

    7, read-ahead reads 3939681, lob logical reads 0, lob physical reads 0,

    lob

    read-ahead reads 0.--default

    Table 'ConsumerListing'. Scan count 2, logical reads 88589, physical

    reads

    6, read-ahead reads 88581, lob logical reads 0, lob physical reads 0,

    lob

    read-ahead reads 0.----with index hints

     

    Again all results are better when we're using two one column indexes.

    All our one column indexes has low selectivity

    Partial formatted results from DBCC SHOW_STATISTICS bellow:

    Dogs

     

     

     

     

    RANGE_HI_KEY

    RANGE_ROWS

    EQ_ROWS

    DISTINCT_RANGE_ROWS

    AVG_RANGE_ROWS

    U

    0

    169,693,200

    0

    1

    Y

    0

    34,562,520

    0

    1

    Gardener

     

     

     

     

    RANGE_HI_KEY

    RANGE_ROWS

    EQ_ROWS

    DISTINCT_RANGE_ROWS

    AVG_RANGE_ROWS

    U

    0

    159,358,900

    0

    1

    Y

    0

    44,896,820

    0

    1

     

    Table has 204,255,720

    Please note that regarding of low selectivity if you will run

    select count(*) from ConsumerListing

    where

    Dogs='Y'

    (Or anything with only one condition) optimizer will use ind_dogs (seek)(or any other appropriate index.

     

    The question is why optimizer is using such an ineffective execution

    plan?

    What can we do to avoid index hints( will be difficult to program)?

    Help will be greatly appreciated.

  • 2 options come to mind at the moment :

    1 - sp_updatestats

    2 - Create a table variable with the key column(s) of the table.

    Insert into @TV (KeyCol) Select KeyCol FROM dbo.YourTable WHERE SomeCol = @SomeVar

    DELETE @TV TV WHERE NOT Exists (Select * FROM dbo.YourTable YT WHERE SecondCol = @SecondParam AND TV.KeyCol = YT.KeyCol)

    DELETE...

    DELETE...

    DELETE...

    SELECT Needed, Columns FROM dbo.YourTable YT INNER JOIN @TV TV ON YT.KeyCol = TV.KeyCol

     

    I have never tried this server side, but I got good results client side on a search engine with a monstruous amount of search arguments variations (200+).  You might also add a condition after each delete to end the execution of the code (may be faster than deleting 0 rows with exists but I have never tested for that).

     

    This technic also gives you the chance to put the most selective search options at the top of the list so that each successive search becomes much less expansive.

  • 1.Statistics is up to date I checked it. This is static table

    2. Our table is 200 Million rows I do not think this is going to work for us

    Thank you for your replay

  • 1 - Can more than one parameter be search at the same time, if yes what is the limit?

    2 - How many parameters are passed to this procedure?

    3 - Which are optional?

    4 - Do you need paging of the results?

    5 - How do you plan to avoid an index scan with that number of parameters?

     

    Can we see the table DDL (with indexes and keys)?

  • This article may be of particular interest to you :

    http://www.sommarskog.se/dyn-search.html

  • The situation is quite easy.

    where Dogs='Y' AND Gardner='Y'

    That kind of where clause will never use any index as the selectivity is very bad (there can be only Y and N values in. It is much faster NOT to use any index access but go directly to the table and go trough the whole table. An index hint could be even worse in term of response time

    If you want to use any index chose a more selective condition like phone number or date or a covering index (in this case the select * is not suitable) and make it clustered.

    But you have to carefull by making an index clustered. You have to know how the users are accessing your table (ie what is the user behavior in terms of insert, udpate, select...)



    Bye
    Gabor

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply