Performance trouble with 73 million records

  • Hi,

    I have a table with 73 million records and some performance troubles are being reported.

    My table has 4 fields as described:

    f1 as int; f2 as int; f3 as int; f4 as datetime

    Primary key: f1 + f3 + f4 clustered

    The query being executed is:

    SELECT f1, f2 FROM T WHERE f3 = <value> AND f4 = <date>

    What can I do to improve it???

    Create new independent indexes for f3 and f4??

    Best regards

    JB

  • What does your execution plan show?  I would guess that it is doing an index seek based off of your WHERE clause and that the delay is the lookup of f2.  SQL Server should be able to navigate through the clustered index key very quickly, but it then has to go out to the data pages and find the f2 values.  Check your execution plan and see where the bulk of the time (%) is spent.  You may benefit from a covering index .

    John Rowan

    ======================================================
    ======================================================
    Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

  • The problem is that the order of columns in the index does not support that query, and I would assume a scan is happening.

    If f1 is the 1st column in the index, but the WHERE does not filter on it, then the index is not going to be used for a seek.

    If this is a commonly executed query, with f3 and f4 used to filter, then you need an index whose 1st column is either f3 or f4. The column you select to be first should be the column that is the most selective.

  • The execution plan shows that 88% of the time is spent in the index seek.

    What should be the best index solution? One for each column (f3, f4), or just one with both (f3+f4)??

  • >>The execution plan shows that 88% of the time is spent in the index seek.

    Are you sure it says "seek" ? I would expect a clustered index scan

     

    Create Table #Test (

      f1 int not null,

      f2 int not null,

      f3 int not null,

      f4 datetime not null

    )

    Create unique clustered index #ixTest

      on #test (f1, f3, f4)

    Insert Into #Test

    Select 1, 2, 1, '08 Aug 2006' Union

    Select 2, 2, 2, '08 Aug 2006' Union

    Select 3, 2, 3, '08 Aug 2006' Union

    Select 4, 2, 4, '08 Aug 2006' Union

    Select 5, 2, 5, '08 Aug 2006'

    SELECT f1, f2 FROM #Test WHERE f3 = 1 AND f4 = cast('08 Aug 2006' As datetime)

    ^^ I get a clustered index scan on this small test.

    Create Index #ixTest2 on #Test (f3, f4)

    ^^ This non-clustered compsite index on f3,f4 gives me an index seek

     

  • Sorry PW, you're right it was a 'Cluster Index Scan' ( I'm rewinding it in my head I'm not at work now )

    I suppose the index seek would perform faster than the index scan, right?

  • Option : 1

    Considering this query alone,

    I recommend creating a clustered index on f3, f4 and unique non clustered index on f1, f2, f3.

     

    Create Table #Test1 (

      f1 int not null,

      f2 int not null,

      f3 int not null,

      f4 datetime not null

    )

     

    Create clustered index #ixTest1

      on #test1 (f3, f4)

     

     

    Create unique nonclustered index #ixTest2

      on #test1 (f1, f3, f4)

     

    Insert Into #Test1

    Select 1, 2, 1, '08 Aug 2006' Union

    Select 2, 2, 2, '08 Aug 2006' Union

    Select 3, 2, 3, '08 Aug 2006' Union

    Select 4, 2, 4, '08 Aug 2006' Union

    Select 5, 2, 5, '08 Aug 2006'

     

    SELECT f1, f2 FROM #Test1 WHERE f3 = 5 AND f4 = cast('08 Aug 2006' As datetime)

     

    Option :2

     

    If you use f1, f3, f4 in your search condition in most of the queries,

    Then have your code as it is and add another non clustered index as given below

     

    Create nonclustered index #ixTest1

     on #test (f3, f4)

    With Regards
    MeenakshiSundaram Lakshmanan

  • Ater creating the new non clustered index I got improvements when the query returns a small result set.

    After looking at the execution plan it was possible to determine that when it is a small result set it uses the new index (and does the index seek), but with larger result sets it still uses the clustered index (and does the index scan).

     

  • Maybe you can give it a hint to use the non cluster index and see if that keeps the improvement in a large result set.

    John

  • Joan,

    You have 2 possibilities:

    Either you drop your existing clustered index and recreate a new one on f3, f4

    or you create a covering index on f3, f4, f1, f2. This covering index should not ne cessarly be a clustered one, never the less for a table of this size you will need one.

    In both case the order of the columns in the index definition is very important.

    Also what you can check is the selectivity of the columns in the index.

    if the f4 (date) colum has more distinct values as the f3 (int), then you should put the f4 column in the first position of a clustered index. Other wise you can keep the f3, f4 order



    Bye
    Gabor

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply