Performance trouble with 73 million records

Question

Post reply

Performance trouble with 73 million records

Joao Baptista

SSC Enthusiast

Points: 136
More actions
August 8, 2006 at 1:06 pm

#113255

Hi,

I have a table with 73 million records and some performance troubles are being reported.

My table has 4 fields as described:

f1 as int; f2 as int; f3 as int; f4 as datetime

Primary key: f1 + f3 + f4 clustered

The query being executed is:

SELECT f1, f2 FROM T WHERE f3 = <value> AND f4 = <date>

What can I do to improve it???

Create new independent indexes for f3 and f4??

Best regards

JB

Viewing 10 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

John Rowan SSC Guru Points: 56440 More actions · Answer 1

What does your execution plan show? I would guess that it is doing an index seek based off of your WHERE clause and that the delay is the lookup of f2. SQL Server should be able to navigate through the clustered index key very quickly, but it then has to go out to the data pages and find the f2 values. Check your execution plan and see where the bulk of the time (%) is spent. You may benefit from a covering index .

John Rowan

======================================================
======================================================
Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

PW-201837 SSC-Insane Points: 20805 More actions · Answer 2

The problem is that the order of columns in the index does not support that query, and I would assume a scan is happening.

If f1 is the 1st column in the index, but the WHERE does not filter on it, then the index is not going to be used for a seek.

If this is a commonly executed query, with f3 and f4 used to filter, then you need an index whose 1st column is either f3 or f4. The column you select to be first should be the column that is the most selective.

Joao Baptista SSC Enthusiast Points: 136 More actions · Answer 3

Joao Baptista

SSC Enthusiast

Points: 136

August 8, 2006 at 4:16 pm

#653944

The execution plan shows that 88% of the time is spent in the index seek.

What should be the best index solution? One for each column (f3, f4), or just one with both (f3+f4)??

PW-201837 SSC-Insane Points: 20805 More actions · Answer 4

>>The execution plan shows that 88% of the time is spent in the index seek.

Are you sure it says "seek" ? I would expect a clustered index scan

Create Table #Test (

f1 int not null,

f2 int not null,

f3 int not null,

f4 datetime not null

)

Create unique clustered index #ixTest

on #test (f1, f3, f4)

Insert Into #Test

Select 1, 2, 1, '08 Aug 2006' Union

Select 2, 2, 2, '08 Aug 2006' Union

Select 3, 2, 3, '08 Aug 2006' Union

Select 4, 2, 4, '08 Aug 2006' Union

Select 5, 2, 5, '08 Aug 2006'

SELECT f1, f2 FROM #Test WHERE f3 = 1 AND f4 = cast('08 Aug 2006' As datetime)

^^ I get a clustered index scan on this small test.

Create Index #ixTest2 on #Test (f3, f4)

^^ This non-clustered compsite index on f3,f4 gives me an index seek

Joao Baptista SSC Enthusiast Points: 136 More actions · Answer 5

Joao Baptista

SSC Enthusiast

Points: 136

August 8, 2006 at 5:27 pm

#653954

Sorry PW, you're right it was a 'Cluster Index Scan' ( I'm rewinding it in my head I'm not at work now )

I suppose the index seek would perform faster than the index scan, right?

MeenakshiSundaram Lakshmanan SSC Veteran Points: 204 More actions · Answer 6

Option : 1

Considering this query alone,

I recommend creating a clustered index on f3, f4 and unique non clustered index on f1, f2, f3.

Create Table #Test1 (

f1 int not null,

f2 int not null,

f3 int not null,

f4 datetime not null

)

Create clustered index #ixTest1

on #test1 (f3, f4)

Create unique nonclustered index #ixTest2

on #test1 (f1, f3, f4)

Insert Into #Test1

Select 1, 2, 1, '08 Aug 2006' Union

Select 2, 2, 2, '08 Aug 2006' Union

Select 3, 2, 3, '08 Aug 2006' Union

Select 4, 2, 4, '08 Aug 2006' Union

Select 5, 2, 5, '08 Aug 2006'

SELECT f1, f2 FROM #Test1 WHERE f3 = 5 AND f4 = cast('08 Aug 2006' As datetime)

Option :2

If you use f1, f3, f4 in your search condition in most of the queries,

Then have your code as it is and add another non clustered index as given below

Create nonclustered index #ixTest1

on #test (f3, f4)

With Regards
MeenakshiSundaram Lakshmanan

Joao Baptista SSC Enthusiast Points: 136 More actions · Answer 7

Ater creating the new non clustered index I got improvements when the query returns a small result set.

After looking at the execution plan it was possible to determine that when it is a small result set it uses the new index (and does the index seek), but with larger result sets it still uses the clustered index (and does the index scan).

John Hanrahan Hall of Fame Points: 3825 More actions · Answer 8

John Hanrahan

Hall of Fame

Points: 3825

August 9, 2006 at 4:43 pm

#654186

Maybe you can give it a hint to use the non cluster index and see if that keeps the improvement in a large result set.

John

Gabor Nyul SSCrazy Eights Points: 9459 More actions · Answer 9

Joan,

You have 2 possibilities:

Either you drop your existing clustered index and recreate a new one on f3, f4

or you create a covering index on f3, f4, f1, f2. This covering index should not ne cessarly be a clustered one, never the less for a table of this size you will need one.

In both case the order of the columns in the index definition is very important.

Also what you can check is the selectivity of the columns in the index.

if the f4 (date) colum has more distinct values as the f3 (int), then you should put the f4 column in the first position of a clustered index. Other wise you can keep the f3, f4 order

Bye
Gabor