Performance Issue while retrieving data from Very Large Table

Question

Post reply

Performance Issue while retrieving data from Very Large Table

Amit Tiwari-332074

SSC Enthusiast

Points: 156
More actions
March 14, 2008 at 6:30 am

#184962

Hi Everybody,

I have a table having Five Columns

CREATE TABLE TABLE_X (

Col1 INT NOT NULL,

Col2 varchar(10) NOT NULL,

Colc Datetime NOT NULL

)

PRIMARY KEY CLUSTERED IS ON COL1 & COL2.

THERE is a seperate index on Col1 also which is non clustered and non unique.

THERE are about 2.5 Billion Rows and i have to select on the basis of

ColC i.e.

SELECT * FROM TABLE_X WHERE

Colc between '2007-12-01 00:00:00' and '2008-02-29 23:59:59'

The total number of rows which will be retrieved using this query will be about 20 million.

In the Execution plan i see that a clustered index scan is performed.

The time duration which is being taken by this query is around 9 minutes which is not at all acceptable.

Please guide me how to do the tuning of this query.

Thanks

Amit

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply

Grant Fritchey SSC Guru Points: 397721 More actions · Answer 1

Amit Tiwari (3/14/2008)

Hi Everybody,

I have a table having Five Columns

CREATE TABLE TABLE_X (

Col1 INT NOT NULL,

Col2 varchar(10) NOT NULL,

Colc Datetime NOT NULL

)

PRIMARY KEY CLUSTERED IS ON COL1 & COL2.

THERE is a seperate index on Col1 also which is non clustered and non unique.

THERE are about 2.5 Billion Rows and i have to select on the basis of

ColC i.e.

SELECT * FROM TABLE_X WHERE

Colc between '2007-12-01 00:00:00' and '2008-02-29 23:59:59'

The total number of rows which will be retrieved using this query will be about 20 million.

In the Execution plan i see that a clustered index scan is performed.

The time duration which is being taken by this query is around 9 minutes which is not at all acceptable.

Please guide me how to do the tuning of this query.

Thanks

Amit

Don't take this as absolute gospel and you will need to test this on your data, but I'd do two things. First, drop the second index on Col1. If Col1 is the leading edge of your compound clustered index, having a non-clustered index for Col1 is a waste of time & space. Second, add an index to ColC. I know this is sample data and the real world is different, but if there are only the three columns, then I'd also include the other two columns on that index.

Test, test, test, test, test. And look at the execution plans.

--update. Saw a typo. I typed "add a column" I meant "add an index"

----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

jesse.hutchens SSC Enthusiast Points: 130 More actions · Answer 2

jesse.hutchens

SSC Enthusiast

Points: 130

March 14, 2008 at 11:09 am

#789771

Can you switch your Clustered Index to column c? That particular query would run a lot faster if you did.

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 3

Looking at just that scenario, and ignoring any possibility of other queries, I'd do the following.

Drop the NC index on Col1. As Grant said, it's not necessary. Make the PK nonclustered and move the clustered index to ColC.

Do you require all the columns in that query? (ie, do you need to do select *?)

If not, then as a second possibility, leave the PK as clustered and add a NC index on ColC with the columns you do need as INCLUDE columns. This may be quicker to implement than moving the clustered index which, on 2 billion rows, is not a trivial operation.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

John Rowan SSC Guru Points: 56440 More actions · Answer 4

Yes, having the clustered index on ColC would help. I would suggest dropping the second index on Col1 and recreating your PK as a non-clustered PK. You may want to consider putting the clustered index on ColC. Keep in mind that you will want to plan this as creating a new clustered index on a table of that size may take alot of time. Also, you've only shown us one query that hits this table. I suggest you analyze the impact a clustered index change would have accross all of your frequently used queries on this table.

John Rowan

======================================================
======================================================
Forum Etiquette: How to post data/code on a forum to get the best help[/url] - by Jeff Moden

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 5

Oh, and also. 20 million rows is not a trivial amount of data. Index seek or no index seek, it will take time to fetch that data from disk (if it's not in the data buffer) and it will take time to send that data to the client.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass