Clustered / NonClustered

Question

Post reply

Clustered / NonClustered

spaghettidba

SSC Guru

Points: 105712
More actions
October 18, 2010 at 11:01 am

#388033

I was asked to tune a database that is suffering from poor performance and I found something that I'm unsure on how to treat.

One of the main tables (13 million rows) is defined this way:

CREATE TABLE Receivings ( Receiving_Number int NOT NULL, Package_Number int NOT NULL, ... Job_Type char(4) NOT NULL, Job_Year int NOT NULL, Job_Number int NOT NULL, Job_Index char(1) NOT NULL, Job_Package int NOT NULL, ... (other 40 columns) ... PRIMARY KEY CLUSTERED (Receiving_Number, Package_Number) )

Looking at both the app code and index usage, the primary key is almost never used to access the table.

Instead, they're using the composite key (Job_Type, Job_Year, Job_Number, Job_Index, Job_Package).

Index usage stats for 24 hrs of activity contain something like this:

index_name user_seeks user_scans user_lookups ------------- ---------- ---------- ------------ PK_Receivings 24493 457 239399 IX_Job 1831779 171 0

I would be tempted to change the primary key to a nonclustered unique index and create the clustered index on the job fields to avoid those lookups.

Does it make sense to you?

-- Gianluca Sartori

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 1

Personal preference, probably not. Looks a little too wide for my tastes for a clustered index.

Check some of the queries doing the lookups, are they selecting majority of the columns of the table?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Leo.Miller SSChampion Points: 13035 More actions · Answer 2

Based on the little information provided I can't see that changing the indexes would help. You really need to see what queries are performing badly and then ID why those are a problem. In modern databases 13 million rows isn't relly that big, and the index you mention is being used for seeks. I'd start to worry about it if there were a lot of scans.

You also need to consider that changing the cluster may result in a lot of page splitting, where the current cluster may give you nice full pages.

Also ID where the resource contention is, CPU, RAM, Disk IOs.

Also are there regular index rebuilds and stats updates?

If you give a bit more info you will probably get some more detailed responses.

Leo

Leo
Nothing in life is ever so complicated that with a little work it can't be made more complicated.

spaghettidba SSC Guru Points: 105712 More actions · Answer 3

GilaMonster (10/18/2010)

Personal preference, probably not. Looks a little too wide for my tastes for a clustered index.

Check some of the queries doing the lookups, are they selecting majority of the columns of the table?

Thanks for the suggestion, Gail.

I forgot to mention that the job fields are nearly as selective as the primary key (12'665'254 distinct values over 13'207'302 rows) and that alone made me think it could be a good clustered index.

As per the clustering key width, it would be 17 bytes instead of 8, it's not that bad (is it?).

I got through some of the stored procedures (I couldn't check all of them, they're literally hundreds) and generally speaking the queries on this table select 5 to 10 columns with a filter / join predicate based on the job fields.

Also, the table is partitioned (6 partitions) on Receiving_Number, but just to avoid fragmentation. There are three special receivings (9999, 10000 and 20000) that get updated with new rows regularly, while all other receivings (with higher numbers) are inserted just once. To avoid fragmentation, the table is partitioned this way:

Partition 1: receivings 0 to 9998

Partition 2: receiving 9999

Partition 3: receiving 10000

Partition 4: receivings 10001 to 19999

Partition 5: receiving 20000

Partition 6: receivings 20001 +

Indexes are rebuilt/defragmented every night.

-- Gianluca Sartori

spaghettidba SSC Guru Points: 105712 More actions · Answer 4

Leo.Miller (10/18/2010)

In modern databases 13 million rows isn't relly that big, and the index you mention is being used for seeks. I'd start to worry about it if there were a lot of scans.

Thanks for your answer, Leo.

I agree with you that 13 million rows is not that big.

I find in the index usage stats lots of seeks, some scans, but nothing really worth worrying about. What I find inappropriate is the number of lookups, that made me think I need to change the clustered index.

Also ID where the resource contention is, CPU, RAM, Disk IOs.

Also are there regular index rebuilds and stats updates?

None of these resources is really critical. It just doesn't run as fast as the users would like to.

Indexes and stats are rebuilt/updated regularly.

Thanks for your help.

-- Gianluca Sartori

Oliiii SSCertifiable Points: 5328 More actions · Answer 5

Oliiii

SSCertifiable

Points: 5328

October 19, 2010 at 1:38 am

#1238862

17 Bytes is not that bad as long as you don't have many other indexes.

If most of the queries return the same columns you can also give a shot at including more columns in your index to cover most cases and reduce the lookups.

spaghettidba SSC Guru Points: 105712 More actions · Answer 6

Oliiii (10/19/2010)

... you can also give a shot at including more columns in your index to cover most cases and reduce the lookups.

This is a good point. I've tried to go that way, but it means analyzing the whole app code.

Missing index suggestions don't help much, as I can't build an index with 40 included columns. 🙂

-- Gianluca Sartori

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 7

Gianluca Sartori (10/19/2010)

Oliiii (10/19/2010)

Missing index suggestions don't help much, as I can't build an index with 40 included columns. 🙂

You can. It may not be a good idea, but you can...

17 bytes vs 8 = 9 extra bytes per nonclustered index per row. Bad? Only you can really tell that. Try it in a dev environment and see.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Leo.Miller SSChampion Points: 13035 More actions · Answer 8

As per the clustering key width, it would be 17 bytes instead of 8, it's not that bad (is it?).

..

Also, the table is partitioned (6 partitions) on Receiving_Number, but just to avoid fragmentation. There are three special receivings (9999, 10000 and 20000) that get updated with new rows regularly, while all other receivings (with higher numbers) are inserted just once. To avoid fragmentation, the table is partitioned this way:

Partition 1: receivings 0 to 9998

Partition 2: receiving 9999

Partition 3: receiving 10000

Partition 4: receivings 10001 to 19999

Partition 5: receiving 20000

Partition 6: receivings 20001 +

Indexes are rebuilt/defragmented every night.

From what you've said here it looks like the design and eventual use of the table don't match. I suspect the original design didn't expected this concentrated use of the Receiving_Number.

It would definitly be worth while testing to see if your idea works, but as Gail says you are adding a lot of extra index space here.

I still have one concern and that is: you haven't posted any execution plans or IO stats etc. showing where the time is being spent in the queries. The ratio of lookups on PK_Receiving / Seeks on IX_Job is only 1/7 which doesn't look too bad. Are you sure that that much time is being spent doing lookups?

Leo

Leo
Nothing in life is ever so complicated that with a little work it can't be made more complicated.

spaghettidba SSC Guru Points: 105712 More actions · Answer 9

Leo.Miller (10/19/2010)

I suspect the original design didn't expected this concentrated use of the Receiving_Number.

Leo, the original design didn't expect lots of things. 🙁 This is one of the worst databases I've ever seen.

Are you sure that that much time is being spent doing lookups?

This is what I see from the exec plans on the heaviest and most executed queries on this table.

I decided to include some columns to the Job index and things got better. The nonclustered seek/clustered lookup ratio got from 7.6 to 6.46.

Obviously I can't include all the fields required by these queries.

Thank you for your time and suggestions.

-- Gianluca Sartori

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 10

Gianluca Sartori (10/20/2010)

Are you sure that that much time is being spent doing lookups?

This is what I see from the exec plans on the heaviest and most executed queries on this table.

Any chance you can post a couple of those?

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

spaghettidba SSC Guru Points: 105712 More actions · Answer 11

I have to admit that including more columns in the job index worked very well.

Most of the queries I saw yesterday performing badly got really better today.

An example of high lookup cost is in the attached plan. Unfortunately, this query is part of a big stored procedure and attaching the whole sqlplan would be a mess. I could only extract the picture for the single query plan. If you know how to extract a single query's plan from a procedure plan I'll be glad to post it.

Anyway the picture should be enough to understand what I mean:

Key Lookup plan

As you can see, there's a nonclustered seek (cost: 1%) + clustered key lookup (cost: 70%).

You will notice that table and column names are different from the ones I posted. I thought that translating those names in English would have been better for those trying to help...

Queries like this apart, I think I can be satisfied of the results for the moment.

I think I'll be monitoring performance in the next few days and, if users stop complaining, I won't tune it any further.

-- Gianluca Sartori

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 12

Gianluca Sartori (10/20/2010)

If you know how to extract a single query's plan from a procedure plan I'll be glad to post it.

Hmmm... Let me investigate when I get home

Anyway the picture should be enough to understand what I mean:

Key Lookup plan

What I want to see most is the properties of those two operators.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

spaghettidba SSC Guru Points: 105712 More actions · Answer 13

spaghettidba

SSC Guru

Points: 105712

October 20, 2010 at 6:14 am

#1239619

OK, I managed to extract a single query plan from the procedure plan, tweaking the xml in a text editor.

I can't believe there's no easier way to do it.

Here's the plan.

Ah, BTW, SSC attachments don't work. Grrrr!!!

-- Gianluca Sartori

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 14

Will try and remember to check this evening.

Testing attachments... looks like they're working now. Maybe intermittent glitch.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass