Index not used on Select *

Question

Post reply

Index not used on Select *

Farhan Jaffery

SSC Veteran

Points: 229
More actions
March 17, 2004 at 4:51 pm

#161118

I have a simple table with 6 columns. There is a non-clustered index on two of the columns (lets assume column B and C). The table has about 6 million rows. When I run a 'select *' query with where clauses on column B and C, the execution plan is showing a table scan, but when I specify a certain combination of select columns, like, 'select A, C', the plan shows the index seek. Any ideas as to why and does it have anything to do with statistics?

Thanks.

Viewing 15 posts - 1 through 15 (of 15 total)

You must be logged in to reply to this topic. Login to reply

wz700 SSCertifiable Points: 7133 More actions · Answer 1

wz700

SSCertifiable

Points: 7133

March 17, 2004 at 7:57 pm

#499243

Try to update statistics, then run the query again.

Johan Bijnens SSC Guru Points: 134873 More actions · Answer 2

- what's the filter-factor of the columns in your index ?

- what portion of data would be part of your query ?

Johan

Learn to play, play to learn !

Dont drive faster than your guardian angel can fly ...
but keeping both feet on the ground wont get you anywhere :w00t:

- How to post Performance Problems
- How to post data/code to get the best help[/url]

- How to prevent a sore throat after hours of presenting ppt

press F1 for solution, press shift+F1 for urgent solution 😀

Need a bit of Powershell? How about this

Who am I ? Sometimes this is me but most of the time this is me

ray_higdon SSCarpal Tunnel Points: 4589 More actions · Answer 3

It is not using the nonclustered index becuase you are returning everything. When you do select *, it would make no sense for it to go to the nonclustered index, find the values then return to the table to return everything. If you do a

"select columna, columnb from table

where columna = 'whatever'

and columnb = 'whatever'

And assuming you have your nonclustered index on columna,and columnb, it will then make sense to use the nonclustered index as it is now a covering index.

HTH

------------

Ray Higdon MCSE, MCDBA, CCNA

Carl B. SSCertifiable Points: 7931 More actions · Answer 4

Hello Ray,

I'm not aggree with you...

A query does not have to be "covered" by an index all the time.

Even if you do a SELECT * From a table and you have a restrictive WHERE clause (returning less than, lets say less than 10% of the rows of the table), proper noon-clustered indexes should be used and will be more efficient than doing a full table scan.

Regards,

Carl

rhunt SSCertifiable Points: 5868 More actions · Answer 5

Carl and Ray, I think you are both partially correct...

Carl, as you said, non-clustered indexes do not have to be covered to be used for a query. In fact, unless you really run the query quite often, and without a covered index the performance is substantially poorer, you should use fully covered indexes as infrequently as possible (on a high-transaction system). This of course goes out the window in a read-only DW.

Ray, if the optimizer has statistics that indicate that 80-90% of the rows will be returned from a table based on the WHERE clause, it will probably not use the a non-covered index to get the data.

I, of course, could be very wrong about all this... My optimizer often doesn't do what I expect it to. 😉

Ryan

Carl B. SSCertifiable Points: 7931 More actions · Answer 6

Hello Ryan,

My optimizer is like yours...

Just to be more acurate: I didn't say that queries that are covered by an index are not the best choice (in fact it is the best choice when you can achieve that). What I say is that queries does not have to be covered by a non-clustered index to use that index if the rows queried represent a small percentage of the total number of rows in the table. Also, I assume here that you have the proper non-clustered index for the query.

Best regard,

Carl

Farhan Jaffery SSC Veteran Points: 229 More actions · Answer 7

Ray -

Since there is no clustered index on this table, I would assume that data rows are stored sequntially. Regardless of what is in my select statement, as long as I have the right where clauses, the query engine should use index seek to get to the addresses of the desired rows and then get the information of all the columns for the select statement from the actual data page address location. FYI, the where clauses in my case return about 7.8% of data rows.

What do you mean by when you say that the optimizer will have to make an additional trip to get the other data requested in the select statement?

Carl B. SSCertifiable Points: 7931 More actions · Answer 8

Carl B.

SSCertifiable

Points: 7931

March 18, 2004 at 9:09 am

#499347

Farhan -

I've seen many bugs with the Optimizer. One of them was logged (3 years ago) to Microsoft and never solved .

Hope you are not facing the same bug we logged and Microsoft ignored.

Regards,

Carl

ray_higdon SSCarpal Tunnel Points: 4589 More actions · Answer 9

Whoa, I think a lot of stuf flying around here.

Does a nonclustered index have to be covering to be used? - No

"Since there is no clustered index on this table, I would assume that data rows are stored sequntially" - I don't follow what you mean.

Bottom line is performance, and I measure performance by time and the amount of logical IO my query costs, logical IO is the number of times the statement (SQL) has to touch an 8kb page, this is not the number of pages but the number of accesses to an 8kb page. To see your logical IO use the statement "set statistics IO on"

------------

Ray Higdon MCSE, MCDBA, CCNA

ray_higdon SSCarpal Tunnel Points: 4589 More actions · Answer 10

I would also argue that it depends on too many things to say when a nonclustered index will be used when doing a "select *", I would argue that it has less to do with statistics than it does your table architecture. Anytime you access a nonclustered index that is not a covering index you need to go back to the heap or clustered table to FIND all of the remaining returning columns.

------------

Ray Higdon MCSE, MCDBA, CCNA

Ian Yates SSCoach Points: 19738 More actions · Answer 11

Whilst it may not be totally relevant, you should always (according to several articles & threads on this site) have a clustered index on your table - then the data will be stored "in order". Many other benefits are also then offered to you in terms of speed and underlying structure for SQL to deal with. If your table does not have a clustered index then it is a heap - from memory they're not as good You may well find the optimiser works differently when your table has a clustered index?

What happens to your execution time & plan if you use an index hint to force SQL to use the index? If you force it to use the index, and it goes slower, stop worrying!

Brian Bare Grasshopper Points: 21 More actions · Answer 12

Fahran,

Just to make sure you understand the architecture, adn if you do you can skip this paragraph. A nonclustered index is a pointer to a data row that is stored in a heap or in a clustered index depending on whether the table has a clustere dindex or not.

The index is not being chosen because from an optimizer standpoint, it is determining that it is costlier to return the pointer to all of the rows then do an additional bookmark lookup to get the rest of the data you asked for by returning select * , as opposed to doing a table scan and finding the entire row directly.

I have seen this before and it often happens when the columns in the table are very many (usually >5 more than what is in the index) or very large (large char or text columns). Since select * is not good practice in a query or especially a stored proc, you may want to try specifying the column names even if you specify them all. Occasionally the * is what causes it.

You could take a look at the statistics in the cost factor of the retrieval step and then force an index hint and look at them that way. This should give you an idea why it thinks that is the right path. Just to note, never leave a hint in your code if you plan to keep using it in an application. They can do very bad things to performance.

Another option would be to update statistics but use a larger sampling than the default. I have noticed that the 10-25% default number sometimes will generate scewed statistics especially when the cardinality is not very selective on the columns.

Hope it helps,

Brian Bare, MCDBA/MCSE

Thanks,

Brian Bare

Antares686 SSC Guru Points: 125444 More actions · Answer 13

They are stored in the order they were entered with some exceptions but they are considered in heap as no deciding factor determines sort. The index actualy sroes the page number, file group and slot number of the record to find it.

Now I am curious exactly what your queries were and what the structure of you table is. Can you please post as this will bennifit us with details.

ray_higdon SSCarpal Tunnel Points: 4589 More actions · Answer 14

ray_higdon

SSCarpal Tunnel

Points: 4589

March 25, 2004 at 3:42 am

#500386

I agree with you antares unless there has been deletes on the table as heaps reclaim empty space

------------

Ray Higdon MCSE, MCDBA, CCNA