Certain number of rows before SQL will use an index?

  • Am I losing the plot or...

    Is there a row count threshold that has to be passed before SQL Server will start using indexes (clustered or non-clustered) rather than a table scan?

    e.g. if the table has, say, 100 rows, will it still use table scans despite there being indexes present?

    thanks simon

  • You have to ask the Query Optimizer 😛

    For small tables (like 100 rows), it may decide it's quicker and cheaper to scan the whole table.

    For big tables, depending on the query, it could use indexes or a table scan.

    The answer is: it depends.

    SQLServerNewbie MCITP: Database Administrator SQL Server 2005
  • Depends on the table, depends on the query, depends on the index.

    e.g. if the table has, say, 100 rows, will it still use table scans despite there being indexes present?

    If all 100 rows are in one data page, which is faster, to scan the entire table (1 page) or to use the index to seek (minimum 2 pages)?
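
    To make that concrete, one way to check how many pages a table actually occupies (a minimal sketch; the table name is a placeholder) is the physical-stats DMV:

    -- Hypothetical table name; available on SQL Server 2005 and later.
    -- index_id 0 = heap, 1 = clustered index; page_count is what a scan would read.
    SELECT index_id, index_level, page_count
    FROM sys.dm_db_index_physical_stats(
        DB_ID(), OBJECT_ID('dbo.MyTable'), NULL, NULL, 'DETAILED');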

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Just to add to what Jerry already said...

    If you are using columns in your SELECT statement that are not part of your index in one way or another, the optimizer needs to do a bookmark lookup, and with small tables like that one a table scan would be faster.

    I.e. say you've got the following:

    Create Table #myUsers (
        username varchar(25) Primary Key,
        FName Varchar(30),
        LName Varchar(50)
    )

    And the only index you have is the PK on username, but your query is something like:

    Select FName, LName
    FROM #MyUsers
    WHERE username = 'bgates'

    A table scan would be cheaper than seeking the index and then using the bookmark to fetch the rest of the row's data, depending on the number of rows and 20 other factors.
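
    One common way around that lookup (a sketch, not part of the original post; the index name is hypothetical) is to make a nonclustered index covering with INCLUDE, so the query can be answered from the index alone:

    -- Hypothetical covering index; assumes the #myUsers table above with a
    -- nonclustered PK, since a clustered PK would already carry every column.
    CREATE NONCLUSTERED INDEX IX_myUsers_covering
        ON #myUsers (username)
        INCLUDE (FName, LName);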

    There is no magic number at which the optimizer will prefer one way vs. another; it takes into account all manner of things, including your stats and even the available memory and/or CPU cycles. If it can do the operation two ways, one taking more memory and one taking more CPU cycles, and there is less contention for CPU than for memory, it will use the plan that takes more CPU and less memory. The next time it runs, the opposite may happen.

    As Jerry said, It Depends and as always YMMV.

    -Luke.


  • thank you all very much

    food for thought ...

    best wishes

    ~simon

  • I thought if a table was less than 8 pages, it would never use an index...

  • Look up the INDEX= hint for the FROM clause in BOL. You can force a query to use a particular index. Then you can compare the query cost (show actual execution plan) and the I/Os required (set statistics IO on) for each flavor of the query:

    select *
    from mytable WITH (INDEX=0) -- force table scan, which is probably unnecessary for small tables

    select *
    from mytable WITH (INDEX=myNCindex) -- force index usage

    Ballpark, the optimizer will switch to a scan when it estimates you are asking for somewhere between 8 and 15% of the total number of rows in the table.
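
    For instance, a minimal harness for that comparison (reusing Kevin's placeholder table and index names) might look like this:

    -- Compare the 'logical reads' figure in the Messages tab for each run.
    SET STATISTICS IO ON;

    select * from mytable WITH (INDEX=0);         -- forced table scan
    select * from mytable WITH (INDEX=myNCindex); -- forced index usage

    SET STATISTICS IO OFF;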

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Mike Durnwald (12/9/2008)


    I thought if a table was less than 8 pages, it would never use an index...

    I had heard 1,000 rows, but it would appear from the knowledgeable answers in this thread that's wrong! 😀

  • The optimizer is cost-based. An index seek/bookmark lookup of a given depth takes X cost per row. A table (or clustered-index range) scan takes Y cost per page. Based on estimated rows retrieved it is pretty simple math to determine in this case which cost is less. Note that X does NOT equal Y here. For those inquisitive readers, Joe Chang has some amazing details available online from previous work he has done.
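
    As a worked illustration only (the constants below are invented for the example, not the optimizer's actual cost figures), the break-even point falls where rows * X equals pages * Y:

    -- Invented costs: @X = cost per row via seek + bookmark lookup,
    -- @Y = cost per page scanned. A scan costs @pages * @Y no matter how
    -- many rows qualify; the seek plan costs estimated_rows * @X.
    DECLARE @X float, @Y float, @pages int;
    SELECT @X = 0.003, @Y = 0.001, @pages = 10000;

    SELECT break_even_rows = (@pages * @Y) / @X; -- ~3333 rows with these made-up numbers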

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (12/9/2008)


    The optimizer is cost-based. An index seek/bookmark lookup of a given depth takes X cost per row. A table (or clustered-index range) scan takes Y cost per page. Based on estimated rows retrieved it is pretty simple math to determine in this case which cost is less. Note that X does NOT equal Y here. For those inquisitive readers, Joe Chang has some amazing details available online from previous work he has done.

    thank you very much 🙂

    simon

  • TheSQLGuru (12/9/2008)


    Ballpark, the optimizer will switch to a scan when it estimates you are asking for somewhere between 8 and 15% of the total number of rows in the table.

    Depends on the table. I've had it switch to a cluster/table scan with under 1% of the rows queried. With a non-covering index, of course.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • GilaMonster (12/9/2008)


    TheSQLGuru (12/9/2008)


    Ballpark, the optimizer will switch to a scan when it estimates you are asking for somewhere between 8 and 15% of the total number of rows in the table.

    Depends on the table. I've had it switch to a cluster/table scan with under 1% of the rows queried. With a non-covering index, of course.

    1) I wonder if that was actual rows returned or estimated?

    2) I also wonder about the size of the table, like 100 rows or 100M rows and did it have bloated data size?

    3) I should have specifically excluded covering-index situations. 🙂

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (12/9/2008)


    1) I wonder if that was actual rows returned or estimated?

    Estimated. The optimiser works with estimated row counts and that's what it uses for its costings and plan. It doesn't know, at optimisation time, what the actual row count is going to be and hence can't cost and optimise based on the actual row count.

    That said, in test cases that I've investigated with the stats up to date, the two are the same.
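
    One way to see both numbers side by side (a sketch; it uses the SeekOrScan test table defined in the repro further down) is STATISTICS PROFILE, which returns EstimateRows next to the actual Rows for each plan operator:

    SET STATISTICS PROFILE ON;
    SELECT * FROM dbo.SeekOrScan WHERE SomeNumber BETWEEN 0 AND 100;
    SET STATISTICS PROFILE OFF;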

    2) I also wonder about the size of the table, like 100 rows or 100M rows and did it have bloated data size?

    Small, large, doesn't seem to matter. Though the larger the table, the smaller the break point seems to be. Not a large shift in the % though.

    Some setup first:

    CREATE TABLE [dbo].[SeekOrScan](
        [ID] [int] IDENTITY(1,1) NOT NULL primary key,
        SomeNumber int,
        padding char(100)
    ) ON [PRIMARY]

    CREATE NONCLUSTERED INDEX [idx_SeekOrScan] ON [dbo].[SeekOrScan] (SomeNumber)

    First test - 1000 rows

    insert into [SeekOrScan] (SomeNumber)
    select top 1000 number
    from master..spt_values
    where name is null

    dbcc freeproccache
    select * from [SeekOrScan]
    where somenumber between 0 and 100 -- 10% of table
    -- Clustered index scan

    dbcc freeproccache
    select * from [SeekOrScan]
    where somenumber between 0 and 50 -- 5% of table
    -- Clustered index scan

    dbcc freeproccache
    select * from [SeekOrScan]
    where somenumber between 0 and 5 -- 0.6% of table
    -- Clustered index scan

    dbcc freeproccache
    select * from [SeekOrScan]
    where somenumber between 0 and 4 -- 0.5% of table
    -- nonclustered index seek

    Break point was around 0.5% of the total rows.

    Second test - 1,000,000 rows

    truncate table SeekOrScan

    insert into [SeekOrScan] (SomeNumber)
    select top 1000000 0
    from master..spt_values a cross join master..spt_values b
    where a.name is null and b.name is null

    update [SeekOrScan] set SomeNumber = ID -- just so we've got sequential numbers for the between
    GO

    dbcc freeproccache
    select * from [SeekOrScan]
    where somenumber between 1 and 100000 -- 10% of table
    -- Clustered index scan

    dbcc freeproccache
    select * from [SeekOrScan]
    where somenumber between 1 and 50000 -- 5% of table
    -- Clustered index scan

    dbcc freeproccache
    select * from [SeekOrScan]
    where somenumber between 1 and 10000 -- 1%
    -- Clustered index scan

    dbcc freeproccache
    select * from [SeekOrScan]
    where somenumber between 1 and 5000 -- 0.5% of table
    -- clustered index scan

    dbcc freeproccache
    select * from [SeekOrScan]
    where somenumber between 1 and 3000 -- 0.3% of table
    -- nonclustered index seek

    Breakpoint here was between 0.3 and 0.4% of the table.

    For what it's worth, I've never seen the breakpoint at 15%. I don't think I've ever seen it as high as 10%.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Thanks for the repro, Gail! Clearly either my memory is faulty (wouldn't be the first time, and it seems to be happening more and more since the birth of my daughter 3+ years ago :w00t:) or I am remembering metrics from a no-longer-current version of SQL Server.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service
