how long it takes to retrieve top 10000 records

Question

Post reply

how long it takes to retrieve top 10000 records

krishna.netblogs

SSC Veteran

Points: 244
More actions
July 24, 2012 at 11:25 pm

#278478

Hi All,

I have a table Tbl1 with the below properties

having 200 lakhs records.

30 Columns

6 columns are configured with Non-Unique,Non-Clustered Index

1 Column with Clustered Index

I have to execute the below query from this table Tbl1

select * from Tbl1

So how long it takes to retrieve top 10000 records?

And is there any metric to determine the time to take to retrieve 'n' no.of records

With Regards

Krishna

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply

Ant-Green SSC Guru Points: 112965 More actions · Answer 1

Ant-Green

SSC Guru

Points: 112965

July 25, 2012 at 1:59 am

#1517203

No there is no metric, it takes as long as it takes.

Depends on access methods, server load, blocking, locking etc.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 2

krishna.netblogs (7/24/2012)

Hi All,

I have a table Tbl1 with the below properties

having 200 lakhs records.

30 Columns

6 columns are configured with Non-Unique,Non-Clustered Index

1 Column with Clustered Index

I have to execute the below query from this table Tbl1

select * from Tbl1

So how long it takes to retrieve top 10000 records?

And is there any metric to determine the time to take to retrieve 'n' no.of records

With Regards

Krishna

This looks very much like a homework question, in which case Anthony's reply is a good place to start.

Don't forget that TOP is meaningless without ORDER BY, and if the column(s) in the order by is/are not indexed, SQL Server will have to scan the whole table looking for the top 10000 rows. This is likely to be the most significant factor affecting the retrieval time. You should also consider the fate of the rows - where are they going to? If they are going to a network client then the network speed may be significant.

What else have you considered?

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Phil Parkin SSC Guru Points: 245813 More actions · Answer 3

ChrisM@Work (7/25/2012)

This looks very much like a homework question, in which case Anthony's reply is a good place to start.

Don't forget that TOP is meaningless without ORDER BY, and if the column(s) in the order by is/are not indexed, SQL Server will have to scan the whole table looking for the top 10000 rows. This is likely to be the most significant factor affecting the retrieval time. You should also consider the fate of the rows - where are they going to? If they are going to a network client then the network speed may be significant.

What else have you considered?

I've just done a quick bit of testing.

select top 10 * from

produces exactly the same results and execution plan as

select top 10 * from

order by [PK columns]

so perhaps it is not as meaningless as you suggest?

If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

Ant-Green SSC Guru Points: 112965 More actions · Answer 4

Ant-Green

SSC Guru

Points: 112965

July 25, 2012 at 5:41 am

#1517288

The best way to word it is TOP is non-deterministic in that you may not get the same results twice without enforcing order.

All depends what data the engine reads first.

"There is no order unless you enforce order"

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 5

Phil Parkin (7/25/2012)

ChrisM@Work (7/25/2012)

This looks very much like a homework question, in which case Anthony's reply is a good place to start.

Don't forget that TOP is meaningless without ORDER BY, and if the column(s) in the order by is/are not indexed, SQL Server will have to scan the whole table looking for the top 10000 rows. This is likely to be the most significant factor affecting the retrieval time. You should also consider the fate of the rows - where are they going to? If they are going to a network client then the network speed may be significant.

What else have you considered?

I've just done a quick bit of testing.

select top 10 * from

produces exactly the same results and execution plan as

select top 10 * from
order by [PK columns]

so perhaps it is not as meaningless as you suggest?

Top is meaningless in language without an explicit or implicit range or order. "Top of the mountain" has an implicit order: the part of the mountain which has the highest value in yards from sea level. The range or order to which TOP is applied in TSQL is explicit. From BOL "If the query includes an ORDER BY clause, the first expression rows, or expression percent of rows, ordered by the ORDER BY clause are returned. If the query has no ORDER BY clause, the order of the rows is arbitrary. " 😉

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Cadavre SSC-Forever Points: 41690 More actions · Answer 6

ChrisM@Work (7/25/2012)

Phil Parkin (7/25/2012)

ChrisM@Work (7/25/2012)

This looks very much like a homework question, in which case Anthony's reply is a good place to start.

Don't forget that TOP is meaningless without ORDER BY, and if the column(s) in the order by is/are not indexed, SQL Server will have to scan the whole table looking for the top 10000 rows. This is likely to be the most significant factor affecting the retrieval time. You should also consider the fate of the rows - where are they going to? If they are going to a network client then the network speed may be significant.

What else have you considered?

I've just done a quick bit of testing.

select top 10 * from

produces exactly the same results and execution plan as

select top 10 * from
order by [PK columns]

so perhaps it is not as meaningless as you suggest?

Top is meaningless in language without an explicit or implicit range or order. "Top of the mountain" has an implicit order: the part of the mountain which has the highest value in yards from sea level. The range or order to which TOP is applied in TSQL is explicit. From BOL "If the query includes an ORDER BY clause, the first expression rows, or expression percent of rows, ordered by the ORDER BY clause are returned. If the query has no ORDER BY clause, the order of the rows is arbitrary. " 😉

Here's a nice link to further back that up --> http://blogs.msdn.com/b/conor_cunningham_msft/archive/2008/08/27/no-seatbelt-expecting-order-without-order-by.aspx%5B/url%5D

Forever trying to learn
My blog - http://www.cadavre.co.uk/
For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Gazareth One Orange Chip Points: 27737 More actions · Answer 7

Phil Parkin (7/25/2012)

ChrisM@Work (7/25/2012)

This looks very much like a homework question, in which case Anthony's reply is a good place to start.

Don't forget that TOP is meaningless without ORDER BY, and if the column(s) in the order by is/are not indexed, SQL Server will have to scan the whole table looking for the top 10000 rows. This is likely to be the most significant factor affecting the retrieval time. You should also consider the fate of the rows - where are they going to? If they are going to a network client then the network speed may be significant.

What else have you considered?

I've just done a quick bit of testing.

select top 10 * from

produces exactly the same results and execution plan as

select top 10 * from
order by [PK columns]

so perhaps it is not as meaningless as you suggest?

Try that on a table where PK <> Clustered Key 😀

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 8

Gazareth (7/25/2012)

Phil Parkin (7/25/2012)

ChrisM@Work (7/25/2012)

This looks very much like a homework question, in which case Anthony's reply is a good place to start.

Don't forget that TOP is meaningless without ORDER BY, and if the column(s) in the order by is/are not indexed, SQL Server will have to scan the whole table looking for the top 10000 rows. This is likely to be the most significant factor affecting the retrieval time. You should also consider the fate of the rows - where are they going to? If they are going to a network client then the network speed may be significant.

What else have you considered?

I've just done a quick bit of testing.

select top 10 * from

produces exactly the same results and execution plan as

select top 10 * from
order by [PK columns]

so perhaps it is not as meaningless as you suggest?

Try that on a table where PK <> Clustered Key 😀

Exactly! 😉

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Phil Parkin SSC Guru Points: 245813 More actions · Answer 9

Nice article - thanks for the link. When I read absolutes (like 'meaningless') my brain subconsciously challenges them. I still think it was the wrong word 🙂

If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 10

Phil Parkin (7/25/2012)

Nice article - thanks for the link. When I read absolutes (like 'meaningless') my brain subconsciously challenges them. I still think it was the wrong word 🙂

Quite right too! - But I'll stick with the word and continue to use it in this context.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Gazareth One Orange Chip Points: 27737 More actions · Answer 11

ChrisM@Work (7/25/2012)

Phil Parkin (7/25/2012)

Nice article - thanks for the link. When I read absolutes (like 'meaningless') my brain subconsciously challenges them. I still think it was the wrong word 🙂

Quite right too! - But I'll stick with the word and continue to use it in this context.

Meaningless is probably not quite right, as TOP (n) means you know you'll only get n rows (unless you're using PERCENT, WITH TIES etc etc).

Exactly which (n) rows you'll get though, that's another matter...

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 12

Gazareth (7/25/2012)

ChrisM@Work (7/25/2012)

Phil Parkin (7/25/2012)

Nice article - thanks for the link. When I read absolutes (like 'meaningless') my brain subconsciously challenges them. I still think it was the wrong word 🙂

Quite right too! - But I'll stick with the word and continue to use it in this context.

Meaningless is probably not quite right, as TOP (n) means you know you'll only get n rows (unless you're using PERCENT, WITH TIES etc etc).

Exactly which (n) rows you'll get though, that's another matter...

That's what TABLESAMPLE () is for.

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

Sean Lange SSC Guru Points: 286562 More actions · Answer 13

Phil Parkin (7/25/2012)

Nice article - thanks for the link. When I read absolutes (like 'meaningless') my brain subconsciously challenges them. I still think it was the wrong word 🙂

Maybe it could be reworded, albeit a bit more lengthy to something like: "Using top without an order by is like saying, Get me the first N rows that the engine feels is the easiest to retrieve". 🙂

So that would make it not meaningless and more accurate. The worst part is that it is most likely going to be consistent for a certain amount of data but will change at some point when sql decides it is easier to get some other rows.

_______________________________________________________________

Need help? Help us help you.

Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

derek.colley SSCrazy Eights Points: 8040 More actions · Answer 14

I would certainly get rid of the SELECT * and replace it with discrete column names.

This should simultaneously make it easier to profile/trace the query and map against performance, whilst minimising the data load brought back by the query.

I also suspect it removes the extra step of 'decoding' the * into column names.

I tested this just now against a 100,000-member single column integers table without a clustered index.

SELECT * FROM ... executed in 810ms.

SELECT number FROM ... executed in 722ms.

I flushed the procedure cache in between.

Plus of course if you're playing 'bait the DBA' just create a table with column name * 🙂

---

Note to developers:
CAST(SUBSTRING(CAST(FLOOR(NULLIF(ISNULL(COALESCE(1,NULL),NULL),NULL)) AS CHAR(1)),1,1) AS INT) == 1
So why complicate your code AND MAKE MY JOB HARDER??!:crazy:

Want to get the best help? Click here https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help (Jeff Moden)
My blog: http://uksqldba.blogspot.com
Visit http://www.DerekColley.co.uk to find out more about me.