how long it takes to retrieve top 10000 records

  • Hi All,

    I have a table Tbl1 with the below properties

    having 200 lakhs records.

    30 Columns

    6 columns are configured with Non-Unique,Non-Clustered Index

    1 Column with Clustered Index

    I have to execute the below query from this table Tbl1

    select * from Tbl1

    So how long it takes to retrieve top 10000 records?

    And is there any metric to determine the time to take to retrieve 'n' no.of records

    With Regards

    Krishna

  • No there is no metric, it takes as long as it takes.

    Depends on access methods, server load, blocking, locking etc.

  • krishna.netblogs (7/24/2012)


    Hi All,

    I have a table Tbl1 with the below properties

    having 200 lakhs records.

    30 Columns

    6 columns are configured with Non-Unique,Non-Clustered Index

    1 Column with Clustered Index

    I have to execute the below query from this table Tbl1

    select * from Tbl1

    So how long it takes to retrieve top 10000 records?

    And is there any metric to determine the time to take to retrieve 'n' no.of records

    With Regards

    Krishna

    This looks very much like a homework question, in which case Anthony's reply is a good place to start.

    Don't forget that TOP is meaningless without ORDER BY, and if the column(s) in the order by is/are not indexed, SQL Server will have to scan the whole table looking for the top 10000 rows. This is likely to be the most significant factor affecting the retrieval time. You should also consider the fate of the rows - where are they going to? If they are going to a network client then the network speed may be significant.

    What else have you considered?

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (7/25/2012)


    This looks very much like a homework question, in which case Anthony's reply is a good place to start.

    Don't forget that TOP is meaningless without ORDER BY, and if the column(s) in the order by is/are not indexed, SQL Server will have to scan the whole table looking for the top 10000 rows. This is likely to be the most significant factor affecting the retrieval time. You should also consider the fate of the rows - where are they going to? If they are going to a network client then the network speed may be significant.

    What else have you considered?

    I've just done a quick bit of testing.

    select top 10 * from

    produces exactly the same results and execution plan as

    select top 10 * from

    order by [PK columns]

    so perhaps it is not as meaningless as you suggest?

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • The best way to word it is TOP is non-deterministic in that you may not get the same results twice without enforcing order.

    All depends what data the engine reads first.

    "There is no order unless you enforce order"

  • Phil Parkin (7/25/2012)


    ChrisM@Work (7/25/2012)


    This looks very much like a homework question, in which case Anthony's reply is a good place to start.

    Don't forget that TOP is meaningless without ORDER BY, and if the column(s) in the order by is/are not indexed, SQL Server will have to scan the whole table looking for the top 10000 rows. This is likely to be the most significant factor affecting the retrieval time. You should also consider the fate of the rows - where are they going to? If they are going to a network client then the network speed may be significant.

    What else have you considered?

    I've just done a quick bit of testing.

    select top 10 * from

    produces exactly the same results and execution plan as

    select top 10 * from

    order by [PK columns]

    so perhaps it is not as meaningless as you suggest?

    Top is meaningless in language without an explicit or implicit range or order. "Top of the mountain" has an implicit order: the part of the mountain which has the highest value in yards from sea level. The range or order to which TOP is applied in TSQL is explicit. From BOL "If the query includes an ORDER BY clause, the first expression rows, or expression percent of rows, ordered by the ORDER BY clause are returned. If the query has no ORDER BY clause, the order of the rows is arbitrary. " πŸ˜‰

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (7/25/2012)


    Phil Parkin (7/25/2012)


    ChrisM@Work (7/25/2012)


    This looks very much like a homework question, in which case Anthony's reply is a good place to start.

    Don't forget that TOP is meaningless without ORDER BY, and if the column(s) in the order by is/are not indexed, SQL Server will have to scan the whole table looking for the top 10000 rows. This is likely to be the most significant factor affecting the retrieval time. You should also consider the fate of the rows - where are they going to? If they are going to a network client then the network speed may be significant.

    What else have you considered?

    I've just done a quick bit of testing.

    select top 10 * from

    produces exactly the same results and execution plan as

    select top 10 * from

    order by [PK columns]

    so perhaps it is not as meaningless as you suggest?

    Top is meaningless in language without an explicit or implicit range or order. "Top of the mountain" has an implicit order: the part of the mountain which has the highest value in yards from sea level. The range or order to which TOP is applied in TSQL is explicit. From BOL "If the query includes an ORDER BY clause, the first expression rows, or expression percent of rows, ordered by the ORDER BY clause are returned. If the query has no ORDER BY clause, the order of the rows is arbitrary. " πŸ˜‰

    Here's a nice link to further back that up --> http://blogs.msdn.com/b/conor_cunningham_msft/archive/2008/08/27/no-seatbelt-expecting-order-without-order-by.aspx%5B/url%5D


    Forever trying to learn
    My blog - http://www.cadavre.co.uk/
    For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
    For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Phil Parkin (7/25/2012)


    ChrisM@Work (7/25/2012)


    This looks very much like a homework question, in which case Anthony's reply is a good place to start.

    Don't forget that TOP is meaningless without ORDER BY, and if the column(s) in the order by is/are not indexed, SQL Server will have to scan the whole table looking for the top 10000 rows. This is likely to be the most significant factor affecting the retrieval time. You should also consider the fate of the rows - where are they going to? If they are going to a network client then the network speed may be significant.

    What else have you considered?

    I've just done a quick bit of testing.

    select top 10 * from

    produces exactly the same results and execution plan as

    select top 10 * from

    order by [PK columns]

    so perhaps it is not as meaningless as you suggest?

    Try that on a table where PK <> Clustered Key πŸ˜€

  • Gazareth (7/25/2012)


    Phil Parkin (7/25/2012)


    ChrisM@Work (7/25/2012)


    This looks very much like a homework question, in which case Anthony's reply is a good place to start.

    Don't forget that TOP is meaningless without ORDER BY, and if the column(s) in the order by is/are not indexed, SQL Server will have to scan the whole table looking for the top 10000 rows. This is likely to be the most significant factor affecting the retrieval time. You should also consider the fate of the rows - where are they going to? If they are going to a network client then the network speed may be significant.

    What else have you considered?

    I've just done a quick bit of testing.

    select top 10 * from

    produces exactly the same results and execution plan as

    select top 10 * from

    order by [PK columns]

    so perhaps it is not as meaningless as you suggest?

    Try that on a table where PK <> Clustered Key πŸ˜€

    Exactly! πŸ˜‰

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Nice article - thanks for the link. When I read absolutes (like 'meaningless') my brain subconsciously challenges them. I still think it was the wrong word πŸ™‚

    If you haven't even tried to resolve your issue, please don't expect the hard-working volunteers here to waste their time providing links to answers which you could easily have found yourself.

  • Phil Parkin (7/25/2012)


    Nice article - thanks for the link. When I read absolutes (like 'meaningless') my brain subconsciously challenges them. I still think it was the wrong word πŸ™‚

    Quite right too! - But I'll stick with the word and continue to use it in this context.

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (7/25/2012)


    Phil Parkin (7/25/2012)


    Nice article - thanks for the link. When I read absolutes (like 'meaningless') my brain subconsciously challenges them. I still think it was the wrong word πŸ™‚

    Quite right too! - But I'll stick with the word and continue to use it in this context.

    Meaningless is probably not quite right, as TOP (n) means you know you'll only get n rows (unless you're using PERCENT, WITH TIES etc etc).

    Exactly which (n) rows you'll get though, that's another matter...

  • Gazareth (7/25/2012)


    ChrisM@Work (7/25/2012)


    Phil Parkin (7/25/2012)


    Nice article - thanks for the link. When I read absolutes (like 'meaningless') my brain subconsciously challenges them. I still think it was the wrong word πŸ™‚

    Quite right too! - But I'll stick with the word and continue to use it in this context.

    Meaningless is probably not quite right, as TOP (n) means you know you'll only get n rows (unless you're using PERCENT, WITH TIES etc etc).

    Exactly which (n) rows you'll get though, that's another matter...

    That's what TABLESAMPLE () is for.

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Phil Parkin (7/25/2012)


    Nice article - thanks for the link. When I read absolutes (like 'meaningless') my brain subconsciously challenges them. I still think it was the wrong word πŸ™‚

    Maybe it could be reworded, albeit a bit more lengthy to something like: "Using top without an order by is like saying, Get me the first N rows that the engine feels is the easiest to retrieve". πŸ™‚

    So that would make it not meaningless and more accurate. The worst part is that it is most likely going to be consistent for a certain amount of data but will change at some point when sql decides it is easier to get some other rows.

    _______________________________________________________________

    Need help? Help us help you.

    Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.

    Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.

    Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
    Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
    Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
    Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/

  • I would certainly get rid of the SELECT * and replace it with discrete column names.

    This should simultaneously make it easier to profile/trace the query and map against performance, whilst minimising the data load brought back by the query.

    I also suspect it removes the extra step of 'decoding' the * into column names.

    I tested this just now against a 100,000-member single column integers table without a clustered index.

    SELECT * FROM ... executed in 810ms.

    SELECT number FROM ... executed in 722ms.

    I flushed the procedure cache in between.

    Plus of course if you're playing 'bait the DBA' just create a table with column name * πŸ™‚

    ---

    Note to developers:
    CAST(SUBSTRING(CAST(FLOOR(NULLIF(ISNULL(COALESCE(1,NULL),NULL),NULL)) AS CHAR(1)),1,1) AS INT) == 1
    So why complicate your code AND MAKE MY JOB HARDER??!:crazy:

    Want to get the best help? Click here https://www.sqlservercentral.com/articles/forum-etiquette-how-to-post-datacode-on-a-forum-to-get-the-best-help (Jeff Moden)
    My blog: http://uksqldba.blogspot.com
    Visit http://www.DerekColley.co.uk to find out more about me.

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply