SELECT TOP

  • Yes, the answer is "Ann". But when you say most of the time we would get “Ann” since the clustered index is created on “first name”.

    Why can’t we say it is always return the “Ann”?

    I did execute the query more than 10 times, all the time the result was “Ann”.

    Just trying to understand, in which scenario SQL Server return different result than “Ann”?

    [font="Calibri"]Raj[/font]
  • dfine (12/22/2010)


    Yes, the answer is "Ann". But when you say most of the time we would get “Ann” since the clustered index is created on “first name”.

    Why can’t we say it is always return the “Ann”?

    I did execute the query more than 10 times, all the time the result was “Ann”.

    Just trying to understand, in which scenario SQL Server return different result than “Ann”?

    For this small result set with simple key it will probably always return 'Ann'.

    But when you are talking about very large tables with a clustered index, the output might not be predictable.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Ninja's_RGR'us (12/21/2010)


    So which was better performance wise? maxdop 1 or 0?

    Yes I know the results are wrong without the order by!

    well for ad-hoc examination it was much better to leave off the order-by and remember that the results had sub-sections in nondeterministic order.

    otherwise the answer is the typical "it depends" - on tradeoff between sorting a huge flat resultset and the time it takes to select each part with multiple CPU.

  • I got the question right, assuming that the submitter was unaware of the fact that there is no guarantee for this behaviour. I was then surprised to see in the explanation that he is in fact aware of this, and still marked Ann as the correct answer. I don't get this - basically, the explanation says that the correct answer is "undetermined", but that answer is not given.

    A weak question.

    ronmoses (12/21/2010)


    For educational purposes, I would genuinely appreciate it if one of the folks who take issue with the "most of the time" factor could illustrate a scenario in which that script returns different results.

    Thanks!

    Ron

    Read this blog post I wrote a few years ago.

    For this specific scenario, without any changes to the code, the data, etc, I don't think we can make this script return different results. But Microsoft can. Since they didn't document or otherwise guarantee this behaviour, future versions of SQL Server, or even service packs and hotfixes, may change this behaviour. Remember what happened to all those views that abused the pre-SQL2005 "TOP 100 PERCENT ... ORDER BY" behaviour? And what about GROUP BY without ORDER BY in SQL 6.5 and older?


    Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
    Visit my SQL Server blog: https://sqlserverfast.com/blog/
    SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

  • mtassin (12/21/2010)


    Always use the top with an ORDER BY clause!

    I got the question correct, but this is the last sentance of the answer's explanation.

    Why?

    Why should I always use TOP with an order by clause? What benefit do I get with say TOP 100% when I decide i want all the records to come back, just ordered?

    I don't think anyone gave a direct answer to this (although someone else pointed out the same ambiguous wording.) The author meant you should always use top with an order by clause. In other words, don't use top without an order by clause. The author was not saying you should always use top when you are using an order by clause.

  • Hugo Kornelis (12/23/2010)


    I got the question right, assuming that the submitter was unaware of the fact that there is no guarantee for this behaviour. I was then surprised to see in the explanation that he is in fact aware of this, and still marked Ann as the correct answer. I don't get this - basically, the explanation says that the correct answer is "undetermined", but that answer is not given.

    A weak question.

    I have exactly the same opinion. Ann was the only reasonable option available, and I was expecting to have to mention that it wouldn't be guaranteed, but then he does it himself.

  • mtassin (12/21/2010)


    These days I begin to wonder if the optimizer doesn't just return them in clustered index order when the ORDER BY clause is omitted (and a clustered index is present), and Microsoft lists in BOL that the order is arbitrary so that if they need to change the optimizer for some reason with a service pack, they can say "We told you it was arbitrary".

    No, the optimizer (and storage engine) can choose whatever access path seems most efficient. For example:

    CREATE TABLE dbo.Example

    (

    row_id INTEGER IDENTITY NOT NULL PRIMARY KEY CLUSTERED,

    data AS CONVERT(VARCHAR(11), row_id) PERSISTED NOT NULL UNIQUE

    );

    GO

    INSERT dbo.Example

    DEFAULT VALUES;

    GO 10

    SELECT E.row_id, E.data

    FROM dbo.Example E;

    GO

    DROP TABLE dbo.Example;

    Even on a simple scan of the clustered index, there are a number of ways to show that ordering isn't guaranteed. Using parallelism is the most popular one, but this works too:

    CREATE TABLE dbo.Example

    (

    data BIGINT NOT NULL,

    padding CHAR(5000) NOT NULL DEFAULT ''

    );

    GO

    CREATE CLUSTERED INDEX c

    ON dbo.Example (data)

    WITH (ALLOW_ROW_LOCKS = OFF, ALLOW_PAGE_LOCKS = OFF);

    GO

    INSERT dbo.Example (data) VALUES (CHECKSUM(NEWID()));

    GO 64

    -- Clustered index scan, but not in clustered index order!

    SELECT TOP (10)

    data

    FROM dbo.Example

    GO

    DROP TABLE dbo.Example;

  • Just to point out that this code:

    IF EXISTS (SELECT * FROM tempdb..sysobjects WHERE name like '#Customer%')

    DROP TABLE #Customer

    is NOT the proper way to test for the existence of a temporary table (create the temporary table in a different connection, and you'll get an error when you run this code!)

    The proper way to test for the existence of a temporary table is:

    IF OBJECT_ID('tempdb..#Customer') IS NOT NULL

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • ronmoses (12/21/2010)


    For educational purposes, I would genuinely appreciate it if one of the folks who take issue with the "most of the time" factor could illustrate a scenario in which that script returns different results.

    Thanks!

    Ron

    Does this suffice?

    IF OBJECT_ID('tempdb..#Test') IS NOT NULL DROP TABLE #Test;

    CREATE TABLE #Test (RowID INT PRIMARY KEY CLUSTERED);

    ;WITH

    TENS (N) AS (SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL

    SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL

    SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0 UNION ALL SELECT 0),

    THOUSANDS (N) AS (SELECT 1 FROM TENS t1 CROSS JOIN TENS t2 CROSS JOIN TENS t3),

    MILLIONS (N) AS (SELECT 1 FROM THOUSANDS t1 CROSS JOIN THOUSANDS t2),

    TALLY (N) AS (SELECT ROW_NUMBER() OVER (ORDER BY (SELECT 0)) FROM MILLIONS)

    INSERT INTO #Test

    SELECT N FROM TALLY;

    SELECT TOP (5) * FROM #Test WHERE RowID > 5000;

    SELECT TOP (5) * FROM #Test WHERE RowID > 7000;

    SELECT TOP (5) * FROM #Test WHERE RowID > 9000;

    SELECT * FROM #Test;

    IF OBJECT_ID('tempdb..#Test') IS NOT NULL DROP TABLE #Test;

    On my system, the last select starts with 328417.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • WayneS

    On my system, the last select starts with 328417.

    Sorry Wayne,

    On my system I get:

    SELECT TOP (5) * FROM #Test WHERE RowID > 5000; - 5001

    SELECT TOP (5) * FROM #Test WHERE RowID > 7000; - 7001

    SELECT TOP (5) * FROM #Test WHERE RowID > 9000; - 9001

    SELECT * FROM #Test; - 1

    Steve Jimmo
    Sr DBA
    “If we ever forget that we are One Nation Under God, then we will be a Nation gone under." - Ronald Reagan

  • sjimmo (12/29/2010)


    WayneS

    On my system, the last select starts with 328417.

    Sorry Wayne,

    On my system I get:

    SELECT TOP (5) * FROM #Test WHERE RowID > 5000; - 5001

    SELECT TOP (5) * FROM #Test WHERE RowID > 7000; - 7001

    SELECT TOP (5) * FROM #Test WHERE RowID > 9000; - 9001

    SELECT * FROM #Test; - 1

    And on the one I just tested it on (SQL 2008 Express Edition) I got 283633. Interestingly, second time I got 915585--seems the outcome is largely random!

  • WayneS (12/29/2010)


    Just to point out that this code:

    IF EXISTS (SELECT * FROM tempdb..sysobjects WHERE name like '#Customer%')

    DROP TABLE #Customer

    is NOT the proper way to test for the existence of a temporary table (create the temporary table in a different connection, and you'll get an error when you run this code!)

    The proper way to test for the existence of a temporary table is:

    IF OBJECT_ID('tempdb..#Customer') IS NOT NULL

    Better still...?

    IF OBJECT_ID(N'tempdb..#Customer', N'U') IS NOT NULL

    (The original form would return an object id if there were a temporary procedure called #Customer)

  • paul.knibbs (12/29/2010)


    And on the one I just tested it on (SQL 2008 Express Edition) I got 283633. Interestingly, second time I got 915585--seems the outcome is largely random!

    I assume you are both getting plans involving parallelism?

    http://blogs.msdn.com/b/conor_cunningham_msft/archive/2008/08/27/no-seatbelt-expecting-order-without-order-by.aspx

  • The correct answer should be "it depends". As already stated, there are no guarantees even it seem to work in some cases, but with larger tables, parallell plans, many users and enterprise edition (advanced read ahead) you can't rely on the order of the result.

    ALWAYS use the ORDER BY clause if you depend on the order.

    /Håkan Winther
    MCITP:Database Developer 2008
    MCTS: SQL Server 2008, Implementation and Maintenance
    MCSE: Data Platform

  • hakan.winther (8/8/2011)


    The correct answer should be "it depends". As already stated, there are no guarantees even it seem to work in some cases, but with larger tables, parallell plans, many users and enterprise edition (advanced read ahead) you can't rely on the order of the result.

    ALWAYS use the ORDER BY clause if you depend on the order.

    I think your opinion is little wrong.

    Result of order depend on method to get result

    1. Index seek

    in this case relation engine use ROOT PAGE to start search. And result will be sort

    Little trick for force seek use hint (ForceSEEK) and add search predicate in WHERE clause

    2. Index scan

    If index scan is used then relation engine get leaf level pages in order that is stored in Index Allocation Map This order may be any 😉 and depend on free pages allocation at the moment of creation index

    Please get DBCC IND, DBCC PAGE into your hands and verify this.

    MS SQL 2008 MCITP x 3
    MS SQL 2012 MCSE x 2

Viewing 15 posts - 31 through 45 (of 50 total)

You must be logged in to reply to this topic. Login to reply