How are records sorted without an order by clause?

  • We have a client that claims their hosted app will not work correctly because the data is not returned in the correct order when they query it. Their DBA is not the sharpest knife in the drawer and refuses to consider the use of an ORDER BY clause to sort records. They simply want the records sorted correctly by default. I run the following query and view the execution plan:

    SELECT * FROM ITEMCHG

    WHERE RES = 001673

    The index used is called: IX_ITEMCHG_RES and here is the create statement

    CREATE INDEX [IX_ITEMCHG_RES] ON [dbo].[ITEMCHG]([RES]) WITH FILLFACTOR = 90 ON [PRIMARY]

    GO

    There is also a clustered index on the table for their primary key column.

    Now here is the kicker. On their production server when I run the above query the results are returned with no apparent order. The client claims their app is breaking because the results are not ordered by the primary key ASC. Again, using an order by clause is not an option for some reason.

    However, when they restore the backup of this database to another server the results ARE ordered by primary key ASC. I don't understand why there is a difference and am curious as to why. The "DBA" claims the issue is related to collation or a bad SQL Server 2000 install or something. I say just use an ORDER BY clause and be done with it but am curious as to what could cause the difference in ordering. Any ideas?

    Thanks,

    Chris

  • The order of a result set can only be guaranteed by using an ORDER BY. Although You might be able to get it to "work" again by rebuilding the indexes, indexes do not guarantee the results (see Jeff Moden's article for an example: http://qa.sqlservercentral.com/articles/Advanced+Querying/61716/), and using index hints doesn't allow the optimizer to determine the best way to run the query (your index might not always be the best one to use).

    If the results have to be in order, you need to either use an order by or sort the results on the client in whatever form of resultset receives the data. I don't know of anything else that will *guarantee* that the results are returned in order.

  • I have to agree with Chad in saying that the only way to gauruntee that the results are returned in a specified order is to use an order by clause, and it seems that this is the correct way to do it.

    However, I will point out that if you do a select into statement and then select from the newly created table, I have never seen a case where this fails to come back in the specified order. In others, what could be done behind the scenes is:

    Select *

    into table2

    from table1

    order by WhateverOrder

    drop table1

    sp_rename 'table2', 'table1'

    I would not recommend that for a lot of reasons though. I agree that the order by clause is the way to go if that is an option.

    ---
    Timothy A Wiseman
    SQL Blog: http://timothyawiseman.wordpress.com/

  • Sorry... I overlooked the part about the "client's hosted app"... does that mean it's a 3rd party app that the client cannot change?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff, I must be missing something. Why would you want to do the sort in the gui instead of letting the server do that?

    I do understand why you would not want to do it in a view, but I do not see why the gui is superior to doing it in a stored procedure?

    ---
    Timothy A Wiseman
    SQL Blog: http://timothyawiseman.wordpress.com/

  • The client's DBA is a moron. He is free to recommend modifications that would require the application to use a SQL query using an ORDER BY clause or he is free to recommend that their in house application do client side sorting to address the problem. Yet because some server somewhere is returning the results in the order he and his application expects he claims that the problem with the query results not being returned in the order he expects is a problem with our hosted sql server instance.

  • here's what the server guarantees regarding ordering:

    http://blogs.msdn.com/sqltips/archive/2005/07/20/441053.aspx%5B/url%5D

    ---------------------------------------
    elsasoft.org

  • Sorry... I put some bum dope out :blush: and deleted this entry.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • However, even if the index hint is specified, there are ways to mess up the pseudo-ordering.

    Parallelism (parallel scans and the subsequent repartition/merge streams return the data in no particular order)

    Read uncommitted isolation level (as the storage engine has 2 options in this isolation level it doesn't have in any other - allocation order scans and merry-go-round scans)

    Hash joins/hash aggregates (if applicable)

    btw, Jeff, that's why the attempts to get a merry-go-round in your update article failed. Because it's an update it cannot be read uncommitted.

    Also, while forcing an index does have the desired effect in this version of SQL (and it works for NC indexes too), there's no official guarantee that it will do so after the next service pack/next version. Essentially, it's an undocumented feature.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • I recently read an article where Itzik Ben-Gan showed that a clustered index does not guarantee the order of records for a SELECT. It had to do with an additional INSERT if I am not mistaken.

    I'll try to find it again.


    N 56°04'39.16"
    E 12°55'05.25"

  • For proof of what read-uncommitted does to ordering...

    (borrowed and slightly modified from WesleyB's old blog

    CREATE TABLE tblClustered

    (ID int,

    MyDate smalldatetime,

    TestField char(2000))

    GO

    CREATE CLUSTERED INDEX ixID ON tblClustered (ID, MyDate)

    GO

    DECLARE @i int

    SET @i = 0

    WHILE @i < 1000

    BEGIN

    INSERT INTO tblClustered (ID, MyDate, TestField) VALUES (RAND() * 1000, CONVERT(varchar, getdate(), 112), REPLICATE('T', 50))

    SET @i = @i + 1

    END

    SELECT * FROM tblClustered with (index = 1, nolock)

    SELECT * FROM tblClustered with (index = 1)

    DROP TABLE tblClustered

    And let me see if I can get a parallelism one to work... I usually struggle to get the optimiser to pick a parallel plan

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • http://www.sqlmag.com/Article/ArticleID/92888/sql_server_92888.html

    Itzik Ben-Gan

    Creating a clustered index on a table does not guarantee that the

    data is stored in the file in index key order. Data movement caused

    by page splits, changing index key values, and expanding dynamic

    columns generate logical fragmentation. When you create or rebuild

    an index on an existing table, SQL Server will make effort to create

    it in a contiguous manner (least amount of fragmentation), but there

    are no guarantees.


    N 56°04'39.16"
    E 12°55'05.25"

  • I have posted examples like this on SQLTeam several times that shows a select from a table with a clustered index that returns results not in cluster index order. If you want your query results in order, use an ORDER BY.

    create table #t (number int primary key clustered)

    insert into #t (number)

    select

    number

    from

    -- Number Table Function available here

    --http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=47685

    F_TABLE_NUMBER_RANGE(500,1000)

    order by

    number

    insert into #t (number)

    select

    number

    from

    F_TABLE_NUMBER_RANGE(100,199)

    order by

    number

    insert into #t (number)

    select

    number

    from

    F_TABLE_NUMBER_RANGE(1,50)

    order by

    number

    insert into #t (number)

    select

    number

    from

    F_TABLE_NUMBER_RANGE(300,499)

    order by

    number

    select * from #t

    Out of order results returned:

    (501 row(s) affected)

    (100 row(s) affected)

    (50 row(s) affected)

    (200 row(s) affected)

    number

    -----------

    500

    501

    502

    ... rows 503-995 omitted

    996

    997

    998

    999

    1000

    1

    2

    3

    4

    ... rows 5-45 omitted

    46

    47

    48

    49

    50

    100

    101

    102

    103

    104

    ... rows 105-194 omitted

    195

    196

    197

    198

    199

    300

    301

    302

    303

    304

    305

    ... rows 306-494 omitted

    495

    496

    497

    498

    499

    (851 row(s) affected)

  • http://www.sqlmag.com/Article/ArticleID/92888/sql_server_92888.html

    Which is what the test with the read uncommitted shows.

    If the isolation level is read committed or higher, and SQL has to scan the index it will scan in index page order. Essentially it will find the first page of the index leaf (the pages that stores the lowest index key value) and then will read along the index following the next-page pointer.

    The result of that is that the data will be returned from that in order of the index key. What happens to the data later during the query processing is another matter.

    In read uncommitted, the storage engine has an extra option. It can use an allocation order scan. That means it doesn't start with the page with the lowest index key, rather it starts with the page earliest in the data file (the one with the lowest PageID). SQL uses the IAM to find the pages that exist in the index and it reads them in the allocation order in the file, completely ignoring the index order.

    This is why (in certain cases) a read-uncommitted index scan can miss or double read pages.

    If the index is fragmented, then the order of pages in the file doesn't match the order order of page in the index and hence the data comes back 'unordered'

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • timothyawiseman (9/5/2008)


    Jeff, I must be missing something. Why would you want to do the sort in the gui instead of letting the server do that?

    I do understand why you would not want to do it in a view, but I do not see why the gui is superior to doing it in a stored procedure?

    Heh... I didn't say the GUI would be superior for sorting and, in fact, it frequently isn't superior. This type of question is always a very difficult question to answer on such limited information. My belief is that the DBA is trying to do the right thing, but may also not be taking other things into consideration...

    The problem the DBA has implied is, how would you indicate which sort to do?

    Hard coded - Very effective but is like adding ORDER BY to a View... it either limits what it can be used for to a rather specialized scope or the output has to be resorted to get a different sort or you have to have 1 sproc for sort order.

    Dynamically defined - Also very effective, but either requires some dynamic SQL or some decision making in the Order By. Either way, it'll be anywhere from "almost as fast" as the hardcoded method to dreadfully slow with a full recompile every time it's called.

    Regardless, both methods put an extra load on the server as sorting is a pretty expensive operation and that's probably why the DBA has dug his/her heels in. If the GUI or app makes a lot of calls to the server for sorted data, you end up with a lot of resource usage just sorting data. So, if you can off-load some of that to the client, then why not do it in the app?

    Do I think the DBA or the OP is correct in their argument? Dunno... not enough information on scalable performance COST to the server or the COST in dollars to fix, test, and redeploy the app. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply