How are records sorted without an order by clause?

Question

Post reply

How are records sorted without an order by clause?

cbrinson

SSC Eights!

Points: 802
More actions
September 5, 2008 at 8:31 am

#192476

We have a client that claims their hosted app will not work correctly because the data is not returned in the correct order when they query it. Their DBA is not the sharpest knife in the drawer and refuses to consider the use of an ORDER BY clause to sort records. They simply want the records sorted correctly by default. I run the following query and view the execution plan:

SELECT * FROM ITEMCHG

WHERE RES = 001673

The index used is called: IX_ITEMCHG_RES and here is the create statement

CREATE INDEX [IX_ITEMCHG_RES] ON [dbo].[ITEMCHG]([RES]) WITH FILLFACTOR = 90 ON [PRIMARY]

GO

There is also a clustered index on the table for their primary key column.

Now here is the kicker. On their production server when I run the above query the results are returned with no apparent order. The client claims their app is breaking because the results are not ordered by the primary key ASC. Again, using an order by clause is not an option for some reason.

However, when they restore the backup of this database to another server the results ARE ordered by primary key ASC. I don't understand why there is a difference and am curious as to why. The "DBA" claims the issue is related to collation or a bad SQL Server 2000 install or something. I say just use an ORDER BY clause and be done with it but am curious as to what could cause the difference in ordering. Any ideas?

Thanks,

Chris

Viewing 15 posts - 1 through 15 (of 21 total)

You must be logged in to reply to this topic. Login to reply

Chad Crawford SSChampion Points: 11610 More actions · Answer 1

The order of a result set can only be guaranteed by using an ORDER BY. Although You might be able to get it to "work" again by rebuilding the indexes, indexes do not guarantee the results (see Jeff Moden's article for an example: http://qa.sqlservercentral.com/articles/Advanced+Querying/61716/), and using index hints doesn't allow the optimizer to determine the best way to run the query (your index might not always be the best one to use).

If the results have to be in order, you need to either use an order by or sort the results on the client in whatever form of resultset receives the data. I don't know of anything else that will *guarantee* that the results are returned in order.

TimothyAWiseman SSCrazy Eights Points: 8819 More actions · Answer 2

I have to agree with Chad in saying that the only way to gauruntee that the results are returned in a specified order is to use an order by clause, and it seems that this is the correct way to do it.

However, I will point out that if you do a select into statement and then select from the newly created table, I have never seen a case where this fails to come back in the specified order. In others, what could be done behind the scenes is:

Select *

into table2

from table1

order by WhateverOrder

drop table1

sp_rename 'table2', 'table1'

I would not recommend that for a lot of reasons though. I agree that the order by clause is the way to go if that is an option.

---
Timothy A Wiseman
SQL Blog: http://timothyawiseman.wordpress.com/

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 3

Sorry... I overlooked the part about the "client's hosted app"... does that mean it's a 3rd party app that the client cannot change?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

TimothyAWiseman SSCrazy Eights Points: 8819 More actions · Answer 4

TimothyAWiseman

SSCrazy Eights

Points: 8819

September 5, 2008 at 10:53 pm

#868111

Jeff, I must be missing something. Why would you want to do the sort in the gui instead of letting the server do that?

I do understand why you would not want to do it in a view, but I do not see why the gui is superior to doing it in a stored procedure?

---
Timothy A Wiseman
SQL Blog: http://timothyawiseman.wordpress.com/

cbrinson SSC Eights! Points: 802 More actions · Answer 5

The client's DBA is a moron. He is free to recommend modifications that would require the application to use a SQL query using an ORDER BY clause or he is free to recommend that their in house application do client side sorting to address the problem. Yet because some server somewhere is returning the results in the order he and his application expects he claims that the problem with the query results not being returned in the order he expects is a problem with our hosted sql server instance.

jezemine SSCrazy Eights Points: 8059 More actions · Answer 6

jezemine

SSCrazy Eights

Points: 8059

September 6, 2008 at 12:15 am

#868116

here's what the server guarantees regarding ordering:

http://blogs.msdn.com/sqltips/archive/2005/07/20/441053.aspx%5B/url%5D

---------------------------------------
elsasoft.org

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 7

Sorry... I put some bum dope out :blush: and deleted this entry.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 8

However, even if the index hint is specified, there are ways to mess up the pseudo-ordering.

Parallelism (parallel scans and the subsequent repartition/merge streams return the data in no particular order)

Read uncommitted isolation level (as the storage engine has 2 options in this isolation level it doesn't have in any other - allocation order scans and merry-go-round scans)

Hash joins/hash aggregates (if applicable)

btw, Jeff, that's why the attempts to get a merry-go-round in your update article failed. Because it's an update it cannot be read uncommitted.

Also, while forcing an index does have the desired effect in this version of SQL (and it works for NC indexes too), there's no official guarantee that it will do so after the next service pack/next version. Essentially, it's an undocumented feature.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

SwePeso SSC-Dedicated Points: 39725 More actions · Answer 9

SwePeso

SSC-Dedicated

Points: 39725

September 6, 2008 at 8:53 am

#868147

I recently read an article where Itzik Ben-Gan showed that a clustered index does not guarantee the order of records for a SELECT. It had to do with an additional INSERT if I am not mistaken.

I'll try to find it again.

N 56°04'39.16"
E 12°55'05.25"

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 10

For proof of what read-uncommitted does to ordering...

(borrowed and slightly modified from WesleyB's old blog

CREATE TABLE tblClustered



(ID int,

MyDate smalldatetime,

TestField char(2000))

GO

CREATE CLUSTERED INDEX ixID ON tblClustered (ID, MyDate)

GO

DECLARE @i int

SET @i = 0

WHILE @i < 1000

BEGIN

INSERT INTO tblClustered (ID, MyDate, TestField) VALUES (RAND() * 1000, CONVERT(varchar, getdate(), 112), REPLICATE('T', 50))

SET @i = @i + 1

END

SELECT * FROM tblClustered with (index = 1, nolock)

SELECT * FROM tblClustered with (index = 1)

DROP TABLE tblClustered

And let me see if I can get a parallelism one to work... I usually struggle to get the optimiser to pick a parallel plan

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

SwePeso SSC-Dedicated Points: 39725 More actions · Answer 11

http://www.sqlmag.com/Article/ArticleID/92888/sql_server_92888.html

Itzik Ben-Gan

Creating a clustered index on a table does not guarantee that the

data is stored in the file in index key order. Data movement caused

by page splits, changing index key values, and expanding dynamic

columns generate logical fragmentation. When you create or rebuild

an index on an existing table, SQL Server will make effort to create

it in a contiguous manner (least amount of fragmentation), but there

are no guarantees.

N 56°04'39.16"
E 12°55'05.25"

Michael Valentine Jones SSC Guru Points: 64818 More actions · Answer 12

I have posted examples like this on SQLTeam several times that shows a select from a table with a clustered index that returns results not in cluster index order. If you want your query results in order, use an ORDER BY.



create table #t (number int primary key clustered)

insert into #t (number)

select

	number

from

	-- Number Table Function available here

	--http://www.sqlteam.com/forums/topic.asp?TOPIC_ID=47685

	F_TABLE_NUMBER_RANGE(500,1000)

order by

	number

insert into #t (number)

select

	number

from

	F_TABLE_NUMBER_RANGE(100,199)

order by

	number

insert into #t (number)

select

	number

from

	F_TABLE_NUMBER_RANGE(1,50)

order by

	number

insert into #t (number)

select

	number

from

	F_TABLE_NUMBER_RANGE(300,499)

order by

	number

select * from #t

Out of order results returned:



(501 row(s) affected)

(100 row(s) affected)

(50 row(s) affected)

(200 row(s) affected)

number

-----------

500

501

502

... rows 503-995 omitted

996

997

998

999

1000

1

2

3

4

... rows 5-45 omitted

46

47

48

49

50

100

101

102

103

104

... rows 105-194 omitted

195

196

197

198

199

300

301

302

303

304

305

... rows 306-494 omitted

495

496

497

498

499

(851 row(s) affected)

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 13

http://www.sqlmag.com/Article/ArticleID/92888/sql_server_92888.html

Which is what the test with the read uncommitted shows.

If the isolation level is read committed or higher, and SQL has to scan the index it will scan in index page order. Essentially it will find the first page of the index leaf (the pages that stores the lowest index key value) and then will read along the index following the next-page pointer.

The result of that is that the data will be returned from that in order of the index key. What happens to the data later during the query processing is another matter.

In read uncommitted, the storage engine has an extra option. It can use an allocation order scan. That means it doesn't start with the page with the lowest index key, rather it starts with the page earliest in the data file (the one with the lowest PageID). SQL uses the IAM to find the pages that exist in the index and it reads them in the allocation order in the file, completely ignoring the index order.

This is why (in certain cases) a read-uncommitted index scan can miss or double read pages.

If the index is fragmented, then the order of pages in the file doesn't match the order order of page in the index and hence the data comes back 'unordered'

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 14

timothyawiseman (9/5/2008)

Jeff, I must be missing something. Why would you want to do the sort in the gui instead of letting the server do that?

I do understand why you would not want to do it in a view, but I do not see why the gui is superior to doing it in a stored procedure?

Heh... I didn't say the GUI would be superior for sorting and, in fact, it frequently isn't superior. This type of question is always a very difficult question to answer on such limited information. My belief is that the DBA is trying to do the right thing, but may also not be taking other things into consideration...

The problem the DBA has implied is, how would you indicate which sort to do?

Hard coded - Very effective but is like adding ORDER BY to a View... it either limits what it can be used for to a rather specialized scope or the output has to be resorted to get a different sort or you have to have 1 sproc for sort order.

Dynamically defined - Also very effective, but either requires some dynamic SQL or some decision making in the Order By. Either way, it'll be anywhere from "almost as fast" as the hardcoded method to dreadfully slow with a full recompile every time it's called.

Regardless, both methods put an extra load on the server as sorting is a pretty expensive operation and that's probably why the DBA has dug his/her heels in. If the GUI or app makes a lot of calls to the server for sorted data, you end up with a lot of resource usage just sorting data. So, if you can off-load some of that to the client, then why not do it in the app?

Do I think the DBA or the OP is correct in their argument? Dunno... not enough information on scalable performance COST to the server or the COST in dollars to fix, test, and redeploy the app. 😉

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions