MIN() and MAX() issues

SQLBill

SSC Guru

Points: 51440
September 7, 2006 at 9:40 am

#114415

I've run into a curiosity. While creating a script to correlate some data, I was trying to use MIN() and MAX() on an identity column based on values in a datetime column. Using MAX(), I got the results back in 6 seconds (00:00:06). Using MIN(), I got the results back in 8 minutes 43 seconds (00:08:43). The Estimated Execution Plan showed this:

The MAX() query first does a Clustered Index SEEK (98%), then a TOP (2%), then a Stream Aggregate (0%).

The MIN() query first does a Clustered Index SCAN (100%), followed by Parallelism/Gather Stream (0%), then a TOP (2%), then a Stream Aggregate (0%).

The queries are:

SELECT MAX(RowID)
FROM mytable
WHERE dt_column = '2006-09-06 11:00'

SELECT MIN(RowID)
FROM mytable
WHERE dt_column = '2006-09-06 11:00'

The RowID column has the clustered index and dt_column has a non-clustered index.

I can make the second query faster by changing it to a MAX() and using 2006-09-06 10:59:59 (acceptable for the need), but that doesn't explain why one uses a scan and parallelism/gather stream and the other uses a seek and no parallelism.

Any thoughts on this?

(I can't change the schema, and that's not the point of this. I'm more interested in why the execution differs between the two.)

-SQLBill
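One way to test whether the difference comes from the access path rather than from the aggregate itself is to run the MIN() query twice with I/O and timing statistics on: once as written, and once forced through the non-clustered index. This is only a sketch. The table and column names come from the post above, while ix_dt_column is a hypothetical name for the non-clustered index on dt_column. Because a non-clustered index on a table with a clustered index carries the clustering key (RowID here) in its rows, a seek on dt_column should be able to return RowID without reading the base table.

```sql
-- Sketch only. mytable, RowID and dt_column come from the original post.
-- ix_dt_column is a HYPOTHETICAL index name; substitute the real one.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- 1) Original form: the optimizer picks the access path.
SELECT MIN(RowID)
FROM mytable
WHERE dt_column = '2006-09-06 11:00';

-- 2) Same query, forced through the non-clustered index (for comparison only).
SELECT MIN(RowID)
FROM mytable WITH (INDEX (ix_dt_column))
WHERE dt_column = '2006-09-06 11:00';

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Comparing the logical reads reported for the two statements shows how much of the table each plan actually touches. The hint is meant as a diagnostic, not as a recommended permanent fix.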


Roy Rogers SSCertifiable Points: 7712 · Answer 1

September 7, 2006 at 9:59 am

#659140

I have a similar table with an int column as the clustered index and a date column with a non-clustered index, with about 5 million rows. With or without the date criteria, it returns data in milliseconds.

And both queries use a clustered index seek.

Regards,
gova

SQLBill SSC Guru Points: 51440 · Answer 2

September 7, 2006 at 11:20 am

#659168

My table has 194 million rows. But I don't think the size should be causing MIN() and MAX() to act differently.

-SQLBill

Out For Justice SSCrazy Points: 2733 · Answer 3

September 8, 2006 at 2:45 am

#659290

I wonder if it could be caused by a badly fragmented index? Perhaps because of the physical position of the max records on the disk it is able to find them quicker?
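Fragmentation can be measured rather than guessed at. The sketch below shows two ways to do that, assuming the table is named mytable as in the original post. Which one applies depends on the SQL Server version, which the thread does not state.

```sql
-- Sketch only. mytable is the table name used in the original post.

-- Older versions (SQL Server 2000, also still available in 2005):
-- reports scan density and logical fragmentation for every index on the table.
DBCC SHOWCONTIG ('mytable') WITH ALL_INDEXES, TABLERESULTS;

-- SQL Server 2005 and later: the same information from a dynamic management function.
-- Variables are used for the arguments so the call does not depend on the
-- database compatibility level.
DECLARE @db_id int, @object_id int;
SET @db_id = DB_ID();
SET @object_id = OBJECT_ID('mytable');

SELECT index_id,
       index_type_desc,
       avg_fragmentation_in_percent,
       page_count
FROM sys.dm_db_index_physical_stats(@db_id, @object_id, NULL, NULL, 'LIMITED');
```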

Kenneth Wilhelmsson SSC-Dedicated Points: 30043 · Answer 4

September 8, 2006 at 8:53 am

#659358

The thing about parallelism is that we have little control over if and when it gets used.

It seems like the MAX() didn't get any and the MIN() got some. Perhaps a coincidence, and perhaps of no importance, but you never know.

Try running both with parallelism turned off and see if the plans are still different.

/Kenneth
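A per-query way to try this, without touching the server-wide "max degree of parallelism" setting, is the MAXDOP query hint. The sketch below reuses the two queries from the original post and forces each one to a serial plan, so the resulting plans and run times can be compared on equal terms.

```sql
-- Sketch only. Table and column names come from the original post.
-- OPTION (MAXDOP 1) restricts each statement to a serial (non-parallel) plan.
SELECT MAX(RowID)
FROM mytable
WHERE dt_column = '2006-09-06 11:00'
OPTION (MAXDOP 1);

SELECT MIN(RowID)
FROM mytable
WHERE dt_column = '2006-09-06 11:00'
OPTION (MAXDOP 1);
```

If the MIN() plan still scans the clustered index with the hint in place, parallelism is a side effect of the scan rather than its cause.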

SQLBill SSC Guru Points: 51440 · Answer 5

Out for Justice,

I had checked the fragmentation and the index was slightly fragmented, so I ran DBCC INDEXDEFRAG and then ran the queries, with the results I posted.

Kenneth,

It's been repeatable. MAX() doesn't use parallelism, MIN() does. This has been happening for a while, but the run time was never long enough to make me dig into it. This time it took so long (1 1/2 hours to return fewer than 20,000 rows) that I started checking it out in more depth.

If I get a chance I will try it with the parallelism off.

-SQLBill

Jeff Moden SSC Guru Points: 1000539 · Answer 6

I ran your original examples on my production box... both execution plans are identical and they both have parallelism.

Could it be that someone screwed up and turned off auto-calc of statistics, or that no regular maintenance plan is in effect?

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".


SQLBill SSC Guru Points: 51440 · Answer 7

September 11, 2006 at 8:36 am

#659665

Nope, I have auto-statistics on (I also run UPDATEUSAGE frequently) and I do regular maintenance. In fact, I had just finished defragging the indexes and had updated usage and stats.

-SQLBill
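For anyone wanting to verify the statistics side of this on their own system, a minimal sketch follows. It assumes SQL Server 2005 or later (for sys.stats), a table named mytable as in the original post, and a hypothetical index name ix_dt_column for the non-clustered index on dt_column. Note that DBCC UPDATEUSAGE corrects row and page counts only. It does not refresh the distribution statistics the optimizer uses for a predicate on dt_column.

```sql
-- Sketch only. mytable comes from the original post.
-- ix_dt_column is a HYPOTHETICAL index name; substitute the real one.

-- Are automatic statistics creation and update enabled? (1 = yes)
SELECT DATABASEPROPERTYEX(DB_NAME(), 'IsAutoCreateStatistics') AS auto_create_stats,
       DATABASEPROPERTYEX(DB_NAME(), 'IsAutoUpdateStatistics') AS auto_update_stats;

-- When was each statistics object on the table last updated?
SELECT name AS stats_name,
       STATS_DATE(object_id, stats_id) AS stats_last_updated
FROM sys.stats
WHERE object_id = OBJECT_ID('mytable');

-- Inspect the histogram the optimizer sees for the dt_column index.
DBCC SHOW_STATISTICS ('mytable', ix_dt_column);

-- Rebuild the statistics from every row instead of a sample.
-- This can be expensive on a table with 194 million rows.
UPDATE STATISTICS mytable WITH FULLSCAN;
```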