Query Tuning

Question

Post reply

Query Tuning

Viewing 15 posts - 16 through 30 (of 49 total)

You must be logged in to reply to this topic. Login to reply

OzYbOi d(-_-)b SSCarpal Tunnel Points: 4054 More actions · Answer 1

OzYbOi d(-_-)b

SSCarpal Tunnel

Points: 4054

December 27, 2011 at 9:45 am

#1425776

Tks Jeff - great question

Hugo Kornelis SSC Guru Points: 64748 More actions · Answer 2

Thanks for a great question, Jeff!

I had it wrong, for two reasons:

1. Not knowing that the optimizer is smart enough to rewrite an ISNULL of this form to a SARGable equivalent, and

2. Being too stupid to really check out the indexes given in the question to check if "doesn't include EventID" is a bluff.

So this question learned me one thing I didn't know yet, and attempted to learn me something I should know for a long time already. 😀

EDIT: After looking at the question in yet a bit more detail, I realise that the ability to rewrite the ISNULL is completely irrelevant to the question. The index seek uses StartDate, not EndDate. I'm not saying that the optimizer will not rewrite an ISNULL in some cases, but it doesn't in this case! The seek includes only the StartDate in the Seek Predicate; the predicate on the EndDate is in the (non-seek) predicate of the operator.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Hugo Kornelis SSC Guru Points: 64748 More actions · Answer 3

Hugo Kornelis (12/27/2011)

EDIT: After looking at the question in yet a bit more detail, I realise that the ability to rewrite the ISNULL is completely irrelevant to the question. The index seek uses StartDate, not EndDate. I'm not saying that the optimizer will not rewrite an ISNULL in some cases, but it doesn't in this case! The seek includes only the StartDate in the Seek Predicate; the predicate on the EndDate is in the (non-seek) predicate of the operator.

So I decided to dig a bit deeper, and I found out that the explanation about the ISNULL is, in fact, incorrect. The optimizer will (at least in this case) NOT rewrite the ISNULL to make it SARGable.

Here's the code I used to test it - using the CREATE TABLE Jeff provided in the question, and the data generator he posted in the explanation.

CREATE INDEX ix1 ON dbo.Event(EndDate) INCLUDE (StartDate)



DBCC FREEPROCCACHE;

DBCC DROPCLEANBUFFERS;

DECLARE @DesiredStartDate  DATETIME,

        @DesiredEndDate    DATETIME,

        @BeginningOfTime   CHAR(8),

        @EndOfTime         CHAR(8);

 SELECT @DesiredStartDate  = '14 Jun 2012',

        @DesiredEndDate    = '16 Jun 2012',

        @EndOfTime         = '99991231';

 SELECT EventID, StartDate, EndDate

   FROM dbo.Event

  WHERE (@DesiredEndDate  <= ISNULL(EndDate,@EndOfTime))

OPTION (RECOMPILE);

I ran the last query with the actual execution plan turned on, and the result was an index SCAN instead of an index SEEK on index dbo.ix1.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Hugo Kornelis SSC Guru Points: 64748 More actions · Answer 4

440692 I am just a number (12/27/2011)

Could you expand on the explanation for option E

Several people asked this. The answer is that the SARGability of the ISNULL is irrelevant, because the engine will use the first column in the index (StartDate) for the seek, not the second column (EndDate).

In my previous post, I have demonstrated that -at least in this case, with a 1000-row test population-, the optimizer will in fact NOT rewrite the ISNULL to a more SARGable expression. But what I think Jeff had in mind (and he'll undoubtedly correct me if I'm wrong) is that the ISNULL expression could be rewritten as follows:

Step 1: Convert @DesiredEndDate <= ISNULL (EndDate, @EndOfTime) to

@DesiredEndDate <= EndDate

OR (EndDate IS NULL AND @DesiredEndDate <= CAST(@EndOfTime AS datetime))

Step 2: Pre-evaluate the part of the expression that contains only variables:

@DesiredEndDate <= EndDate

OR (EndDate IS NULL AND (True))

Step 3: Remove needless ballast and swap order for readability:

EndDate >= @DesiredEndDate

OR EndDate IS NULL

In an RDBMS that sorts NULL values *after* other values, this can be retrieved using a single seek (for the first qualifying row) + forward range scan (for the rest). But SQL Server orders NULL values first. That makes it a bit harder; in theory, the optimizer could use two seek / forward range scan combinations (one for the NULL values at the start; the other for the non-NULL values) and merge them. Maybe this will indeed happen if the table is large enough to warrant using all phases of the optimizer to do a full optimization. But for a thousand rows, the optimizer will simply scan the index instead.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Steve Cullen SSCertifiable Points: 5598 More actions · Answer 5

Steve Cullen

SSCertifiable

Points: 5598

December 27, 2011 at 3:54 pm

#1425895

Love this question. I got A but missed B.

Converting oxygen into carbon dioxide, since 1955.

Hugo Kornelis SSC Guru Points: 64748 More actions · Answer 6

tabinsc (12/27/2011)

I do have one question: If clustered index columns are auto-appended to nonclustered indexes, what if the nonclustered index is unique? Adding the clustered index column(s) to an unique nonclustered index would change the unique constraint on it, right?

No, it won't. The index is still unique; duplicate values will still be rejected. And these duplicate values will only be checked for the actually indexed columns.

All nonclustered indexes have the columns from the clustered index added - but only on the leaf-level pages, and not as indexed columns but as extra data. These columns are the pointer to where the rest of the data is. Compare it to an index in a book - after each indexed term is a page number (the clustered index value), that you then use to look up the complete data. If you would enforce the (for a study book) silly requirement that each term can be indexed only once, the index will immediately show violations of this requirement, but there still may be multiple index entries referencing the same page.

Technically, the way clustered index columns are appended to nonclustered indexes is very similar to how included columns work.

Hugo Kornelis, SQL Server/Data Platform MVP (2006-2016)
Visit my SQL Server blog: https://sqlserverfast.com/blog/
SQL Server Execution Plan Reference: https://sqlserverfast.com/epr/

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 7

Sorry... bad post withdrawn. Hugo and Paul are correct. I missed my own NOT NULL constraint on the StartDate column. I'm sending an email to SSC to correct the answers without changing the ones that got it correct. :blush:

To be sure, I made a terrible mistake. ISNULL isn't SARGable.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

Paul White SSC Guru Points: 150457 More actions · Answer 8

There's always a temptation with QotD to keep expanding it until the question is 'perfect'. This question seems to have fallen victim to that temptation; in attempting to cover many things all at once, clarity suffers, and more importantly so does correctness, in at least a couple of important respects...

Hugo is correct regarding the SARGability of the ISNULL expression. The reason using ISNULL on the StartDate column does not prevent a seek is that the column is constrained to be NOT NULL. The optimizer can, and does, remove redundant ISNULL calls in this specific case during the simplification phase of query optimization. Looking at the query posted with the predicate @DesiredStartDate >= ISNULL(StartDate,0), the execution plan shows the following details for the index seek:

Seek predicate: End: [dbo].[Event].StartDate <= Scalar Operator([@DesiredStartDate])

Residual predicate: [@DesiredEndDate]<=isnull([dbo].[Event].[EndDate],CONVERT_IMPLICIT(datetime,[@EndOfTime],0))

Notice the ISNULL has been removed from the StartDate predicate (due to the column's NOT NULL constraint) resulting in an index-seek operation. Rows returned by the storage engine as qualifying for the seek are subjected to the residual predicate test in the query processor. On one particular run on my machine, the seek qualified 459 rows. The residual, applied to these 459 rows, reduced the total further to 15 rows.

The EndDate column, on the other hand, is not constrained to be NOT NULL, so the ISNULL on the column reference remains and results, in general, in a non-SARGable predicate. Moving back to the original query in the question, notice that the index seek properties are exactly the same as for the ISNULL(StartDate,0) extension noted above:

Seek: End: [dbo].[Event].StartDate <= Scalar Operator([@DesiredStartDate])

Residual: [@DesiredEndDate]<=isnull([dbo].[Event].[EndDate],CONVERT_IMPLICIT(datetime,[@EndOfTime],0))

The second important point here (aside from noting how many eventually-unqualified rows have to go through the residual predicate) is that the seek range scan only has a End: attribute, not a Start: attribute. This means the range scan starts at the very beginning of the index, and terminates when the End: predicate first returns false. To make the point perhaps clearer, add a second index:

CREATE INDEX nc1 ON dbo.[Event] (StartDate DESC, EndDate)

With the leading column sorted descending, the optimizer chooses this index over the one in the question, with these properties:



Seek: Start: [dbo].[Event].StartDate <= Scalar Operator([@DesiredStartDate])

Residual: [@DesiredEndDate]<=isnull([dbo].[Event].[EndDate],CONVERT_IMPLICIT(datetime,[@EndOfTime],0))

The estimated cost of this plan is 0.0045265 versus 0.0074895 for the original index - an improvement of close to 40%. However, the seek still qualifies 459 rows and the residual filters this to 15 as before, the cost reduction is simply a consequence of the optimizer preferring seeks with a Start: prefix over those that scan from one end to an End: prefix. Further, having EndDate part of the key adds nothing, so a better index for this query is:

CREATE INDEX nc2 ON dbo.[Event] (StartDate DESC) INCLUDE (EndDate)

Now the optimizer chooses this new index (because it has the potential to be narrower), although the new plan still has an estimated cost of 0.0045265 in this case (no improvement over the nc1 index).

The nc2 index gives us these seek properties:



Seek: Start: [dbo].[Event].StartDate <= Scalar Operator([@DesiredStartDate])

Residual: [@DesiredEndDate]<=isnull([dbo].[Event].[EndDate],CONVERT_IMPLICIT(datetime,[@EndOfTime],0))

All the index examples quoted here perform FORWARD range scans (BACKWARD scans are only problematic in some kinds of parallel plan).

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

Paul White SSC Guru Points: 150457 More actions · Answer 9

For anyone wondering about the cryptic comment in the UPDATE statement:



 UPDATE dbo.Event

    SET EndDate = DATEADD(dd,

                          CASE ABS(CHECKSUM(NEWID()))%4

                          WHEN 0 THEN 1

                          WHEN 1 THEN 1

                          WHEN 2 THEN 1

                          WHEN 3 THEN 7

                          ELSE 14 --Yeah, this will be another QOTD.

                          END,

                          StartDate)

WHERE EventID > 10

This illustrates the problem of using side-effecting non-deterministic functions in a 'simple' CASE expression:

SELECT



    CASE ABS(CHECKSUM(NEWID()))%4

        WHEN 0 THEN 'Zero'

        WHEN 1 THEN 'One'

        WHEN 2 THEN 'Two'

        WHEN 3 THEN 'Three'

        ELSE 'Um, what?'

    END

This code often prints the text from the ELSE clause, because simple CASE is expanded to:

[Expr1001] = Scalar Operator(



CASE WHEN abs(checksum(newid()))%(4)=(0) THEN 'Zero' ELSE

CASE WHEN abs(checksum(newid()))%(4)=(1) THEN 'One' ELSE

CASE WHEN abs(checksum(newid()))%(4)=(2) THEN 'Two' ELSE

CASE WHEN abs(checksum(newid()))%(4)=(3) THEN 'Three' ELSE 'Um, what?' END END END END)

Notice how many times abs(checksum(newid()))%(4) is re-evaluated, probably with a different result from what was intended.

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

anders-731262 Ten Centuries Points: 1256 More actions · Answer 10

anders-731262

Ten Centuries

Points: 1256

December 28, 2011 at 12:52 am

#1425971

Excellent! Question covers a lot.

Thanks

Paul White SSC Guru Points: 150457 More actions · Answer 11

Hugo Kornelis (12/27/2011)

...the ISNULL expression could be rewritten [...] maybe this will indeed happen if the table is large enough to warrant using all phases of the optimizer to do a full optimization. But for a thousand rows, the optimizer will simply scan the index instead.

Hi Hugo,

That is a possible transformation, but not one the optimizer currently considers. If it existed, SQL Server would likely use a 'dynamic seek' pattern, rather than a single seek, as shown below:



ALTER TABLE dbo.Event ALTER COLUMN StartDate DATETIME NULL

SELECT

    e.EventID,

    e.StartDate,

    e.EndDate

FROM dbo.[Event] AS e

WHERE

    e.StartDate = '2012-06-14'

    OR e.StartDate IS NULL

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

paul.knibbs SSCoach Points: 15320 More actions · Answer 12

paul.knibbs

SSCoach

Points: 15320

December 28, 2011 at 2:41 am

#1425994

The weird thing is, I selected the first two answers because I wanted to see the *real* answer, and ended up getting the question right more or less by default... :ermm:

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 13

SQL Kiwi (12/28/2011)

For anyone wondering about the cryptic comment in the UPDATE statement:

UPDATE dbo.Event SET EndDate = DATEADD(dd, CASE ABS(CHECKSUM(NEWID()))%4 WHEN 0 THEN 1 WHEN 1 THEN 1 WHEN 2 THEN 1 WHEN 3 THEN 7 ELSE 14 --Yeah, this will be another QOTD. END, StartDate) WHERE EventID > 10

This illustrates the problem of using side-effecting non-deterministic functions in a 'simple' CASE expression:

SELECT
CASE ABS(CHECKSUM(NEWID()))%4 WHEN 0 THEN 'Zero' WHEN 1 THEN 'One' WHEN 2 THEN 'Two' WHEN 3 THEN 'Three' ELSE 'Um, what?' END

This code often prints the text from the ELSE clause, because simple CASE is expanded to:

[Expr1001] = Scalar Operator(
CASE WHEN abs(checksum(newid()))%(4)=(0) THEN 'Zero' ELSE CASE WHEN abs(checksum(newid()))%(4)=(1) THEN 'One' ELSE CASE WHEN abs(checksum(newid()))%(4)=(2) THEN 'Two' ELSE
CASE WHEN abs(checksum(newid()))%(4)=(3) THEN 'Three' ELSE 'Um, what?' END END END END)

Notice how many times abs(checksum(newid()))%(4) is re-evaluated, probably with a different result from what was intended.

Heh... there goes THAT question.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

Paul White SSC Guru Points: 150457 More actions · Answer 14

Paul White

SSC Guru

Points: 150457

December 28, 2011 at 4:43 am

#1426024

Jeff Moden (12/28/2011)

Heh... there goes THAT question.

It's always worth re-doing, it catches a lot of people out (and no-one reads these comments anyway :-))

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 15

My most sincere apologies folks. Hugo and Paul are correct. I missed my own NOT NULL constraint on the StartDate column. I'm sending an email to SSC to modify the answers without changing the ones that got it correct. :blush:

To be sure, I made a terrible mistake. [font="Arial Black"]ISNULL isn't SARGable.[/font]

It doesn't, however, change the utility of being able to use an ISULL on the second column of a StartDate/EndDate paired lookup and still get an INDEX SEEK out of the code.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions