Sanity Check - using CASE in WHERE clauses

Question

Post reply

Sanity Check - using CASE in WHERE clauses

Rayven

SSCarpal Tunnel

Points: 4337
More actions
January 17, 2012 at 4:40 am

#249631

I'm more after a sanity check before I plough on and do something incredibly stupid :hehe:

We frequently have to code procedures where parameters are optional, i.e. If a parameter is supplied with a value then it filters one of then joined tables, it it is zero then it does not filter the table.

We used to code these using WHERE clauses such as

WHERE (@param = 0 OR [field] = @param)

However we sometimes found that we could get bad execution plans, especially where the @param=0 would return huge record sets and [field] = @param would return only a handful of records. Depending on which one SQL same frequently a plan was generated which then was dreadfully slow if the filter was supplied the opposite way.

As a result, most of the poorly performing procedures were either coded to be dynamic SQL which could simply omit the filter if required and EXEC sp_executesql it instead. This caused a host of problems where queries were large in that it was so easy to make typos that snuck through testing occasionally - don't start lecturing on testing please, it happens 😀 )

The other option we headed for was coding separate procedures, with and without the filter, and having a main stored procedure that could analyse the parameters and call whichever procedure was needed. This wasn't always appropriate where we had several combinations of parameters.

Recently I tried something else out of curiosity, and instead of coding the WHERE clause as above, I tried a different method

WHERE (@param = 0 OR [field] = @param)

became

WHERE 1 = CASE WHEN @param = 0 THEN 1 WHEN [field] = @param THEN 1 ELSE 0 END

I didn't expect it to work, but it did and the results I am seeing have surprised me. I've run numerous tests with large and small data sets, cleaned the procedure cache to ensure I got new plans, compares the IO stats and plans and these not only seem to be running faster, reading less data pages and records, but are also seem to be generating more consistent execution plans.

Am I just being lucky, or does anyone know if this would be a more effective long term solution if tests continue to show the same results.

Answers on a postcard please.

---------------------------------------
It is by caffeine alone I set my mind in motion.
It is by the Beans of Java that thoughts acquire speed,
the hands acquire shaking, the shaking becomes a warning.
It is by caffeine alone I set my mind in motion.

Viewing 15 posts - 1 through 15 (of 17 total)

You must be logged in to reply to this topic. Login to reply

Cadavre SSC-Forever Points: 41690 More actions · Answer 1

Gail wrote a blog on the issue you're describing[/url].

Dynamic is the best way to go (or RECOMPILE) - and more vigorous testing! 😛

Forever trying to learn
My blog - http://www.cadavre.co.uk/
For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Rayven SSCarpal Tunnel Points: 4337 More actions · Answer 2

Grief, were you sitting on the forum!!! That was an incredibly fast response - thanks! :w00t:

I think the word I'm looking for is "bugger". I was really hoping that I'd found a way I could get away from having to write Dynamic SQL for the procedures. Some of the query code is incredibly complex and it is so easy when making a change because the company has changed their business rules (seems to happen here by the hour) to get something horribly wrong or have to try and unmangle the code.

In my heart I didn't believe that this would work, but the difference in the execution plans and IO stats caught me off-guard. Running the query with different parameters using the OR clause and clearing the cache each run clearly showed it was using different plans each time depending on the parameters; however with the CASE I was getting faster results and the same plan every time which is what was baffling me.

Looks like i may just have been lucky this time. I'll stick with the dynamic SQL.

Thanks for the incredibly speedy reply.

---------------------------------------
It is by caffeine alone I set my mind in motion.
It is by the Beans of Java that thoughts acquire speed,
the hands acquire shaking, the shaking becomes a warning.
It is by caffeine alone I set my mind in motion.

Cadavre SSC-Forever Points: 41690 More actions · Answer 3

Rayven (1/17/2012)

Grief, were you sitting on the forum!!! That was an incredibly fast response - thanks! :w00t:

Just lucky 😀

As I said, if you're using 2008 then you can get similar results with the RECOMPILE hint (fairly certain Gail mentions it in her blog post)

Forever trying to learn
My blog - http://www.cadavre.co.uk/
For better, quicker answers on T-SQL questions, click on the following...http://www.sqlservercentral.com/articles/Best+Practices/61537/
For better, quicker answers on SQL Server performance related questions, click on the following...http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

HowardW One Orange Chip Points: 29109 More actions · Answer 4

HowardW

One Orange Chip

Points: 29109

January 17, 2012 at 4:59 am

#1434510

Although it has its downsides, if I was running 2008 SP2 or above, I'd tag an OPTION(RECOMPILE) onto the end of the statement unless it was a very high volume stored procedure where I was worried about how expensive the plan creation would be...

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 5

Rayven (1/17/2012)

Running the query with different parameters using the OR clause and clearing the cache each run clearly showed it was using different plans each time depending on the parameters; however with the CASE I was getting faster results and the same plan every time which is what was baffling me.

I can believe that. The @variable IS NULL or column = @variable is about the worst of the 'catch all' query forms. It produces erratic plans. Most of the others (including, I'm going to guess that case form), just do table or index scans. Hence they're consistent and often faster than a bad @variable IS NULL or column = @variable plan, but still they're likely scans, not seeks.

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Rayven SSCarpal Tunnel Points: 4337 More actions · Answer 6

I don't want to use the option RECOMPILE. These procedures are almost always driving the websites and get hit magnitudes more than our in-house apps, so it is going to spend more time recompiling that it is worth, and I'm concerned about performance when it comes to the websites.

With reference to table scans, this is just what I was expecting to see, but in all cases it was happily using index seeks :crazy:.

I still agree with Gail that dynamic SQL is going to be far safer. Whilst these tests do seem to be showing better performance I know from rather painful experience what can happen when a plan handling this type of query goes bad - taking for less than 1 second to return data to taking over 5 minutes - just because SQL decided to recompile the plan and opted for table scans instead.

---------------------------------------
It is by caffeine alone I set my mind in motion.
It is by the Beans of Java that thoughts acquire speed,
the hands acquire shaking, the shaking becomes a warning.
It is by caffeine alone I set my mind in motion.

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 7

I thought I'd do some quick testing. SQL 2008 SP3. AdventureWorks DB.

CREATE PROCEDURE SearchHistoryCase



(@Product int = NULL, @OrderID int = NULL, @TransactionType char(1) = NULL, @Qty int = NULL)

AS

SELECT ProductID, ReferenceOrderID, TransactionType, Quantity,

TransactionDate, ActualCost from Production.TransactionHistory

WHERE CASE WHEN ProductID = @Product THEN 1 WHEN @Product IS NULL THEN 1 ELSE 0 END = 1

AND CASE WHEN ReferenceOrderID = @OrderID THEN 1 WHEN @OrderID Is NULL THEN 1 ELSE 0 END = 1

AND CASE WHEN TransactionType = @TransactionType THEN 1 WHEN @TransactionType Is NULL THEN 1 ELSE 0 END = 1

AND CASE WHEN Quantity = @Qty THEN 1 WHEN @Qty is NULL THEN 1 ELSE 0 END = 1

GO

First execution:

EXEC SearchHistory @Product = 978, @TransactionType = 'W' -- 208 rows

Table 'TransactionHistory'. Scan count 1, logical reads 1823, physical reads 0.

SQL Server Execution Times:

CPU time = 31 ms, elapsed time = 58 ms.

Pretty decent and it does an Index scan and a key lookup.

Next execution (no recompile, because you won't be recompiling on every use in the live system and it is NOT recompiles that cause the problems with these query forms, it's reuse of inappropriate plans)

EXEC SearchHistory @Qty = 100 --116 rows

Exactly the same plan, but...

Table 'TransactionHistory'. Scan count 1, logical reads 1390863, physical reads 0.

SQL Server Execution Times:

CPU time = 749 ms, elapsed time = 809 ms.

Err.....

If you clear the plan cache before you do each test (as you said you were) you won't see the effect of these. It's as if you were recompiling on every execution. It's the reuse of inappropriate plans that kills performance with these, not the compilation of a not-so-great plan

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Rayven SSCarpal Tunnel Points: 4337 More actions · Answer 8

Gilamonster: I was expecting to see exactly what you showed, and it was the fact that I wasn't that was causing me confusion. I tried the procedure without clearing the cache between runs and different parameters expecting to see absolutely disastrous results, table scans, index scans, etc.; when I didn't I decided to dig deeper. That was when I started clearing the cache before comparing results and plans between runs. At this point I was expecing to see different plans according to the parameter sets was using.

I agree with you in the fact that what you show is what I have seen many times before when ironing out performance problems with existing code. The fact that I wasn't seeing what I expected here started me scratching my head. I reckon that I've just been lucky with this one in the fact that SQL is obviously seeing indexes and conditions enough to standardise it (unsual I know).

The query I'm working with doesn't lend itself well to dynamic SQL or suites of procedures which was why I was looking for an alternative, usually I would have headed opted directly for dynamic sql.

---------------------------------------
It is by caffeine alone I set my mind in motion.
It is by the Beans of Java that thoughts acquire speed,
the hands acquire shaking, the shaking becomes a warning.
It is by caffeine alone I set my mind in motion.

Gail Shaw SSC Guru Points: 1004494 More actions · Answer 9

Just note that seeing a seek in a plan does not necessarily mean it's an efficient seek. Check what's been evaluated as a seek predicate (what SQL navigates the b-tree to find) and what it evaluates as a predicates (secondary filter)

Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

We walk in the dark places no others will enter
We stand on the bridge and no one may pass

Rayven SSCarpal Tunnel Points: 4337 More actions · Answer 10

Rayven

SSCarpal Tunnel

Points: 4337

January 17, 2012 at 6:02 am

#1434546

Good point, that had slipped my mind. Thanks.

---------------------------------------
It is by caffeine alone I set my mind in motion.
It is by the Beans of Java that thoughts acquire speed,
the hands acquire shaking, the shaking becomes a warning.
It is by caffeine alone I set my mind in motion.

brendan woulfe Hall of Fame Points: 3888 More actions · Answer 11

I have a similar situation that I'm working on right now. The parameter I'm working with doesn't hit just one column in a where clause though. It's a search parameter that looks at 5 columns across 5 different tables, all unioned together. Dynamic SQL would be pretty difficult to pull off since the stored procedure is so long and complex and I wanted to avoid using RECOMPILE because of how often this procedure gets called.

I have read/heard that breaking this one stored procedure into 3 may help with the execution plans, but I have yet to test it. The main procedure would do a conditional test to see if the parameter is NULL or not and execute one stored procedure if it is and a different procedure if it isn't.

Rayven SSCarpal Tunnel Points: 4337 More actions · Answer 12

Breandan: This has work very well for me in the past where we have been able to do it.

In our situation we had a complex query that was required to return data but the application could either supply a specific person id code, their last name or their employer name (and return a set of delegates).

We coded the base query which worked perfectly and then created 3 stored procedures, each one running the query with the parameter required. A main query then took in all three parameters defaulting them to NULL and checked to see which ones were set - calling the required procedures.

This creates a different execution plan for each procedure - and saved us having to convert the query into a dynamic sql string with all the permissions issues it entails.

The result was something like

CREATE PROCEDURE [stp_mainproc] @personid INT = NULL, @lastname VARCHAR(100) = NULL, @employer VARCHAR(100) = NULL

AS

SET NOCOUNT ON;

IF @personid IS NOT NULL

EXEC [stp_mainproc_person] @personid;

ELSE IF @lastname IS NOT NULL

EXEC [stp_mainproc_lastname] @lastname;

ELSE IF @employer IS NOT NULL

EXEC [stp_mainproc_employer] @employer

GO

It also has the advantage that you only have to assign the permissions to [stp_mainproc] as well.

---------------------------------------
It is by caffeine alone I set my mind in motion.
It is by the Beans of Java that thoughts acquire speed,
the hands acquire shaking, the shaking becomes a warning.
It is by caffeine alone I set my mind in motion.

brendan woulfe Hall of Fame Points: 3888 More actions · Answer 13

brendan woulfe

Hall of Fame

Points: 3888

January 17, 2012 at 9:25 am

#1434688

Rayven - thanks for the input. I'm going to give it a try over the next couple of days and see how it works. You said there's some complexities in your current requirement that won't allow you to do something similar?

Rayven SSCarpal Tunnel Points: 4337 More actions · Answer 14

Brendan: The only thing which makes multiple procedures for parameters a pain is where I have 5 or 6 optional parameters with multiple combinations - it would mean hundreds of procedures, lol. 😀

---------------------------------------
It is by caffeine alone I set my mind in motion.
It is by the Beans of Java that thoughts acquire speed,
the hands acquire shaking, the shaking becomes a warning.
It is by caffeine alone I set my mind in motion.