TSQL Faster Than Stored Procedure?

Question

Post reply

TSQL Faster Than Stored Procedure?

Chris Stamey

SSCrazy

Points: 2635
More actions
February 27, 2007 at 8:02 am

#153455

SQL 2000, SP3A.

I set up a view into our HR database for another app to query employee information. This view is necessary because the other app is only allowed to see certain information in this highly secure area.

The TSQL query to return data from this view works in less than one second, but the same TSQL in a stored procedure, with one argument, takes 12 seconds to run, every time. Adding recompile to the procedure did not make a difference, neither did replacing the argument with a new declared var inside the procedure, as Maddogs suggested in another thread.

View Source:

CREATE VIEW dbo.vwlaEmployeeListVPS

AS

/*

For VPS to obtain a list of employees.

Version 1.01, 12-14-2006, Chris Stamey

*/

SELECT     TOP 100 PERCENT EB.EbFlxID As EmployeeListID, EB.EbClock AS EmpNumber, EB.EbPSID As PSID, EB.EbFirstName As FirstName,

EB.EbLastName As LastName, EP.EpEMail As EmailAddress, EJ.EjWorkPhone As WorkPhone, EJ.EjDivision AS UnitNumber, EE.EeStatus AS EmploymentStatus,

EJ.EjRetired AS CompLevelCode, EE.EeTermDate AS TerminationDate, EJ.ejTitle As JobTitle,

                      EJ.EjSupervisorFlxIDeb AS SupervisorFlxID

FROM         dbo.EBase AS EB INNER JOIN

                      dbo.EEmploy AS EE ON EB.EbFlxID = EE.EeFlxIDEb INNER JOIN

                      dbo.EJob AS EJ ON EB.EbFlxID = EJ.EjFlxIDEb INNER JOIN

                      dbo.EPerson AS EP ON EB.EbFlxID = EP.EpFlxIDeb

WHERE     (EP.EpDateBeg =

                          (SELECT     MAX(EpDateBeg) AS Expr1

                            FROM          dbo.EPerson

                            WHERE      (EpFlxIDEb = EB.EbFlxID))) AND (EJ.EjDateBeg =

                          (SELECT     MAX(EjDateBeg) AS Expr1

                            FROM          dbo.EJob

                            WHERE      (EjFlxIDEb = EB.EbFlxID))) AND (EE.EeDateBeg =

                          (SELECT     MAX(EeDateBeg) AS Expr1

                            FROM          dbo.EEmploy

                            WHERE      (EeFlxIDEb = EB.EbFlxID))) AND (EE.EeTermDate > '8/1/2006' OR

                      EE.EeTermDate IS NULL)

ORDER BY EB.EbLastName, EB.EbFirstName

TSQL that runs in 1 second:

SELECT FirstName as FirstName,LastName as LastName,EmpNumber as UserID,UnitNumber as UnitNumber,EmailAddress as EmailAddress

    ,WorkPhone as PhoneNumber FROM VHRDevTest.dbo.vwlaEmployeeListVPS WHERE EmploymentStatus = 'Active' and EmailAddress = 'psridhar@landam.com'

Procedure (runs in 12 seconds):

VPA_UserInfo_Retrieve 'psridhar@landam.com'

Procedure Source:

CREATE PROCEDURE [dbo].[VPA_UserInfo_Retrieve]

@NetworkID NVARCHAR(100) With recompile

AS

BEGIN

SELECT FirstName as FirstName,LastName as LastName,EmpNumber as UserID,UnitNumber as UnitNumber,EmailAddress as EmailAddress

    ,WorkPhone as PhoneNumber FROM VHRDevTest.dbo.vwlaEmployeeListVPS WHERE EmploymentStatus = 'Active' and EmailAddress = @NetworkID

END

GO

I have noticed if I pass the email address as a parameter to the TSQL, it slows down tremendously also (12 seconds):

Declare @NetworkID NVARCHAR(100)

Set @NetworkID = 'psridhar@landam.com'

SELECT FirstName as FirstName,LastName as LastName,EmpNumber as UserID,UnitNumber as UnitNumber,EmailAddress as EmailAddress

    ,WorkPhone as PhoneNumber FROM VHRDevTest.dbo.vwlaEmployeeListVPS WHERE EmploymentStatus = 'Active' and EmailAddress = @NetworkID

When I looked at the query plans I saw that the indexes were used/not used differently. When it runs fast the one index it uses, and says was 99% of the execution time, says it returns 1 row. In the slow versions it says it uses the same index in 2 places, and each of those is 23% of the execution time and the rows returned are over 228,000. There is another index involved that has rows returned of 148,000, too.

I don't understand why using and argument/variable instead of a literal can make that much difference in a query plan. Is SQL Server that dumb? What am I missing?

Thanks,

Chris

Viewing 15 posts - 1 through 15 (of 28 total)

You must be logged in to reply to this topic. Login to reply

Ken McKelvey SSCoach Points: 18484 More actions · Answer 1

Ken McKelvey

SSCoach

Points: 18484

February 27, 2007 at 8:13 am

#692054

What is the datatype of EmailAddress? varchar(100) or nvarchar(100)?

Chris Stamey SSCrazy Points: 2635 More actions · Answer 2

The epEmail field is VarChar(255), in the table behind the view. I have tried it both ways, nvarchar and varchar, and that doesn't make much difference.

12 seconds:

Declare @NetworkID VARCHAR(255)

Set @NetworkID = 'psridhar@landam.com'

SELECT FirstName as FirstName,LastName as LastName,EmpNumber as UserID,UnitNumber as UnitNumber,EmailAddress as EmailAddress

,WorkPhone as PhoneNumber FROM VHRDevTest.dbo.vwlaEmployeeListVPS WHERE EmploymentStatus = 'Active' and EmailAddress = @NetworkID

9 seconds:

Declare @NetworkID NVARCHAR(255)

Set @NetworkID = 'psridhar@landam.com'

SELECT FirstName as FirstName,LastName as LastName,EmpNumber as UserID,UnitNumber as UnitNumber,EmailAddress as EmailAddress

,WorkPhone as PhoneNumber FROM VHRDevTest.dbo.vwlaEmployeeListVPS WHERE EmploymentStatus = 'Active' and EmailAddress = @NetworkID

9 seconds:

Declare @NetworkID NVARCHAR(100)

Set @NetworkID = 'psridhar@landam.com'

SELECT FirstName as FirstName,LastName as LastName,EmpNumber as UserID,UnitNumber as UnitNumber,EmailAddress as EmailAddress

,WorkPhone as PhoneNumber FROM VHRDevTest.dbo.vwlaEmployeeListVPS WHERE EmploymentStatus = 'Active' and EmailAddress = @NetworkID

12 seconds:

Declare @NetworkID VARCHAR(100)

Set @NetworkID = 'psridhar@landam.com'

SELECT FirstName as FirstName,LastName as LastName,EmpNumber as UserID,UnitNumber as UnitNumber,EmailAddress as EmailAddress

,WorkPhone as PhoneNumber FROM VHRDevTest.dbo.vwlaEmployeeListVPS WHERE EmploymentStatus = 'Active' and EmailAddress = @NetworkID

Thanks,

Chris

Ken McKelvey SSCoach Points: 18484 More actions · Answer 3

Ken McKelvey

SSCoach

Points: 18484

February 27, 2007 at 8:35 am

#692068

@NetworkId should be varchar(255) to avoid datatype precedence issues.

Putting order into a view is not a good idea. Try removing the TOP 100 PERCENT and ORDER BY from the view. An ORDER BY can then be added to the query.

Chris Stamey SSCrazy Points: 2635 More actions · Answer 4

OK, done. Removed ordering and the Top 100 percent from the view. Ran the following, but still 12 seconds:

Declare @NetworkID VARCHAR(255)

Set @NetworkID = 'psridhar@landam.com'

SELECT FirstName as FirstName,LastName as LastName,EmpNumber as UserID,UnitNumber as UnitNumber,EmailAddress as EmailAddress

,WorkPhone as PhoneNumber FROM VHRDevTest.dbo.vwlaEmployeeListVPS WHERE EmploymentStatus = 'Active' and EmailAddress = @NetworkID

Thanks,

Chris

Ken McKelvey SSCoach Points: 18484 More actions · Answer 5

Ken McKelvey

SSCoach

Points: 18484

February 27, 2007 at 8:51 am

#692077

Sorry, that exhausts my suggestions.

AMIT GUPTA-263376 SSCarpal Tunnel Points: 4244 More actions · Answer 6

AMIT GUPTA-263376

SSCarpal Tunnel

Points: 4244

February 28, 2007 at 4:15 am

#692260

Hi ,

First Clear ur procedure cache...

using THIS STATEMENT...

DBCC FREEPROCCACHE

and then re run ur view.. if it is fine then ur network who acess the database will be slower.

And its true that an application takes lots of time to acess data from a database??????

Regards,

Amit G.

Jeff Moden SSC Guru Points: 1000539 More actions · Answer 7

Ummm.... take the ORDER BY out of the view... you should do the order by in procs that reference the view, not in the view itself.

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
"Change is inevitable... change for the better is not".

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)
Intro to Tally Tables and Functions

Chris Stamey SSCrazy Points: 2635 More actions · Answer 8

Amit, using that statement to clear the proc cache made the TSQL much worse. The same thing, TSQL, with variables passed to it, took 26 seconds to run! The stored procedure execution time did not change. The TSQL with the literal arguments still runs in 1 second.

The view in question takes 10 seconds to return 16,500 rows, if no criteria is specified, and is no slower if the Order By is used. As can be seen by the times I am getting with this query, the criteria make it slower when passed in as a variable, returning fewer records.

Jeff, already removed the order by from the view, as per suggestion by Ken.

Thanks for the ideas guys,

Chris

Ken McKelvey SSCoach Points: 18484 More actions · Answer 9

Ken McKelvey

SSCoach

Points: 18484

February 28, 2007 at 11:24 am

#692374

Two more thoughts for the view:

1. If you have indexes on the Foreign Keys, you could try explicitly using INNER MERGE JOIN instead of INNER JOIN.

2. I tend to avoid correlated sub-queries as they can be slow. Try joining to derived tables instead.

kevin77 SSCarpal Tunnel Points: 4041 More actions · Answer 10

Believe it or not, Chris, all database engines are "that dumb". I first encountered this phenomenon over 5 years ago and thought the same thing, "Is SQL Server really that stupid?" I never really got an answer to my question, but learned to accept it.

Then, a few months later I was in a meeting with a bunch of developers from other departments and one person was talking on the side about a simple query he had, that when executed with a variable ran significantly slower than when executed with a hard coded value. He was using IBM DB2 database.

Another year went by and I switched jobs to a company that used Oracle. Guess what, Oracle has the same issue.

So really, to answer your question, no, SQL Server isn't stupid. It has to do with the optimizer and what information it has when it compiles your statement. Loosly speaking, it has to do with cardinality and other statistics information. When a SQL statement is compiled and an execution plan is made, all avaiable information is looked at to determine the most optimum plan. If you have a variable in the statement, then obviously the compiler cannot know the value at compile time, so it takes a more general approach at the execution plan.

The easiest example would be this. Suppose you have a table and an index on a boolean column. However, the table is heavily weighted such that 99% of the rows have a value of 'true' for that boolean column and just 1% of the rows have a value of 'false'.

If you put together a SELECT statement like the following:

SELECT * FROM boolTable WHERE boolColumn = false;

When the optimizer compiles this statement, it will know upfront (assuming recents statistics on the table) that the table is heavily weighted and that using the index on the column to select all rows with a value of 'false' is the most efficient.

Now, let's say you change the query and have a hard coded 'true' value in there:

SELECT * FROM boolTable WHERE boolColumn = true;

Again, the optimizer knows everything upfront and this time, it will elect to do a full table scan because it knows it is going to return 99% of the rows, so a full table scan will be faster than using the index.

Lastly, if you changed the statement to use a variable:

SELECT * FROM boolTable WHERE boolColumn = @boolValue;

The optimizer doesn't know what best to do, so it will take a general approach, which may or may not be the 'right' one.

Hopefully that kind of clears it up for you.

Chris Stamey SSCrazy Points: 2635 More actions · Answer 11

Kevin77, that kind of makes sense, but wouldn't the optimizer, since it knows what the field is I am going to be comparing to, even though it doesn't yet know the value, say that this field has all different values in it, but I know how I will handle it?

In this case it's an email address field.

Ken, are you saying I should joiin on my sub-queries, in such a manner as this:

From Table T

Inner Join (Select SubField From Table Where OtherField Is Null) S On T.Field = S.SubField

I have seen this done, but wondered about its efficiency.

Now, an update. My MS guy was here and noticed that on the slow query it are several instances, in the execution plan, of parallelism going on so he suggested putting in the query option Option(Maxdop 1). This sped it up to 5 seconds on the first run and 1 secodn on a subsequent run, but the TSQL with literal where clause still beats it. I have a case open with MS now, and hope to track this problem down definitively.

Thanks for the help,

Chris

Malcolm Wynden Say Hey Kid Points: 678 More actions · Answer 12

Using a view in a stored proc that is then filtered based on a parameter value from the stored proc "encourages" SQL Server to generalize the solution - ie fully instantiated the view, then cache it. The first time you hit it, if it doesn't have a cached version of the view the optimizer feels it can safely use, then you will get a significant performance hit.

A simple way to avoid this is to avoid using views in stored procs. For example,

CREATE PROCEDURE dbo.EmpInfo @NetworkID VARCHAR(255)

AS

SELECT

EB.EbFirstName As FirstName

, EB.EbLastName As LastName

, EB.EbClock AS EmpNumber

, EJ.EjDivision AS UnitNumber

, EP.EpEMail As EmailAddress

, EJ.EjWorkPhone As PhoneNumber

FROM dbo.EBase AS EB

JOIN dbo.EEmploy AS EE

ON EB.EbFlxID = EE.EeFlxIDEb

JOIN dbo.EJob AS EJ

ON EB.EbFlxID = EJ.EjFlxIDEb

JOIN dbo.EPerson AS EP

ON EB.EbFlxID = EP.EpFlxIDeb

WHERE EP.EpEMail = @NetworkID

AND EE.EeStatus = 'Active'

AND COALESCE(EE.EeTermDate,'9999-12-31') > '8/1/2006')

AND EP.EpDateBeg =

(SELECT MAX(EpDateBeg) AS Expr1

FROM dbo.EPerson

WHERE EpFlxIDEb = EB.EbFlxID)

AND EJ.EjDateBeg =

(SELECT MAX(EjDateBeg) AS Expr1

FROM dbo.EJob

WHERE EjFlxIDEb = EB.EbFlxID)

AND EE.EeDateBeg =

(SELECT MAX(EeDateBeg) AS Expr1

FROM dbo.EEmploy

WHERE EeFlxIDEb = EB.EbFlxID)

SQL Server will then optimize performance more intelligently, because it doesn't have to populate a view, then throw all but one record away. This is the kind of performance you see when you pass a fixed (non variable) value. Interestingly, when you do this directly from a T-SQL query against a view, the optimizer is smart enough not to instantiate the complete view, but attempts to save work by reducing instantiating the number of rows AND columns in the view to the ones your query wants. Imperfect intelligence sometimes, in that it does more work than needed, but I'm guessing this is why your query performs so well in T-SQL with a fixed value, but not when using a variable.

Chris Stamey SSCrazy Points: 2635 More actions · Answer 13

Chris Stamey

SSCrazy

Points: 2635

February 28, 2007 at 12:59 pm

#692407

Thanks for the insite Malcom. In this case, for security purposes, I must use the view, as the developers and the application cannot have access to other fields in the view tables. I am not opposed to optmizing the view more though. I am experimenting with that now.

Thanks,

Chris

Chris Stamey SSCrazy Points: 2635 More actions · Answer 14

Kevin, taking your suggestion and putting the sub-queries into joins, I sped up the return of all records, 16,500, by 25%, Initial 8 seconds and subsequent 6 seconds, vs the old 10 second and subsequent 8 seconds. Query used: Select * from vwlaEmployeeListVPS

With the other query, I found no difference in the return times. Query used:

Declare @NetworkID VARCHAR(255)

Select @NetworkID = 'psridhar@landam.com'

SELECT FirstName as FirstName,LastName as LastName,EmpNumber as UserID,UnitNumber as UnitNumber,EmailAddress as EmailAddress

,WorkPhone as PhoneNumber FROM VHRDevTest.dbo.vwlaEmployeeListVPS2 WHERE EmploymentStatus = 'Active' and EmailAddress = @NetworkID

Now, using either view along with these query optimizers, seen below, I got the speed down to 3 seconds, under repeated runs. Query used:

Declare @NetworkID VARCHAR(255)

Select @NetworkID = 'psridhar@landam.com'

SELECT FirstName as FirstName,LastName as LastName,EmpNumber as UserID,UnitNumber as UnitNumber,EmailAddress as EmailAddress

,WorkPhone as PhoneNumber FROM VHRDevTest.dbo.vwlaEmployeeListVPS2 WHERE EmploymentStatus = 'Active' and EmailAddress = @NetworkID

Option(Maxdop 1, FAST 1)

I just found something that makes it even faster, using FAST 100 instead of FAST 1. This increased the speed to 2 seconds rather than 3. I tried it with both views and got the same performance increase.

Thanks,

Chris