Twenty tips to write a good stored procedure

  • I saw Gail's answers and I think she hit the ones I was going to hit, but I'm feeling the need to pile on a bit.

    #8: KEEPFIXED PLAN can prevent some recompiles, but not all, but, remember, recompiles aren't necessarily bad. sp_executesql is not a mechanism for preventing recompiles. It's a mechanism for promoting plan reuse, a completely different topic. Your average EXEC 'somestring' doesn't suffer from recompiles. It suffers from multiple compiles, and no reuse. That's just two different options. But even the use of the hint needs to be very carefully weighed because, again, recompiles are not necessarily bad.

    #11 WHOA! Bad choice. A function on a column like that will prevent an index on that column from ever getting used. Better to reverse it and set the variable to max or min or whatever, or just let it run (if the case insensitive coalation is in use, it's unecessary).

    #13 I agree with the idea of limiting the use of DISTINCT. However, ORDER BY can be a very inexpensive operation that can, in some ways benefit performance. Do a comparison between getting the MAX of a value and getting the TOP(1) of a value where an index is available for the TOP(1) operation.

    #16 Huh?

    #18 Wrong. Temp tables and table variables make use of tempdb and memory in the same way. The difference is statistics, which is why the table variable doesn't cause recompiles. But that same difference makes table variables a very weak choice when dealing with large amounts of data.

    #19 Index scans are not necessarily faster than table scans. It honestly depends. This doesn't change the tip, use proper indexes, which is correct, but as an explanation, it's weak.

    #20 Can we rephrase this to use traces? Profiler itself causes problems when connected to a production system directly.

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • GilaMonster (8/10/2009)


    ta.bu.shi.da.yu (8/10/2009)


    Strange as it may seem, if you run the Oracle Automatic Workload Repository ADDM report then you'll often get Oracle warning not to use the operator. It seems that they say this because it does a full scan against the tables, whereas if you do something like number > 0 and number 0 OR SomeColumn < 0) and seek on an appropriate index. It's still two partial scans of the table but, if there are lots of cases where SomeColumn=0, it's likely better than a full scan of the index or table (likely, because this can be a bit of an edge case as to whether seeks which do partial scans are better than full scans)

    I'm curious... would the selectivity of an index influence the execution plan when an inequality predicate is used? I ask because, tucked in the recesses of my mind, I remember some time ago seeing an obscure SQL Server 2000 performance bug that caused it to understimate the cardinality of certain expressions that used an inequality or range operator.

    Random Technical Stuff[/url]

  • ta.bu.shi.da.yu (8/10/2009)


    I'm curious... would the selectivity of an index influence the execution plan when an inequality predicate is used?

    Possibly. Selectivity's difficult with inequalities, especially when there's parameters or variables in the mix.

    Probably doesn't answer your question, but ... http://sqlinthewild.co.za/index.php/2009/02/06/index-columns-selectivity-and-inequality-predicates/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Yep, Gail wins the timezone war! She is spot on too, as usual.

    I am also surprised this article got the green light.

    I hate it when I'm late to the game.

    The only little bits I would add are:

    Order of operation can help speed a little bit, since you stop checking once of the conditions is met, kinda like the EXISTS issue, but you would have to be using A LOT of them to see any real difference, you may run into the optimizer picking a simple plan before this issue even comes up I'd have to do some testing to prove that out though.

    Table variables return 1 row on estimated plans because they don't have any stats on them. A # table gets stats generated just like any other table.

    DDL to the top of the proc, not dispersed through the procedure helps, the partial recompile work better, so even if you can't avoid a recompile you may not have to recompile the more complicated parts of your proc.

  • Hi,

    Some good suggestions raised here. Another common performance tip is the use of UNION ALL in preference to UNION statement.

    Provided you can guarantee that all the rows will be unique from each select rowset of the union query then you do not need expense of SQL to perform an unnecessary elimination of duplicated rows performed by UNION statement.

    Cheers

    Terry

  • ta.bu.shi.da.yu (8/10/2009)


    I'm curious... would the selectivity of an index influence the execution plan when an inequality predicate is used? I ask because, tucked in the recesses of my mind, I remember some time ago seeing an obscure SQL Server 2000 performance bug that caused it to understimate the cardinality of certain expressions that used an inequality or range operator.

    Yes.

    CREATE TABLE #Employee (emp_id INT NOT NULL, padding VARCHAR(MAX) NULL);

    CREATE INDEX nc1 ON #Employee (emp_id);

    INSERT #Employee SELECT TOP (2500) ROW_NUMBER() OVER (ORDER BY C1.[object_id]), REPLICATE('X', 1000) FROM master.sys.all_columns C1, master.sys.all_columns C2

    SELECT * FROM #Employee E WHERE emp_id 0 -- Table scan

    -- |--Table Scan(OBJECT: ([tempdb].[dbo].[#Employee] AS [E]), WHERE: ([tempdb].[dbo].[#Employee].[emp_id] as [E].[emp_id](0)))

    TRUNCATE TABLE #Employee

    INSERT #Employee SELECT TOP (2500) 0, REPLICATE('X', 1000) FROM master.sys.all_columns C1, master.sys.all_columns C2

    SELECT * FROM #Employee E WHERE emp_id 0 -- Index seek

    --|--Nested Loops(Inner Join, OUTER REFERENCES: ([Bmk1000]) OPTIMIZED)

    -- |--Compute Scalar(DEFINE: ([Expr1003]=BmkToPage([Bmk1000])))

    -- | |--Index Seek(OBJECT: ([tempdb].[dbo].[#Employee] AS [E]), SEEK: ([E].[emp_id] (0)) ORDERED FORWARD)

    -- |--RID Lookup(OBJECT: ([tempdb].[dbo].[#Employee] AS [E]), SEEK: ([Bmk1000]=[Bmk1000]) LOOKUP ORDERED FORWARD)DROP TABLE #Employee

    DROP TABLE #Employee;

    edit: fixed the colon plus bracket smileys 🙂

  • rja.carnegie (8/10/2009)


    Is there a reason besides less typing, distinction between DML and variable operations, another issue that only existed in SQL Server 6.5, or maybe being less compatible with a competitor's SQL product?

    No idea. I don't know the basis or reason for that comment in BoL

    If the BoL entry doesn't say something like "This feature will be removed in a future version of Microsoft SQL Server. Avoid using this feature in new development work, and plan to modify applications that currently use this feature." then the feature is not officially deprecated.

    Likewise you can SELECT @variable = expression [, ...] FROM query, or SET @variable = (SELECT expression FROM query). Now which of those is it that raises an error if there's more than one row... I think it's the second.

    The second. In the first, if there are multiple rows you get the value of one of them. Which row you get is not defined.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Yeah, very weak article. Nothing in here that good developers don't already know, plus a number of errors and just plain bad advice.

  • Wesley Brown (8/10/2009)


    Order of operation can help speed a little bit, since you stop checking once of the conditions is met, kinda like the EXISTS issue, but you would have to be using A LOT of them to see any real difference, you may run into the optimizer picking a simple plan before this issue even comes up I'd have to do some testing to prove that out though.

    You sure about that?

    http://sqlinthewild.co.za/index.php/2007/07/29/order-of-execution/ (old post, needs updating, mostly still valid)

    While there are certain circumstance under which SQL can do position-based short circuiting of conditions in the where clause, under most circumstances the order that conditions are evaluated depends on the available indexes and the plan that the optimiser comes up with.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • ta.bu.shi.da.yu (8/10/2009)


    vetri (8/10/2009)


    Hi,

    I tried with these two statements but the second one is not get executing and it says "Incorrect syntax near the keyword 'exists'." What wrong with this or is this wrong statement?

    SELECT * FROM employee WHERE emp_no NOT IN (SELECT emp_no from emp_detail)

    SELECT * FROM employee WHERE NOT EXISTS (SELECT emp_no FROM emp_detail)

    Change the second one to:

    SELECT * FROM employee WHERE NOT EXISTS (SELECT * FROM emp_detail)

    Not the same thing. to get the equivalance of the first statement in a NOT EXISTS you need to reference the outer table.

    SELECT * FROM employee a WHERE NOT EXISTS (SELECT * FROM emp_detail b WHERE a.emp_no = b.emp_no)



    --Mark Tassin
    MCITP - SQL Server DBA
    Proud member of the Anti-RBAR alliance.
    For help with Performance click this link[/url]
    For tips on how to post your problems[/url]

  • Adrian Green (8/10/2009)


    Overall I've rated this article poorly. Sorry, a "Top Twenty Tips" sounds good (a nice bit alliteration), but unfortunately for a topic as complex as this you can only mislead.

    It doesn't say "Top Twenty Tips". It doesn't say "Twenty Good Tips", but the word "Good" is in there, so a certain expectation remains 😉

  • ronmoses (8/10/2009)


    10. WHERE clauses - In a WHERE clause, the various operators used directly affect how fast a query can run. Here are the conditional operators used in the WHERE clause, ordered by their performance.

    =, >, =, <=, , !=, !>, !' be much further up the performance list to '!<' as they are both mean exactly the same thing.

  • jacroberts (8/10/2009)


    Why should '>' be much further up the performance list to '! is twice as efficient as !< since it uses just one character :laugh:

  • This list hits on the right items but should be refined based on Gail Shaw's comments and a few other. I thought about passing this to our developers but because of a few descrepencies, will not.

  • Here's a question about one of the items. You mentioned that writing "master.dbo.proc_name" is better than just "proc_name", well, what about writing "master..proc_name"? My company does not specify the schema name so it's always "dbo". Is doing the ".." just a shortcut/default to "dbo"?

    Thanks

Viewing 15 posts - 46 through 60 (of 244 total)

You must be logged in to reply to this topic. Login to reply