Question about using functions in SELECT statements

  • In quite a few stored procedures I use the same business logic to calculate prices. So each time the customer requires another refinement I have to modify every stored procedure containing the logic for calculating the prices.

    I would very much like to encapsulate the logic in a function, but I am afraid that this would hurt the performance. A very simple example:

    1. SELECT 1 + 1 FROM a table containing 150,000 rows: almost 0 seconds

    2. SELECT dbo.Calc(1,1) FROM a table containing 150,000 rows: 3 seconds, where dbo.Calc RETURNS @parm1 + @parm2
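
    For reference, a minimal sketch of what the dbo.Calc scalar function in example 2 might look like (the body and parameter names are assumed from the description in the post):

    ```sql
    -- Hypothetical reconstruction of the scalar UDF from example 2.
    CREATE FUNCTION dbo.Calc (@parm1 int, @parm2 int)
    RETURNS int
    AS
    BEGIN
        RETURN @parm1 + @parm2
    END
    GO

    -- Invoked once per row of the table, unlike a constant expression:
    -- SELECT dbo.Calc(1, 1) FROM SomeTable
    ```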

    Based on this I am inclined to believe that I cannot use a function.

    Does anyone have an idea of how to place the logic in one place?

    Thanks for any input. Gerry.


    Dutch Anti-RBAR League

  • It depends on your application.  If this is in a reporting environment, and queries can take several minutes to run, then what difference does a few seconds make?  If it's an OLTP application, then you need to ask why you need to calculate the prices across 150,000 rows UNLESS the business need is to have the price displayed for all 150,000.

    Even if prices can change every few minutes, you could have a job that runs to calculate this aggregate (I'm assuming this is some kind of aggregation) and update a table which your stored procedures reference.

    Just some thoughts for a Monday morning.

  • It is an OLTP app; I used 150,000 rows to demonstrate the difference in performance. Normally it would be something like 1,000 rows.

    The calculation is quite complex (lots of percentages, CASE statements, dependent on settings in multiple tables, currency issues, etcetera). It is used in an order entry application, and for each customer I need to display the stock with the correct prices.

    Thank you for your feedback.

    Gerry, on a late Monday afternoon


    Dutch Anti-RBAR League

  • If the update rate is not TOO high (that means test, test and test),

    I would calculate at INSERT/UPDATE time and save the results in a DENORMALIZED field!

    So when you run the SELECT across 150,000 records, computations are out of the question!
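
    As a rough illustration of this approach (the table, column, and function names here are invented), a trigger can keep the denormalized field current:

    ```sql
    -- Hypothetical sketch: persist the calculated price at INSERT/UPDATE
    -- time so SELECTs read a plain column instead of recomputing it.
    CREATE TRIGGER trg_OrderLine_CalcPrice ON dbo.OrderLine
    FOR INSERT, UPDATE
    AS
        UPDATE ol
        SET    CalculatedPrice = dbo.Calc(ol.Quantity, ol.UnitPrice)
        FROM   dbo.OrderLine ol
        JOIN   inserted i ON i.OrderLineID = ol.OrderLineID
    GO
    ```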

    HTH

     


    * Noel

  • I agree with the above comments. Also, I've found that using a UDF instead of an "inline" calculation is almost always slower (I have not found it to be faster, ever... but who knows). In my case there are some "applications" of code where performance is the top priority, and in those areas I've relegated myself to "inline" calculation coding as you have in your SPs.

    The only other thing I can offer is to have the calculation code in one place, and then (re)generate the SPs using the calculation, i.e. have an SP that ALTERs your SPs.

    Or, if possible, isolate the calculation code from the various SPs using it, and call a single SP from all the others to perform the calculation. This may only work if you have pretty much the same result-set queries in all your various SPs.

    If you do end up using a UDF, take some time to tweak the complex calculation using something like the following to get exact millisecond timing differences between UDFTest1 and UDFTest2.

    DECLARE @Notes varchar(2000), @Start datetime

    -- time the first version
    SELECT @Start = GetDate()
    SELECT Count(*), dbo.UDFTest1() FROM Table
    SELECT @Notes = 'Test1: ' + Convert(varchar(10), DateDiff(ms, @Start, GetDate()))
    RAISERROR (@Notes, 0, 1) WITH NOWAIT

    -- time the second version
    SELECT @Start = GetDate()
    SELECT Count(*), dbo.UDFTest2() FROM Table
    SELECT @Notes = 'Test2: ' + Convert(varchar(10), DateDiff(ms, @Start, GetDate()))
    RAISERROR (@Notes, 0, 1) WITH NOWAIT



    Once you understand the BITs, all the pieces come together

  • "In quite a few stored procedures I use the same business logic to calculate prices. So each time the customer requires another refinement I have to modify every stored procedure containing the logic for calculating the prices."

    Maybe I'm missing something, but I would start by placing the business logic in one single procedure, as this should basically almost always be the same calculation process, and reuse this procedure by nesting it inside other procedures.

    As has been mentioned above, I bet you won't find a case where a UDF beats a set-based approach on performance.
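
    A sketch of this nesting idea (table and column names are made up; the real pricing rules would replace the placeholder SELECT):

    ```sql
    -- One procedure owns the pricing logic...
    CREATE PROCEDURE dbo.CalcPrice
        @CustomerID int,
        @ProductID  int,
        @Price      money OUTPUT
    AS
        -- placeholder for the real pricing rules
        SELECT @Price = p.BasePrice * (1.0 - c.DiscountPct / 100.0)
        FROM   dbo.Product p
        CROSS JOIN dbo.Customer c
        WHERE  p.ProductID  = @ProductID
        AND    c.CustomerID = @CustomerID
    GO

    -- ...and every other procedure nests it instead of repeating the logic:
    -- DECLARE @Price money
    -- EXEC dbo.CalcPrice @CustomerID = 1, @ProductID = 2, @Price = @Price OUTPUT
    ```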

    --
    Frank Kalis
    Microsoft SQL Server MVP
    Webmaster: http://www.insidesql.org/blogs
    My blog: http://www.insidesql.org/blogs/frankkalis/

  • Try to avoid doing the calculation in the SELECT statement if the calculation's result is used a lot and the query touches thousands of rows again and again.

    Instead, try to add one more column to the table (if all the columns in the calculation reside in the same table). The column uses a "formula" (a computed column, available in table design in Enterprise Manager) so the value is derived automatically on insert or update.

    If there are columns from different tables, try to add one more table with foreign key constraints to all the related tables, and add one more column like the one explained above.

    Now you can save hundreds of seconds when selecting the result!
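
    A minimal sketch of the computed-column idea (names invented; the "formula" box in Enterprise Manager generates this same kind of definition):

    ```sql
    -- Hypothetical example: LineTotal is derived automatically whenever
    -- the row is inserted or updated, so SELECTs never recompute it.
    CREATE TABLE dbo.OrderLine (
        OrderLineID int IDENTITY NOT NULL PRIMARY KEY,
        Quantity    int   NOT NULL,
        UnitPrice   money NOT NULL,
        LineTotal   AS (Quantity * UnitPrice)   -- computed column
    )
    ```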

    Hope this will help.



    Regards,
    kokyan

  • First of all, thank you for your suggestions.

    Some solutions will not work. The prices are, amongst others, dependent on the selected customer, products, various product specifications, the contents of the package, and the package itself. Each customer can have different prices.

    Frank notes: "(...) I would place the business logic in one single procedure (...) and reuse this procedure by nesting it into other procedures." and "(...) I bet you won't find a case where a UDF beats a set based approach on performance." I agree with both, but I am not convinced that nesting will not hurt the performance. Besides, I would need a temp table to store the stock, so that the nested SP can calculate the prices. I will test that.

    The suggestion of ThomasH: "(...) have the calculation code in 1 place, and then (re)generate the SPs using the calculation, i.e. have an SP that ALTERs your SPs." is also something that needs to be tried.

    Some nice ammunition to tackle this problem, eh, challenge.

    Thanks again. Gerry.

     


    Dutch Anti-RBAR League

  • Gerry,

    There is something else you should think about, specifically related to object oriented concepts.  Putting your complex code in a function makes a whole lot of sense when you're talking about code re-use.  This is especially important when dealing with business rules in the data tier; all business rules should be disconnected from the data access code as much as possible.  Currently the way to do this in SQL is through functions.

    Personally, I've found that well-developed functions will perform quite satisfactorily if they are written right.  Don't reference objects outside of the function (if you do, use SCHEMABINDING).  Don't embed functions in functions or make them recursive (use WHILE loops instead).  Also, deterministic functions perform better than non-deterministic ones, especially in procedures, so that's what you're shooting for.

    The test you reference in your post is not accurate.  Adding 1 + 1 can be executed once for the whole batch (effectively a constant), while adding Value1 and Value2 yields a result that needs to be computed for each row.  The function that simulates "1 + 1" would read as:

    create function Calc1and1 ()
    returns int
    as
    begin
     return 1 + 1
    end

    This function will perform the same as adding 1 + 1 in your SELECT, as it takes no parameters and can be executed once for the whole dataset.

    I've included a more accurate test below.  Functions were created for SQL for EXACTLY the reasons you described.  Maintaining business rules has never been easier.

    -----------------------------------------------

    if object_ID('tempdb..##Table') is not null drop table ##Table
    create table ##Table (Value1 int identity not null, Value2 int)
    go

    -- fill the table with 150,000 rows; IsNull covers the first row,
    -- where @@Identity is still NULL
    while 1 = 1
     begin
     insert ##Table values (IsNull(@@Identity, 0) * 2)
     if @@Identity = 150000 break
     end
    go

    create function dbo.TEMPCalc (@Value1 int, @Value2 int)
    returns int
    as
    begin
     return @Value1 + @Value2
    end
    go

    create function dbo.TEMPCalc1and1 ()
    returns int
    as
    begin
     return 1 + 1
    end
    go

    -- the four variants to compare:
    SELECT Value1 + Value2 FROM ##Table
    SELECT dbo.TEMPCalc(Value1, Value2) FROM ##Table
    SELECT 1 + 1 FROM ##Table
    SELECT dbo.TEMPCalc1and1() FROM ##Table

     

     

    Signature is NULL

  • Calvin, thanks for your reply. I did ignore the multi-tier aspect you pointed out. But alas, developers would like an OO implementation, and customers demand the best performance. And I'm caught in between.

    You were right about my inaccurate test, but the results did not differ much with your functions.

    Anyway, the search goes on...

    Gerry


    Dutch Anti-RBAR League

  • Gerry,

    Yeah, there's always a trade-off.  I've found that overall, OO programming does not perform as well as set-based procedural programming like in SQL.  I'm all about performance, which is why I love SQL.  However, when code needs to be maintained or updated, OO really shines.

    Something I've had luck with in the past is using functions that return a table data type.  These tend to perform much better than a scalar function, as they take advantage of SQL's set processing.  This does make the function less re-usable, but it does encapsulate your code, which would be the main attraction here.
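
    An illustrative sketch of such a function (inline form, with invented names; the optimizer expands an inline table-valued function into the calling query, which is why it keeps set-based performance):

    ```sql
    -- Hypothetical: encapsulates pricing logic while staying set-based.
    CREATE FUNCTION dbo.CustomerPrices (@CustomerID int)
    RETURNS TABLE
    AS
    RETURN (
        SELECT p.ProductID,
               p.BasePrice * (1.0 - c.DiscountPct / 100.0) AS Price
        FROM   dbo.Product p
        CROSS JOIN dbo.Customer c
        WHERE  c.CustomerID = @CustomerID
    )
    GO

    -- Usage: SELECT * FROM dbo.CustomerPrices(42)
    ```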

    There's usually a way to have your cake and eat it too.  Maybe an index or two will speed up your query enough to allow you to use a scalar function.  The more preprocessing you can do on the data, the better your lookups will be.  A SELECT is an inefficient place to put business logic; maybe think about putting it in an UPDATE or INSERT instead.

    Lots of options... Good luck!

    cl

    Signature is NULL
