TSQL LAB 2 - Writing Conditional WHERE Clauses

  • Something that may have been overlooked and I think the following code could use a slight change to insure it is a unicode string since it is being declared as nvarchar. Each hardcoded string being assigned should have an "N" prefixed to it. Although it might not have an impact in this code it may in other examples where the string is being dynamically created for this kind of thing. Something to be aware of.

    DECLARE @sql NVARCHAR(MAX)

    DECLARE @params NVARCHAR(MAX)

    -- Build the basic query

    SELECT @sql = '

    SELECT *

    FROM Production.Product

    WHERE 1=1 '

    -- Build the parameter list

    SELECT @params = '

    @ProductNumber VARCHAR(20),

    @Color VARCHAR(10),

    @ReOrderPoint INT'

    IF @ProductNumber IS NOT NULL

    SELECT @sql = @sql + ' AND ProductNumber LIKE @ProductNumber '

    IF @Color IS NOT NULL

    SELECT @sql = @sql + ' AND Color LIKE @Color '

    IF @ReOrderPoint IS NOT NULL

    SELECT @sql = @sql + ' AND ReorderPoint = @ReOrderPoint '

    So code should be like this everywhere it is assigning value to the nvarchar variables:

    DECLARE @sql NVARCHAR(MAX)

    DECLARE @params NVARCHAR(MAX)

    -- Build the basic query

    SELECT @sql = N'

    SELECT *

    FROM Production.Product

    WHERE 1=1 '

    -- Build the parameter list

    SELECT @params = N'

    @ProductNumber VARCHAR(20),

    @Color VARCHAR(10),

    @ReOrderPoint INT'

    IF @ProductNumber IS NOT NULL

    SELECT @sql = @sql + N' AND ProductNumber LIKE @ProductNumber '

    IF @Color IS NOT NULL

    SELECT @sql = @sql + N' AND Color LIKE @Color '

    IF @ReOrderPoint IS NOT NULL

    SELECT @sql = @sql + N' AND ReorderPoint = @ReOrderPoint '

    Otherwise from BOL: "Without the N prefix, the string is converted to the default code page of the database. This may not recognize certain characters."

    And it is also a good idea to try to use the correct type with each variable to avoid any issues that implicit conversions may cause.

  • Good article, though I would chnage this query:

    --EXECUTE ProductSearch5 'CA%', 'Black', 375

    CREATE PROCEDURE ProductSearch5

    (

    @ProductNumber VARCHAR(20),

    @Color VARCHAR(10),

    @ReOrderPoint INT

    )

    AS

    SET NOCOUNT ON

    DECLARE @sql NVARCHAR(MAX)

    DECLARE @params NVARCHAR(MAX)

    -- Build the basic query

    SELECT @sql = '

    SELECT *

    FROM Production.Product

    WHERE 1=1 '

    -- Build the parameter list

    SELECT @params = '

    @ProductNumber VARCHAR(20),

    @Color VARCHAR(10),

    @ReOrderPoint INT'

    IF @ProductNumber IS NOT NULL

    SELECT @sql = @sql + ' AND ProductNumber LIKE @ProductNumber '

    IF @Color IS NOT NULL

    SELECT @sql = @sql + ' AND Color LIKE @Color '

    IF @ReOrderPoint IS NOT NULL

    SELECT @sql = @sql + ' AND ReorderPoint = @ReOrderPoint '

    -- execute the query

    EXEC sp_executesql @sql, @params, @ProductNumber, @Color, @ReorderPoint

    To look like this:

    --EXECUTE ProductSearch5 'CA%', 'Black', 375

    CREATE PROCEDURE ProductSearch5

    (

    @ProductNumber VARCHAR(20),

    @Color VARCHAR(10),

    @ReOrderPoint INT

    )

    AS

    SET NOCOUNT ON

    DECLARE @sql NVARCHAR(MAX)

    DECLARE @params NVARCHAR(MAX)

    -- Build the basic query

    SELECT @sql = '

    SELECT *

    FROM Production.Product'

    -- Build the parameter list

    SELECT @params = '

    @ProductNumber VARCHAR(20),

    @Color VARCHAR(10),

    @ReOrderPoint INT'

    IF @ProductNumber IS NOT NULL

    SELECT @sql = @sql + ' AND ProductNumber LIKE @ProductNumber '

    IF @Color IS NOT NULL

    SELECT @sql = @sql + ' AND Color LIKE @Color '

    IF @ReOrderPoint IS NOT NULL

    SELECT @sql = @sql + ' AND ReorderPoint = @ReOrderPoint '

    SET @SQL = REPLACE(@SQL, 'FROM Production.Product AND', 'FROM Production.Product WHERE')

    -- execute the query

    EXEC sp_executesql @sql, @params, @ProductNumber, @Color, @ReorderPoint

    I removed the where clause and added a replace before executing the query.

    That's usually how I do it. Is that wrong or is it 12 of one, half dozen of the other?

  • Is that wrong or is it 12 of one, half dozen of the other?

    That would be SIX of one, half a dozen of the other. šŸ˜‰

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • You can add an unconditional WHERE part.

    ... SET @sql = ' SELECT * FROM Production.Product WHERE 1 = 1 '...

    Now you are sure that every other condition should begin with AND.

    P.S. I'm sorry if someone mentioned this before, I didn't have time to read all the posts, but I wanted to give my small contribution to the subject. Cheers

    P.S.2

    Now I see it was mentioned in the article. :w00t: Sorry!

    ----------------------------------------
    I miss SQL Server so much. Now I'm working on Oracle and hate it

  • Jon Saltzman (9/25/2009)


    Soki (9/25/2009)


    Good article

    Just a small piece of mind, in regards of the performance, SQL Server supports short circuiting, which can be a MAYOR help if you know what kind of filters will be used mostly because you can accomodate your WHERE clause to try to evaluate the minor number of conditions.

    Regards

    Soki G.

    http://www.pwnedgames.com[/quote%5D

    Soki,

    It sure does seem to support short circuiting, but I have seen performance problems on large data sets with doing too much short circuiting (see my Post #793948 above). Do you have any similar experiences?

    Hi, the odd filter seems to be putting quite a lot of overhead in your query, you're making a row by row calculation to retrieve the modulus you're right with huge amounts of data, this is a killer

    In my experience when expecting to handle large volumes of data (say, for DWH 4 example) you should try to distribute the overhead across the whole application, for example, you could make a mod when inserting data to calculate the identity it gets to decide whether it's odd or even and flag it accordingly, this makes your table have an extra column, but since you can use a bit for that it shouldn't be a problem, in BI as you may know, data is almost not normalized, the reason for this is exactly that, joining tables, making calculations on the rows and stuff makes it impossible for a report to be shown in a reasonable time, so there's a lot of redundant info that serves no purpose but making retrieval easy.

    Of course this comes with a price, for the example given before, if your deletions are physical and not logical and you need to re calculate the odd and even positions you have to make a proc to recalculate all this in the event of a del, which is analog to what is done in BI when you launch cubes regeneration.

    As for the second where clause, if you're expecting the value of date filter to only be F ot T, no need to check the 'T' value, this is insignificant probably but in the name of optimizing as much as possible, it should be noted šŸ˜›

    Hope it helps, au revoir!

    Soki

  • Soki,

    Thank you for responding and your comments! Here is a slightly different (self contained) example that I think highlights my question better:

    --build a temp table with an integer field that is a primary key, clustered index

    create table #t (intfield int primary key);

    insert into #t values (1);

    insert into #t values (2);

    insert into #t values (3);

    declare @operator_switch bit;

    declare @check_int int;

    set @operator_switch = 1;

    set @check_int = 2;

    --uses index scan, but is statically typed

    select

    t.intfield

    from

    t

    where

    (

    (@operator_switch = 1 and t.intfield = @check_int)

    or (@operator_switch = 0 and t.intfield > @check_int)

    )

    --uses index seek, but is dynamic

    declare @sql nvarchar(2000);

    set @sql = '

    select

    t.intfield

    from

    t

    where

    (t.intfield ' + (case when @operator_switch = '1' then '=' else '>' end) + convert(varchar,@check_int) + ')'

    exec sp_executesql @sql;

    drop table #t;

    You should be able to just copy and paste into a query window and try it.

    There are two queries, the first one is the way I'd like to write the conditional statement so that it is statically typed and I don't have to resort to dynamic sql. I cannot omit the "@operator_switch = 0" part from the second clause or the query would then "OR" the two conditions and return too much information.

    The first select query uses a scan, the second a seek. I know it's asking a lot of the query optimizer, but I wish it would realize that, in the first select statement, there are two logic branches that the query can go down, and optimize both of them so that it would use the seek. But that is also asking a lot of the optimizer.

    Any more thoughts?

    Jon

  • SQL's string manipulation performance is good enough for generating dynamic SQL.

    Its when you are manipulating (particularly concatenating) strings for every record in large tables you need to decide where to put it.

  • Nice article, Jacob.

    In terms of performance, there should be no difference between EXEC and sp_executesql when no parameters are used. When params are used with sp_executesql, its a different story because the QO is able to sniff the param value and create a plan optimized for that parameter. If the data distribution in the table is about the same for each param you send in, then everyone wins. However, if you have some outliers, ones that should perform a scan operation instead of a seek, then the plan in cache would not be optimal but would be reused.

  • Excellent article, and to think I almost didn't read it. I got several elegant solutions to common problems, just a few more tools in my toolbet.

    I am not completely clear on the sp_executesql and how it prevents SQL Injection, so you have given me some more studying to do as well.

    Thanks,

  • Swirl80 (9/25/2009)


    the issue i have with procedures like these is that the developers at our place create a procedure for all eventualities which is fine BUT, a lot of these optional where clauses require a join to a table that is not necessary unless that option is selected. Therefore, we have some queries with several LEFT JOIN's that are not necessary for certain selections and therefore causing additional reads when not necessary. This can cause chaos on very large tables.

    Don't get me wrong, i don't mind this method, but only on small data sets. Anything large then i recommend seperate stored procedures, purely from a performance point of view, as stated earlier this can have additional administrative overhead in having to say update 6 procs as opposed to 1 but if it means the process runs in less than half a second as opposed to 10s then i'm happy to do that!

    If you are writing separate stored procedures for each case, it would be easy enough to leave out unnecessary joins.

    I'm wondering if anyone has used stored procedure groups for this, and how they affect performance/maintenance.

    For example:

    CREATE PROCEDURE ProductSearch;1

    @ProductNumber VARCHAR(20) =NULL,

    @Color VARCHAR(10) =NULL

    AS

    /**

    * ProductSearch;1

    *

    * Description: This is the default API that can be called for product searches.

    */

    BEGIN

    SET NOCOUNT ON

    DECLARE @RETURN_VALUE int

    SET @RETURN_VALUE = 1

    IF @ProductNumber IS NULL AND @Color IS NULL

    RAISERROR('Your must specify a product number and/or a color to search for products.', 10, 1)

    ELSE IF @ProductNumber IS NULL

    EXEC @RETURN_VALUE = ProductSearch;2 @Color

    ELSE IF @Color IS NULL

    EXEC @RETURN_VALUE = ProductSearch;3 @ProductNumber

    ELSE

    EXEC @RETURN_VALUE = ProductSearch;4 @ProductNumber, @Color

    RETURN @RETURN_VALUE

    END

    GO

    CREATE PROCEDURE ProductSearch;2

    @Color VARCHAR(10)

    AS

    /**

    * ProductSearch;2

    *

    * Description: Executes a product search based on product color

    */

    BEGIN

    SET NOCOUNT ON

    IF @Color IS NULL BEGIN

    RAISERROR('You searched for products by color but did not specify a color to match.', 10, 1)

    RETURN -2

    END

    SELECT *

    FROM Production.Product

    WHERE Color = @Color

    RETURN(0)

    END

    GO

    CREATE PROCEDURE ProductSearch;3

    @ProductNumber VARCHAR(20)

    AS

    /**

    * ProductSearch;3

    *

    * Description: Executes a product search based on product number

    */

    BEGIN

    SET NOCOUNT ON

    IF @Color IS NULL BEGIN

    RAISERROR('You searched for products by product number but did not specify a product number to match.', 10, 1)

    RETURN -3

    END

    SELECT *

    FROM Production.Product

    WHERE ProductNumber = @ProductNumber

    RETURN(0)

    END

    GO

    CREATE PROCEDURE ProductSearch;4

    @ProductNumber VARCHAR(20),

    @Color VARCHAR(10)

    AS

    /**

    * ProductSearch;4

    *

    * Description: Executes a product search on both product number and color

    */

    BEGIN

    SET NOCOUNT ON

    IF @ProductNumber IS NULL OR @Color IS NULL BEGIN

    RAISERROR('You searched for products by product number and color but did not specify a product number or color to match.', 10, 1)

    RETURN -4

    END

    SELECT *

    FROM Production.Product

    WHERE ProductNumber = @ProductNumber

    AND Color = @Color

    RETURN(0)

    END

    GO

    Note: You cannot seem to (right-click) "Modify" these procedures in SSMS like regular procedures, but then if you have the script stored somewhere it is easy enough to drop and recreate the procedure with any changes that need to be made.

    --Andrew

  • I had to write up a fairly detailed query based on search criteria supplied by the user (invoices on a website). The first time I wrote it up using CASE statements in the WHERE clause and found the execution time to be horrendous, sometimes up to 3 minutes to look up an invoice.

    I rewrote it using dynamic SQL using sp_executesql and it was magnitudes faster, generally taking between 1 and 5 seconds to return a record.

    I've never used the ISNULL method, so I can't speak to it's speed, but I can tell you from experience that sp_executesql is the fastest method I've found.

  • Yep, makes sense. The when using the OR clause the QO has a hard time using indexes as intended. Using ISNULL() or COALESCE() is even worse (worse because the QO is able to make some assumptions if the underlying field is not nullable). Dynamic SQL side-steps this because you're able to only specify the criteria needed, leaving the QO a clean path to determine the best plan.

  • dbowlin

    I am not completely clear on the sp_executesql and how it prevents SQL Injection, so you have given me some more studying to do as well.

    Using sp_executeSQL, queries don't have to be dynamically built with entire strings supplying constant values the way they have to be built with EXEC. Instead queries can be built to reference parameters. A mechanism is provided for passing parameters when sp_executeSQL runs. This is all well documented in BOL. Passing parameters that contain semicolons doesn't behave the same way as building a query from strings which contain semicolons.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Additional options relative to the section on conditional WHERE clause without using a Dynamic Query

    The COALESCE function can be used in a similar fashion as you are describing how to use the LIKE ISNULL() combination:

    WHERE

    ...

    AND column = COALESCE(@param, column)

    ...

    As COALESCE returns the first non-null expression from its list of arguments, if a value is passed in, the column will be compared to the parameter. If there is no value passed, the param will be overlooked and the column will be compared to itself effectively neutering that portion of the WHERE CLAUSE. This still leaves the issue identified in the article of how this type of construction does not work for certain types of comparisons but this can be solved easily and the following approach may be applied to the examples in the article as well.

    What we are ultimately trying to achieve is a condition that will always be true if there is no value passed in for a parameter. This is accomplished by defaulting the value to which the column is being compared to itself. If the comparison operator does not include some component of equality in its evaluation, this will not succeed. Since you are aware of the type of comparison you are performing, you can massage the default value appropriately to yield a true condition:

    WHERE

    ...

    AND inventory > COALESCE(@quantity, inventory-1)

    ...

    Of course, depending on the data type of the column in question, you may have to be a little more creative than this but I would think this approach should always be effective.

    WHERE

    ...

    AND hiredate < COALESCE(@eventDate, DATEADD(d, 1, hiredate)

    AND lastname < COALESCE(@quantity, lastname + '_')

    AND alphacode > COALESCE(@item, '')

    ...

    Iā€™m not sure what, if any, performance issues might exist with COALESCE compared to the CASE, IS NULL/OR options but it, I think, it is very easy to read.

    After posting, I saw the additional pages in the thread that I did not notice the first time and see that COALESCE has already been mentioned. Hopefully the extra bit on comparison operators not containing some form of '=' will still be helpful to some.

  • I observed that L. Van Staalduinen and JDCook73 already approached the case of attributes with NULL values, therefore I hope I'm coming with something new.

    I'm using both approaches based on the requirements I have, the greatest benefit of handling dynamic behavior with non-dynamic queries is that the logic could be included also in an inline table-valued function and additional filters could be applied on values not included as parameters (see for example http://sql-troubles.blogspot.com/2010/02/just-in-case-part-v-dynamic-queries.html). I found this later approach quite useful when working with SSRS.

    If I'm not mistaking dynamic queries are by definition a problem for the database engine from the performance point of view, therefore between two bad approaches we have to find the less bad or better said, the one which proves to be perform better in a number of situations. On the other side we have to balance between performance and usability, flexibility, maintenance, reusability, security and whatever might come around.

    Over the time I found/learned several techniques so I could provide dynamic behavior with non-dynamic queries. For example if the ReorderPoint contains null values, then I handle this with a query like:

    SELECT *

    FROM Production.Product

    WHERE IsNull(ReorderPoint, 0) = COALESCE(@ReorderPoint, ReorderPoint, 0)

    Now it depends also on the behavior I want and the meaning of parameters and attributes' value. If the parameter is null and I want to show all the records in the query, then the above query will do in case the compared attribute is a foreign key. If the compared attribute is a quantity and greater than 0 then the above query works, the problem comes when the values could be negative too. In such cases could be used the extreme values the data types takes, and this could be applied especially in combination with <, > or BETWEEN operators.

    The IN operator is a little trickier to handle but not impossible, for example the list of values could be parsed and retrieved using a table-valued function, something like:

    IsNull(ProductModelID, -1) IN (SELECT value FROM dbo.SplitList(IsNull(NullIf(@ProductModelIDs, ''), CAST(IsNull(ProductModelID, -1) as varchar(20)) ), ','))

    This could provide some performance increase for big tables and list of values when a primary key is defined on the returned table, though this implies that the list of values contains unique values or that the duplicates are eliminated in the function itself.

Viewing 15 posts - 76 through 90 (of 106 total)

You must be logged in to reply to this topic. Login to reply