Performance tuning

  • True, with reporting queries however data is typically being read from a data store or warehouse and not the production OLTP system so the chance of a dirty read is slim. Not being a super tuning guy one thing I've been told is that using 'IN' is not as sargable as 'EXISTS' and as a result doesn't perform as well. That's just what I've been told, I'm not sure if its 100% correct...

    Cheers,

    Ben


    Cheers,

    Ben Sullins
    bensullins.com
    Beer is my primary key...

  • What I see as problem is a fundemental one.

    First from you are making a range selection from your Weblog table and also your WHERE clasue is using the logWritten and the urlStemID columns.

    You will have to check the selectivity of urlStemID and test which is the best solution. My suggestion is a clustered index on logWritten, urlStemID.

    Also I would replace the count(WebLogID) expression by count(logWritten) so the optimiser can serve the query directly from the index

    Of course you should have a clustered index of the URLStem table on the StemText column (altervatively StemText, urlStemID).



    Bye
    Gabor

  • Curious where did you come across information that EXISTS performs better than IN. I have tested EXISTS, IN, ANY and INNER JOIN for these types of queries and only the INNER JOIN perofrms any better under all the circumstances I have available, the others all perform equally as well. So I would just like to see what they meant to know for sure. I just love testing things.

  • Antares,

    Somewhere I have read that EXISTS perform better then IN because the optimiser stops serching at the first occurance when using EXISTS but searches for each cases listed in the IN clause



    Bye
    Gabor

  • Just my 2 cents... My testing shows that following all have identical execution plans and they all take an average of 1004 milliseconds to run...

    DECLARE @Bitbucket  INT

    --===== WHERE EXISTS with correlated sub-query

     SELECT @Bitbucket = t2.RowNum

       FROM JBMTest t2

      WHERE EXISTS (SELECT 1 FROM JBMTest t1 WHERE t1.RowNum <= 1000000 AND t1.RowNum = t2.RowNum )

    --===== Derived table

     SELECT @Bitbucket = t1.RowNum

       FROM JBMTest t1,

            (SELECT RowNum FROM JBMTest WHERE RowNum <= 1000000) d

      WHERE t1.RowNum = d.RowNum

    --===== WHERE IN

     SELECT @Bitbucket = RowNum

       FROM JBMTest

      WHERE RowNum IN (SELECT RowNum FROM JBMTest WHERE RowNum <= 1000000)

    I prefer the readability and separate testability of the Derived table, especially if more than two tables and/or queries are involved, so that's what I use.

    All else being equal, I don't like the WHERE EXISTS at all because the correlated sub-query requires the whole shootin' match to be executed in validation testing/troubleshooting instead of being able to execute the sub-queries separately.  So far as WHERE IN goes, I just think its "bad form" and I don't use it except for adhoc queries that search a list of constants in the sub-query... easier than making a table for the constants...

    Also, the WHERE IN can only be used if a single column is involved... another reason why I don't use it.

    WHERE EXISTS does have a couple of cool uses because of the correlation like returning the top 2 for each "thing"... I just hate troubleshooting correlated sub-queries.

    But, that's just my 2 cents

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff, be careful with your derived table example:

    --===== Derived table

     SELECT @Bitbucket = t1.RowNum

       FROM JBMTest t1,

            (SELECT RowNum FROM JBMTest WHERE RowNum <= 1000000) d

      WHERE t1.RowNum = d.RowNum

    IIRC, Microsoft has stated that this old join syntax will not be supported at some point in the future.  ANSI JOIN syntax is preferred.

    --===== Derived table, ANSI Syntax?

     SELECT @Bitbucket = t1.RowNum

       FROM JBMTest t1 INNER JOIN

            (SELECT RowNum FROM JBMTest WHERE RowNum <= 1000000) d

      ON t1.RowNum = d.RowNum

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • Add the following column to your table:

    Alter Table dbo.WebLog Add ilogWritten As Cast(Convert(varchar, logWritten, 112) as int)

    Then do this query:

    Declare @istartDate int, @iendDate int

    Select @istartDate = Cast(Convert(varchar, @startDate, 112) as int), @iendDate = Cast(Convert(varchar, @endDate, 112) as int)

    Select ilogWritten, Year(logWritten) as [Year], MONTH(logWritten) as [Month],Day(logWritten) as [Day of Month], count(WebLogID) as 'Visits'

    From Weblog wl with(nolock)

    Inner Join URLStem us with(nolock) on wl.urlStemID = us.urlStemID

    Where StemText LIKE '%' + @strSearch + '%'

    and ilogWritten Between @istartDate AND @iendDate

    GROUP BY ilogWritten

    ORDER BY ilogWritten

     

    Also note that I change the first line fo the criteria to: where StemText LIKE '%' + @strSearch + '%'

    That's very important. Without concatenating the wildcard operator to the search parameter, the query will only return results where StemText exactly equals @strSearch.

     


    My blog: SQL Soldier[/url]
    SQL Server Best Practices:
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring[/url]
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management[/url]

  • Thanks... that's gotta be a 2005 thing, huh?  Wondered when they would get around to forcing that issue...

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Indeed it is 2005.  The = (i.e. inner join) still works.  However, per 2005 BOL reference ms-help://MS.SQLCC.v9/MS.SQLSVR.v9.en/tsqlref9/html/36b19e68-94f6-4539-aeb1-79f5312e4263.htm

     

    The FROM clause supports the SQL-92-SQL syntax for joined tables and derived tables. SQL-92 syntax provides the INNER, LEFT OUTER, RIGHT OUTER, FULL OUTER, and CROSS join operators.

    The outer join operators (*= and =*) are not supported when the compatibility level of the database is set to 90.

     

    The error thrown is:

    Msg 4147, Level 15, State 1, Line 3

    The query uses non-ANSI outer join operators ("*=" or "=*"). To run this query without modification, please set the compatibility level for current database to 80 or lower, using stored procedure sp_dbcmptlevel. It is strongly recommended to rewrite the query using ANSI outer join operators (LEFT OUTER JOIN, RIGHT OUTER JOIN). In the future versions of SQL Server, non-ANSI join operators will not be supported even in backward-compatibility modes.

     

    Take careful note of the last sentence in that error message!!

     

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 9 posts - 16 through 23 (of 23 total)

You must be logged in to reply to this topic. Login to reply