If it looks like a duck...

  • Hi all,

    I've been working on some queries that lately have been causing us headaches. As a general rule we try to stay away from index and join hints in our queries, but sometimes it seems like SQL just takes a leave of it's senses. (keeps us DBA's in business <g>)

    Our most precious resource at the moment is cache memory and so we're optimizing for number of reads and not necessarily for execution time. SQL often appears to favor cpu time over disk.

    The simplest example I have are two tables joined together, one is 9097 rows and the other is 28. SQL's default plan is a loop join, which results in one scan and 662 reads from the large table but 9097 scans and 9097 reads from the smaller one. Forcing an "inner hash" join results in one scan of each table, 662 reads from the large and one read from the small. The CPU time was also reduced in half. Yet, QA's execution plan shows a 24%/76% split on cost, with the 24% going to the query with the loop join. From this it would seem that the forced plan is 3x more expensive than the default one. By my reckoning, 663 reads is a whole lot more optimal than 9759.

    So I guess I'm interested in others' experience. If forcing a hash join results in lower CPU time, lower total reads, and thinner arrows in the execution plan, even though the estimated query cost is higher, is it a better query?

    Thoughts? Thanks!

    Randy Rabin

    ChannelAdvisor Corp.

    Edited by - randy rabin on 05/13/2003 08:50:50 AM

    Edited by - randy rabin on 05/13/2003 08:53:48 AM

  • As an FYI, the Query Optimizer leans heavily towards the execution plan that results in the least amount of time to complete. That may be at the expense of CPU, I/O, or memory usage.

    As far as whether it is a better query, it depends on the typical load on your server. Which resource is underutilized? If you have a plethora of CPU cycles going to waste, you can tailor the queries towards heavier CPU use. In my experience, I/O has always been the more precious resource.


    David R Buckingham, MCDBA,MCSA,MCP

  • IOs aint IOs.

    There are two physical types of IOs - direct and sequential - and other terminology depending on what function you are using to look at it.

    Firstly, the popup information in a query analyzer plan is always an estimate, even the "actual" plan.

    If you set ConnectionOptions ShowIO, you get logical, physical, and read-ahead reads.

    Logical reads include reads satisfied from buffer so not a good performance measure.

    Read-ahead reads are mostly sequential reads and physical reads are mostly direct reads, but nnot completely.

    The problem is that sequential reads are much faster than direct reads. So 1,000 sequential reads could be faster than 100 direct reads. SQLServer reflects this by reads in table scans are sequential, and are assigned lower cost, index/bookmark reads are more likely direct reads.

    My observation is that the cost estimates favours sequential/scan a bit too much, but you cannot easily compare the different types. My rough guide is 8/1, but possibly better with very good defrag.

    See Disk IO Performance in

    http://msdn.microsoft.com/library/techart/msdn_sql7perftune.htm

    (I think there is a SQL2000 version of this but the principles are the same)

    In particular, a looped tablescan of a small table will probably have all the data in memory once it has done the first scan, which is why it may be cheaper despite having much higher "logical" reads.

    The downside of tablescans is they fill up memory with data you may not have wanted, and that may be relevant to you, beyond the performance of each query individually.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply