WHERE filter is killing my query against a SQL Server 2005 database

  • I have this query that was extracted from a Crystal report that used to run against a SQL Server 2000 database. The problem is that in SQL Server 2000 the query took 5 seconds to run, and in SQL Server 2005, it takes close to 2 hours to run! Doing a database compare between the two, one of the tables in the 2005 database has two extra columns and 1 extra constraint (related to one of the new columns). These extra columns are not being used in the query though. I looked at the execution plans between the two servers and though the query is identical, the execution plans are quite different. The majority of the time is taken up by parallelism and a hash match. On the 2000 server the row count for the hash match is 419,947 with a CPU cost of 2.68. On the 2005 server teh row count is 216,709,800 with a CPU cost of 946.26! Why the big difference?

    Here is the query:

    SELECT top 1000000

    ei_tc_doc_wbs_current.prj_code,

    ei_tc_doc_wbs_current.phase_code,

    ei_tc_doc_wbs_current.task_code,

    emp_info.emp_alpha_name,

    prj_task.prj_code,

    ei_tc_doc_wbs_current.per_end_date,

    ei_tc_doc_wbs_current.org_code,

    ei_emp_group_code.emp_group_name

    FROM

    ei_tc_doc_wbs_current ei_tc_doc_wbs_current

    INNER JOIN emp_info emp_info ON ei_tc_doc_wbs_current.emp_code=emp_info.emp_code

    LEFT OUTER JOIN prj_task prj_task ON ei_tc_doc_wbs_current.prj_code=prj_task.prj_code AND ei_tc_doc_wbs_current.phase_code=prj_task.phase_code AND ei_tc_doc_wbs_current.task_code=prj_task.task_code

    INNER JOIN ei_tc_doc_head ei_tc_doc_head ON emp_info.emp_code=ei_tc_doc_head.emp_code

    INNER JOIN ei_emp_group_map ei_emp_group_map ON ei_tc_doc_head.emp_code=ei_emp_group_map.emp_code

    INNER JOIN ei_emp_group_code ei_emp_group_code ON ei_emp_group_map.lbr_group_code=ei_emp_group_code.emp_group_code

    WHERE

    ei_tc_doc_wbs_current.per_end_date<{ts '2009-11-28 00:00:00'}

    --AND prj_task.prj_code IS NULL

    I set it to cap the record pull to 1,000,000 records. Upon further investigation, the line under the where clause "AND prj_task.prj_code IS NULL" is what is killing the query. On the 2000 server I get the million rows in 19 seconds without that filter, and 182 rows in 6 seconds with the filter. On the 2005 server I get the million rows in 15 seconds without the filter, and with the filter, it's time to go see a movie.

  • First off check your statistics are up to date.

    Also can you post the execution plans?



    Clear Sky SQL
    My Blog[/url]

  • Please post table definitions, index definitions and execution plan, as per http://qa.sqlservercentral.com/articles/SQLServerCentral/66909/

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • What statistics and how do I check them?

    I'll post the execution plans in table format.

  • Statistics are used by the optimizer to get the 'fastest' query plan.

    Try this link

    http://www.developer.com/db/article.php/3622881/Basics-of-Statistics-in-SQL-Server-2005



    Clear Sky SQL
    My Blog[/url]

  • Thanks for the link GilaMonster. I'll get all that information together and post it this afternoon. I'll let the query run for the two hours so I can get the actual execution plan.

  • OK, I've (hopefully) attached everything I need to. I included table and view definitions, row counts, and execution plans.

    sql server 2000.zip contains information related to the older database I mentioned above that doesn't have a problem running that query. 20 - 24 seconds to return a million rows without the filter, 4-6 seconds with the filter.

    sql server 2005.zip contains information related to the newer database. 15-10 seconds to return a million rows without the filter, a good movie's worth of time with the filter (took 2 hr 4min to get the actual execution plan for it).

    Let me know if there was anything I forgot to submit.

    Thanks again!

  • The estimated row counts differ quite badly from the actual row counts, generally this is a sign of out of date statistics.

    There is also a big logical difference in the queries. You may have only added a filter (although it really an anti filter to test for non existent rows in the outer join) the difference is in the amount of rows SQLSever has to look at to get the top 100000 rows.

    If you look at the actual row counts in the nonfiltered query plan one of the hash match operators has stopped at 100000 rows. Where as the filtered it has built the entire result set (2402614682 rows) and filtered the results of that.

    If you remove the top clause from the non filtered query does it take the same time to execute as the filtered ?

    Im guessing it does.

    So the question now is , what is your real business query ?

    With top , or is that only for testing ?

    Or with filter ?



    Clear Sky SQL
    My Blog[/url]

  • This is insane. There's a nested loop join between thousands of rows. I cannot see why that loop join is there, it's the worst kind of join for those row counts.

    Can you do a stats update, with fullscan on all of the tables involved, see if that makes any difference to the plan. Is this a recent 2000-2005 upgrade?

    UPDATE STATISTICS <Table Name> WITH FULLSCAN

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Also in your view ei_tc_doc_wbs_current.

    The subquery 'b' , doesnt seem to have any practical use.

    The values arent used ?

    -- EDIT Scrub that....

    It used to get the row with the max(wbs_seq_nbr) value.

    It *may* be more efficient to use the ROW_NUMBER() function for that.

    --EDIT2

    Shouldnt a.line_nbr = b.line_nbr ?



    Clear Sky SQL
    My Blog[/url]

  • The tables are part of a database from an accounting package that we purchased some time ago. The definition changes from the 2000 to the 2005 are the result of thier upgraded software which required SQL Server 2005 this time. So far the whole upgrade process has gone great (we are deploying this weekend) except for this one report. The Crystal Report is a custom one created in house, but the individual that created it is long gone. Attached is the real business query exported out of Crystal Reports along with a screen shot of it's table links (from Crystal's database expert). The actual query does include the anti filter and doesn't include the top clause. The changes I made to the query were made so that it was easier on my eyes and also because I wanted to see what kind of difference it would make. The results were the same. I'm guessing with the top clause gone and the filter gone, it would take the same amount of time to run also (I'll test that later today).

    To answer the lastest post Dave, I'm not sure. I do know that any changes we make to the vendors stuff voids our support contract with them. They have made that explicitly clear. But, they are usually very helpful and if there is something they did in err, I can bring it up to them and as long as I can demonstrate the issues their error can cause, they would probably fix it for their next patch. We are allowed to add stuff to the database, however. We can create custom tables, views, stored procedures, etc.... One of my thoughts was to see what kind of a difference it would make to create a view to the prj_tasks table with the filter in place there. Then join that view to the others instead of joining the prj_tasks table.

    I'm going to update the statistics as GilaMonster recommend and then rerun the query to get a new execution plan. I'll post back with the results.

    I appreciate the help!

  • So this is a very recent upgrade? In that case you absolutely must run a statistics update on all tables. The stats from SQL 2000 are not great and the 2005 optimiser doesn't use them properly.

    You also should do a few other things:

    Change page verify to Checksum

    Run a checkDB with data purity

    Run DBCC UpdateUsage

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Yes, this is a very recent upgrade. Now that I understand what the statistics are (sort of 😉 ), I'll run those on the tables in question first to see if it fixes my problem. As for the other things you mentioned, I'll have to look them up and then run them because I'm not sure what they are.

    Thanks!

  • Is there a way to tell if statistics was already updated?

    If it's very important to update statistics, then I would be very surprised if it wasn't already taken care of by one of their upgrade scripts.

  • Run the stats updates regardless. It's one of the things that has to be done with a 2000-2005 upgrade, it's also the one thing that so many people don't do. The stats will auto update eventually, but you're risking poor performance until then

    You can use the Stats_Date function to cehck when the stats were updated, but there's no harm from updating a second time and it shouldn't take all that long.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass

Viewing 15 posts - 1 through 15 (of 33 total)

You must be logged in to reply to this topic. Login to reply