Performance on a large table

  • Hello,

    I have a database with a table (meterdata) that holds a little over 70 million rows. New rows are added constantly: every 15 minutes, between 500 and 800 energy readers write their latest value into that table.

    There is a SELECT statement that runs over the data once in a while and takes between 06:50 and 08:00 minutes.

    Now I am in the lucky position of having an extra server that is doing nothing at the moment. I restored a backup of the database onto it, and I can play around with indexes, compression, partitioning, etc. in different scenarios. This is mainly educational for me, to learn how to approach this kind of issue, because the users are not complaining... yet. However, the production server does show a memory issue (PLE drops) when the query is being run.
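    A quick way to watch PLE while the query runs is the standard counter query below (a minimal sketch; what counts as a "healthy" PLE depends on how much memory the box has, so treat the number as a trend rather than a threshold):

    SELECT object_name, counter_name, cntr_value AS ple_seconds
    FROM sys.dm_os_performance_counters
    WHERE counter_name = 'Page life expectancy'
        AND object_name LIKE '%Buffer Manager%';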

    SQL 2005 - default -- 07:42 (current production)

    My test environment: SQL 2008 Enterprise, same hardware specs as the SQL 2005 production server.

    SQL 2008 - default -- 07:01

    SQL 2008 - MAXDOP 1 -- 06:58

    SQL 2008 - row compression -- 06:52

    SQL 2008 - partitioned (4 filegroups, 4 files: <2010, <2011, <2012, and the rest), without indexes -- 07:10

    SQL 2008 - partitioned (same scheme), with the recommended index -- 07:04 (a DDL sketch of these scenarios follows below)
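    For reference, the DDL behind the compression and partitioning scenarios above would look roughly like this. This is a sketch only: the function, scheme, and filegroup names are made up, and it assumes the table is partitioned on the ActualTimeStamp column.

    -- Hypothetical names; assumes filegroups FG2009, FG2010, FG2011, FGCURRENT already exist.
    CREATE PARTITION FUNCTION pfMeterDataByYear (datetime)
    AS RANGE RIGHT FOR VALUES ('2010-01-01', '2011-01-01', '2012-01-01');

    CREATE PARTITION SCHEME psMeterDataByYear
    AS PARTITION pfMeterDataByYear
    TO (FG2009, FG2010, FG2011, FGCURRENT);

    -- Row compression, as in the "row compression" test above.
    ALTER TABLE dbo.MeterData REBUILD
    WITH (DATA_COMPRESSION = ROW);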

    The query:

    SELECT
        "dbo"."TreeView"."ParentID",
        "dbo"."MeterData"."MeterID",
        "dbo"."MeterData"."ActualTimeStamp",
        "dbo"."MeterData"."Value"
    FROM
        "dbo"."TreeView",
        "dbo"."Meter",
        "dbo"."MeterData",
        "dbo"."Meter" "MS1"
    WHERE
        (((((("dbo"."TreeView"."ObjectTypeID" = 3) AND ("dbo"."TreeView"."ObjectID" = "dbo"."Meter"."MeterID"))
        AND ("dbo"."Meter"."MeterTypeID" = 9)) AND ("dbo"."TreeView"."ObjectID" = "dbo"."MeterData"."MeterID"))
        AND ("dbo"."TreeView"."ObjectID" = "MS1"."MeterID")) AND ("MS1"."SaveData" = 1))

    The rows returned: 53,518,235

    I have attached the latest execution plan and a meterdata.txt file containing the definition of the meterdata table and its current index.

    The plan is from this run:

    (SQL 2008 - partitioned (4 filegroups, 4 files: <2010, <2011, <2012, and the rest), with the recommended index -- 07:04)

    I am looking for suggestions: how can I improve this, and what are the possibilities here?

  • Buy a bigger server.

    Seriously, what end user would actually look at 53 million rows??

    You need to redesign the process to return less data (or summarized data); see the sketch below.
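    As an illustration of what "summarized" could look like: a sketch that rolls the 15-minute readings up to hourly averages (table and column names come from the query in the first post; the choice of hourly AVG is an assumption):

    SELECT
        md.MeterID,
        DATEADD(hour, DATEDIFF(hour, 0, md.ActualTimeStamp), 0) AS HourBucket,
        AVG(md.Value) AS AvgValue
    FROM dbo.MeterData AS md
    GROUP BY
        md.MeterID,
        DATEADD(hour, DATEDIFF(hour, 0, md.ActualTimeStamp), 0);

    That alone cuts the result set by a factor of four, and the app can still draw the same trend line.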

  • What is consuming the data?

    The plan estimates the data size to be ~500 MB, but it has also underestimated the number of rows by a factor of 2, so scaling that estimate up, the final result set should be ~1 GB.

    I would hazard a guess that that is your bottleneck, not SQL Server. Are you seeing the task waiting on NETWORK_IO?
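    An easy way to check is to look at the running request from a second session while the query executes (a minimal sketch; filter further on the session_id SSMS shows for your query window):

    SELECT session_id, status, wait_type, wait_time, last_wait_type
    FROM sys.dm_exec_requests
    WHERE session_id > 50; -- skip system sessions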

    Dave



    Clear Sky SQL
    My Blog

  • For that much data (both in rows and GBs), 8 minutes is not slow by most standards.

    The app also has to consume the data, so that will slow down the process even if the server and the network are not bottlenecks.

  • Statistics might be out of date. A nested loop join on 53 million rows is pretty frightening. You have full optimization, so the optimizer thinks this is the right way to go, but it sure looks messed up to me.
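    A quick way to rule that out is a manual statistics refresh (a sketch; a FULLSCAN on a 70-million-row table will take a while, so a sample may be the pragmatic choice):

    UPDATE STATISTICS dbo.MeterData WITH FULLSCAN;
    -- or, cheaper on a table this size:
    UPDATE STATISTICS dbo.MeterData WITH SAMPLE 10 PERCENT;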

    BTW, you're using ANSI-89 syntax. I'd suggest updating that. It makes the code more readable, and you avoid issues with OUTER joins.
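    For reference, the same query in explicit join syntax (logically equivalent to the comma-join version above; the aliases are just for readability):

    SELECT
        tv.ParentID,
        md.MeterID,
        md.ActualTimeStamp,
        md.Value
    FROM dbo.TreeView AS tv
    INNER JOIN dbo.Meter AS m ON tv.ObjectID = m.MeterID
    INNER JOIN dbo.MeterData AS md ON tv.ObjectID = md.MeterID
    INNER JOIN dbo.Meter AS ms1 ON tv.ObjectID = ms1.MeterID
    WHERE tv.ObjectTypeID = 3
        AND m.MeterTypeID = 9
        AND ms1.SaveData = 1;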

    ----------------------------------------------------
    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood..." - Theodore Roosevelt
    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • Grant Fritchey (1/4/2012)


    Statistics might be out of date. A nested loop join on 53 million rows is pretty frightening. You have full optimization, so the optimizer thinks this is the right way to go, but it sure looks messed up to me.

    BTW, you're using ANSI-89 syntax. I'd suggest updating that. It makes the code more readable, and you avoid issues with OUTER joins.

    +1, but with only a 2X error on the stats they don't seem completely out of whack.

    Even if the join operator changes, I don't see that query going down by more than 1 minute, if that.

  • Ninja's_RGR'us (1/4/2012)


    Grant Fritchey (1/4/2012)


    Statistics might be out of date. A nested loop join on 53 million rows is pretty frightening. You have full optimization, so the optimizer thinks this is the right way to go, but it sure looks messed up to me.

    BTW, you're using ANSI-89 syntax. I'd suggest updating that. It makes the code more readable, and you avoid issues with OUTER joins.

    +1, but with only a 2X error on the stats they don't seem completely out of whack.

    Even if the join operator changes, I don't see that query going down by more than 1 minute, if that.

    Yeah, probably. Returning that much data to the users is just crazy.

    ----------------------------------------------------
    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood..." - Theodore Roosevelt
    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • OK, thanks for the feedback so far.

    Some answers:

    Seriously, what end user would actually look at 53 million rows??

    I have no idea; they run some kind of trend analysis on the energy data to see how their usage is developing.

    IMHO, if they recorded a meter's value only once an hour instead of every 15 minutes, the row count would decrease by 75%, and over time they would still have a nice trend.

    What is consuming the data?

    The trend-analysis app. But in my case I am running the query in SSMS and dumping the result to a file; this is mostly a learning exercise for me.

    I would hazard a guess that that is your bottleneck, not SQL Server. Are you seeing the task waiting on NETWORK_IO?

    Yes, and CXPACKET. I did a rerun with MAXDOP 1 and then it's only ASYNC_NETWORK_IO.

    @Grant Fritchey

    Thanks for the feedback; I will take a look at the code and make the changes.

  • CXPACKET is not a real wait, especially not in this query.

    If you want to toy with this, you can use a few join hints to see the difference (merge & hash); see the sketch below. I wouldn't expect much, but you never know.
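    For example, a query-level hint forces the operator for every join (a sketch reusing the rewritten query from above; run once with HASH JOIN, once with MERGE JOIN, and compare):

    SELECT
        tv.ParentID, md.MeterID, md.ActualTimeStamp, md.Value
    FROM dbo.TreeView AS tv
    INNER JOIN dbo.Meter AS m ON tv.ObjectID = m.MeterID
    INNER JOIN dbo.MeterData AS md ON tv.ObjectID = md.MeterID
    INNER JOIN dbo.Meter AS ms1 ON tv.ObjectID = ms1.MeterID
    WHERE tv.ObjectTypeID = 3 AND m.MeterTypeID = 9 AND ms1.SaveData = 1
    OPTION (HASH JOIN); -- or OPTION (MERGE JOIN)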

    53M rows is 50 times what Excel can handle. You guys need to rethink your solution.

  • That being the case, maybe a simple SSIS task would work better for you here?



    Clear Sky SQL
    My Blog

  • Dave Ballantyne (1/4/2012)


    That being the case, maybe a simple SSIS task would work better for you here?

    How would this go in anything less than 6 minutes in SSIS?

    How would SSIS correct the design flaw?

  • Ninja's_RGR'us (1/4/2012)


    Dave Ballantyne (1/4/2012)


    That being the case, maybe a simple SSIS task would work better for you here?

    How would this go in anything less than 6 minutes in SSIS?

    How would SSIS correct the design flaw?

    It's not going to help *much*, and certainly not solve anything, but SSMS is slow *IMO* at consuming results and drawing the screen. If all the OP is doing afterwards is right-clicking and choosing Save As, then it may go down to 5 minutes or so.

    In fact, OP, as an experiment, try turning on the 'discard query results' check box; what is the time difference?



    Clear Sky SQL
    My Blog

  • Dave Ballantyne (1/4/2012)


    Ninja's_RGR'us (1/4/2012)


    Dave Ballantyne (1/4/2012)


    That being the case, maybe a simple SSIS task would work better for you here?

    How would this go in anything less than 6 minutes in SSIS?

    How would SSIS correct the design flaw?

    It's not going to help *much*, and certainly not solve anything, but SSMS is slow *IMO* at consuming results and drawing the screen. If all the OP is doing afterwards is right-clicking and choosing Save As, then it may go down to 5 minutes or so.

    In fact, OP, as an experiment, try turning on the 'discard query results' check box; what is the time difference?

    I am using 'results to file' so I don't have that overhead... On a side note, it will not even complete if I choose any other method, as it (SSMS, that is) will run out of memory.

  • Marco V (1/4/2012)


    Dave Ballantyne (1/4/2012)


    Ninja's_RGR'us (1/4/2012)


    Dave Ballantyne (1/4/2012)


    That being the case, maybe a simple SSIS task would work better for you here?

    How would this go in anything less than 6 minutes in SSIS?

    How would SSIS correct the design flaw?

    It's not going to help *much*, and certainly not solve anything, but SSMS is slow *IMO* at consuming results and drawing the screen. If all the OP is doing afterwards is right-clicking and choosing Save As, then it may go down to 5 minutes or so.

    In fact, OP, as an experiment, try turning on the 'discard query results' check box; what is the time difference?

    I am using 'results to file' so I don't have that overhead... On a side note, it will not even complete if I choose any other method, as it (SSMS, that is) will run out of memory.

    It won't run out of RAM if you discard the results.

    If your powerful dev machine can't handle that much data, what will the users' machines do! 😉

  • Marco V (1/4/2012)

    I am using 'results to file' so I don't have that overhead... On a side note, it will not even complete if I choose any other method, as it (SSMS, that is) will run out of memory.

    *Might* still be worth a try; it is SSIS's job to move data around, while SSMS does it as a side issue.

    I haven't done any benchmarking in this area previously, and I wouldn't imagine anyone else has either.



    Clear Sky SQL
    My Blog
