Updating Stats

  • I'm setting up a nightly job to update stats on largest tables. What are the implications for increasing or decreasing frequency of that job?

    Thanks

  • Updating stats should be done at less trafic time to avoid blocking. Instead of doing update stats i would recommande to find out the fragmentation level and do the reorganize/rebuild accordingly. If necessary we can have the update stats for the column statisticsilding the indexes would also update the stats on the indexes by default.).

  • I would rather disagree slightly with sqlzealot-81. Index rebuild/reorganize should be part of the maintenance plan, but so should be updating the statistics. Index rebuild/reorganize is required to make the execution fast, through minimizing the traversing of the index pages etc. but the most updated statistics would make sure that optimizer opts an optimal plan. So this key difference should always be kept in mind. Moreover, fragmented index should not mean that the stats are out of date and vice versa.

    As far as blocking is concerned, it depends upon Auto Update Statistics option. If it is set to ON on the database, then we have further two options

    - Auto Update Statistics basically means, if there is an incoming query but statistics are stale, SQL Server will update statistics first before it generates an execution plan.

    - Auto Update Statistics Asynchronously on the other hand means, if there is an incoming query but statistics are stale, SQL Server uses the stale statistics to generate the execution plan, then updates the statistics afterwards.

    @Lexa

    it really depends upon your environment. If the insert/update/delete in huge tables is not 10%(I think this is the percentage but I am not sure) of the size of the table (Mcrosoft has worked on this issue in SQL 2012) , and your query execution starts taking longer, then the statistics might be getting out of date (estimated rows vs actual rows in execution plan can point it out). Then you should start sorting out that when you should update your statistics (system will slow down), after how much time the same situation occurs if you have updated the statistics etc etc. So as I said before it really depends upon the environment. But one thing for sure, very frequent statistics update OR very rare statistics update, both are not good for any environment especially when it comes to huge tables.

  • sqlzealot-81 (10/31/2011)


    Updating stats should be done at less trafic time to avoid blocking. Instead of doing update stats i would recommande to find out the fragmentation level and do the reorganize/rebuild accordingly. If necessary we can have the update stats for the column statisticsilding the indexes would also update the stats on the indexes by default.).

    Update stats basically run with NOLOCK so it can't possibly block anything. That beind said, if it's scanning a huge table it might be hitting the hds, ram & cpu pretty hard.

    It's very rare to have stats that are better left un-updated. The major problem is usually that they don't update often enough.

    In my little world I have the luxury to both do reindexing (only what's required) & updatestats full scan daily. Never had an issue with updating too much.

  • Usman Butt (11/1/2011)


    As far as blocking is concerned, it depends upon Auto Update Statistics option. If it is set to ON on the database, then we have further two options

    - Auto Update Statistics basically means, if there is an incoming query but statistics are stale, SQL Server will update statistics first before it generates an execution plan.

    - Auto Update Statistics Asynchronously on the other hand means, if there is an incoming query but statistics are stale, SQL Server uses the stale statistics to generate the execution plan, then updates the statistics afterwards.

    I'm not sure I'd consider that blocking. It's more like an extra wait. That beind said if it saves me from bad parameter sniffing or even just 1 bad plan, I can wait that extra 0.5 sec a couple times during the day. On most systems it's going to be under 1-10 ms.

    I can see a point where async makes sense if it takes 20 minutes to update the stats, but that table would need a crapload of data in it before that would happen.

  • Ninja's_RGR'us (11/1/2011)


    Usman Butt (11/1/2011)


    As far as blocking is concerned, it depends upon Auto Update Statistics option. If it is set to ON on the database, then we have further two options

    - Auto Update Statistics basically means, if there is an incoming query but statistics are stale, SQL Server will update statistics first before it generates an execution plan.

    - Auto Update Statistics Asynchronously on the other hand means, if there is an incoming query but statistics are stale, SQL Server uses the stale statistics to generate the execution plan, then updates the statistics afterwards.

    I'm not sure I'd consider that blocking. It's more like an extra wait. That beind said if it saves me from bad parameter sniffing or even just 1 bad plan, I can wait that extra 0.5 sec a couple times during the day. On most systems it's going to be under 1-10 ms.

    I can see a point where async makes sense if it takes 20 minutes to update the stats, but that table would need a crapload of data in it before that would happen.

    Yeah, I agree. If you see I was disagreeing, but somehow did not use the proper words there. Now, I realize that it meant wrong. Thanks for pointing it out.

  • Not disagreeing, just adding a 2nd point of view ;-).

  • I disagree to disagree your agreeing to second point of view and agree to disagree your point of view as disagreeing :hehe: Must be time for me to get a hot coffee 😉

  • adding my 2 cents..

    on SQL 2008 R2, I've setup a job to UpdateStats on 1 database that is 250GB in used space (400 GB alloc) w/ many compressed objects -- it executes every 3 hours (4:45am - 4:45pm) and it has not impeded anything.

    The job inlcudes 1 step comprised of this T-SQL:

    USE myDBname exec sp_updatestats

    GO

    BT
  • Express12 (11/1/2011)


    adding my 2 cents..

    on SQL 2008 R2, I've setup a job to UpdateStats on 1 database that is 250GB in used space (400 GB alloc) w/ many compressed objects -- it executes every 3 hours (4:45am - 4:45pm) and it has not impeded anything.

    The job inlcudes 1 step comprised of this T-SQL:

    USE myDBname exec sp_updatestats

    GO

    It's different than updatestats will fullscan. It filters out some stats based on many conditions. Also it doesn't update with full scan. There's another algorithm for that.

    That version is less intrusive but I agree, it shouldn't hurt during prod hours (tho it's possible it might slow the server down).

  • So basically you cannot "over" update statistics, right?

  • Lexa (11/2/2011)


    So basically you cannot "over" update statistics, right?

    Well yes. If you were to update stats every 5 seconds, the plans of the updated objects would never stay in cache (or continuously flushed). That could cause cpu spikes and eventually a bad plan could sneak in. The vast majority of the time the new plan will be good, but everytime you recompile you run the risk of having bad parameter sniffing.

  • Ninja's_RGR'us (11/1/2011)


    sqlzealot-81 (10/31/2011)


    Updating stats should be done at less trafic time to avoid blocking. Instead of doing update stats i would recommande to find out the fragmentation level and do the reorganize/rebuild accordingly. If necessary we can have the update stats for the column statisticsilding the indexes would also update the stats on the indexes by default.).

    Update stats basically run with NOLOCK so it can't possibly block anything. That beind said, if it's scanning a huge table it might be hitting the hds, ram & cpu pretty hard.

    It's very rare to have stats that are better left un-updated. The major problem is usually that they don't update often enough.

    In my little world I have the luxury to both do reindexing (only what's required) & updatestats full scan daily. Never had an issue with updating too much.

    Seems a bit late but just returning from vacations..see this as pending notification.. Can you please elaborate the NOLOCK behind the scene process? What if some rows are being inserted/deleted in the table while updating the stats? What if we try to change the data type of a column from varchar(100) to varchar(50)? Will these be blocked OR NOT?

    I think NOLOCK may not be right word to use and some blocking (Extra wait as you said 🙂 ) may occurs.

  • I've pinged Gail for this one as she knows more about those internals.

    AFAIK, the lock taken by update stat is a schema lock. Meaning you can't alter the object you're working on (like dropping it or changing columns definition).

    In a normal prod environement with controled change management, that will never cause any issues.

    Gail, feel free to tear apart what I just said!

  • Also keep in mind that stats is a guess tool with partial scan of the data (unless requiring fullscan).

    So nolock makes sense here since all you want in a good idea of what's in there. There's no point in blocking others from accessing or editing the data in the table for that task.

    Also you're probably not going to make a life or death decision based on those numbers :-)... the only time when nolock can make sense!

Viewing 15 posts - 1 through 15 (of 33 total)

You must be logged in to reply to this topic. Login to reply