Statistics Update Frequency

  • What's the best way to know which index statistics need to be updated on a daily basis?  I currently update stats on Friday, but I've noticed SQL is updating some of my stats automatically.  The issue is the auto update of stats doesn't do a full scan, so it causes slowness and bad query plans.   What's the best method to figure out which tables i need to manually do full scans of stats on a daily basis so SQL server doesn't need to do an auto update of stats?

  • "Tell" SQL to always use a full table scan for those indexes that need it.  Do that by running this one time at your convenience:

    -- index_name is optional, of course

    UPDATE STATISTICS  table_or_indexed_view_name   [index_name]  WITH FULLSCAN, PERSIST_SAMPLE_PERCENT = ON;

    From then on, even an automatic update of stats should do a full scan ... unless you TRUNCATE the table, which wipes out all the stats settings.

    You may want to consider using ASYNC (deferred) for stats using a FULLSCAN if you need to do this on large tables in the database.  Most likely you can avoid ASYNC without undue delays to queries, but look into it yourself anyway, before it bites you.  The async option, unfortunately, applies to the entire db, not just specific table(s)/index(es), so you'll need to consider that setting very carefully.  [Hopefully in the future MS will take this option down to at least the table level.]

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • Sadly, there isn't a magic formula that nails this. Some individual statistics may need to be updated multiple times a day. Others may never need to be updated manually. It totally depends on the data, the distribution, the frequency of updates, the type of updates, etc.. There just isn't a hard and fast rule I'm aware of. The only thing I would say is, be very flexible and be willing to update statistics as needed. Don't try to force an entire system to behave one way. It just won't work well.

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • Jackie Lowery wrote:

    What's the best way to know which index statistics need to be updated on a daily basis?  I currently update stats on Friday, but I've noticed SQL is updating some of my stats automatically.  The issue is the auto update of stats doesn't do a full scan, so it causes slowness and bad query plans.   What's the best method to figure out which tables i need to manually do full scans of stats on a daily basis so SQL server doesn't need to do an auto update of stats?

    I agree with Scott on persisting the sample percent.

    As to your other inquiry about how to determine which tables need full scans, I'd say any indexes that are based  on an "ever-increasing" key, such as IDENTITY, SEQUENCE, something temporal, or NEWSEQUENTIALID(), generally fit the bill unless they're static.  The indexes that are based on SEQUENCE might be a little tough to find because they don't necessarily use anything that's easily detectable, such as a default with a sequence.

    A good deal of the tables we have at our shop are necessarily based on columns that have the IDENTITY property and Clustered Indexes to match (we're working on that but that's a different subject that I'd rather not get into on this thread).  While many will take exception to it, because of that fact, I've adopted the "if it moves, shoot it" philosophy on nightly statistics rebuilds and rebuild statistics on anything that's had a change for row modifications (which includes inserts).  Yes, it takes a while.  Yes, it takes some CPU (fortunately, I'm not hurting there).  But it has helped and it's one more thing that I don't have to worry about or put up with if something is missed.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • I didn't know of the new persisting sample percent feature.  Thanks for that info.  I found a query that shows the date statistics were updated and days since last update.  I did a full scan on the warehouse tables yesterday (they get a lot of updates) and my plan is to check daily to see which tables receive a new auto stats update within a days time.  That should tell me which tables i need to do a manual full daily stats update on.   Does that logic sound right?  It also looks like i could use the modification_counter column to see which stats are changing the most.

    SELECT DISTINCT
    OBJECT_NAME(s.[object_id]) AS TableName,
    c.name AS ColumnName,
    s.name AS StatName,
    STATS_DATE(s.[object_id], s.stats_id) AS LastUpdated,
    DATEDIFF(d,STATS_DATE(s.[object_id], s.stats_id),getdate()) DaysOld,
    dsp.modification_counter,
    s.auto_created,
    s.user_created,
    s.no_recompute,
    s.[object_id],
    s.stats_id,
    sc.stats_column_id,
    sc.column_id
    FROM sys.stats s
    JOIN sys.stats_columns sc
    ON sc.[object_id] = s.[object_id] AND sc.stats_id = s.stats_id
    JOIN sys.columns c ON c.[object_id] = sc.[object_id] AND c.column_id = sc.column_id
    JOIN sys.partitions par ON par.[object_id] = s.[object_id]
    JOIN sys.objects obj ON par.[object_id] = obj.[object_id]
    CROSS APPLY sys.dm_db_stats_properties(sc.[object_id], s.stats_id) AS dsp
    WHERE OBJECTPROPERTY(s.OBJECT_ID,'IsUserTable') = 1
    AND (s.auto_created = 0 ) -- OR s.user_created = 1)
    and s.name not like '[_]%'
    and OBJECT_NAME(s.[object_id]) like 'ws%'
    ORDER BY DaysOld;

    • This reply was modified 3 years, 6 months ago by  Jackie Lowery.
  • That seems about right.  I'd say also consider doing tables every day that get a manual update in 2 days' time; they're close enough to once-a-day anyway to me.  "Better safe than sorry", as Jeff effectively said.

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • Jeff Moden wrote:

    Jackie Lowery wrote:

    What's the best way to know which index statistics need to be updated on a daily basis?  I currently update stats on Friday, but I've noticed SQL is updating some of my stats automatically.  The issue is the auto update of stats doesn't do a full scan, so it causes slowness and bad query plans.   What's the best method to figure out which tables i need to manually do full scans of stats on a daily basis so SQL server doesn't need to do an auto update of stats?

    I agree with Scott on persisting the sample percent.

    As to your other inquiry about how to determine which tables need full scans, I'd say any indexes that are based  on an "ever-increasing" key, such as IDENTITY, SEQUENCE, something temporal, or NEWSEQUENTIALID(), generally fit the bill unless they're static.  The indexes that are based on SEQUENCE might be a little tough to find because they don't necessarily use anything that's easily detectable, such as a default with a sequence.

    A good deal of the tables we have at our shop are necessarily based on columns that have the IDENTITY property and Clustered Indexes to match (we're working on that but that's a different subject that I'd rather not get into on this thread).  While many will take exception to it, because of that fact, I've adopted the "if it moves, shoot it" philosophy on nightly statistics rebuilds and rebuild statistics on anything that's had a change for row modifications (which includes inserts).  Yes, it takes a while.  Yes, it takes some CPU (fortunately, I'm not hurting there).  But it has helped and it's one more thing that I don't have to worry about or put up with if something is missed.

    > Yes, it takes some CPU (fortunately, I'm not hurting there).  <<

    Agreed, CPU is not a concern for me either.  But the I/O required to fully scan large tables definitely is.  We've got tables with 100Ms to 1.4B rows, or so, and thus doing a full scan of them to update stats is not a trivial decision.

    On the one hand, I've corrected all the "automatically cluster by identity" on large tables, so I don't have specific issue.  On the other hand, many of the new clus keys are naturally ascending as well, such as a parent's identity or a datetime, so the same basic concern is still there.

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • Yeah... there's always something when it comes to stats.  Like you said, ascending date columns can be quite the rub, as well.  We have a whole lot of tables that are like yours... a billion rows has become a bit more normal than I'd have ever liked it to become... heh... especially when a lot of folks think that 10 rows for testing is "good enough" especially in a lot of the articles that you and I have both seen over the years.  It's why I took the stance of almost always generating a million rows of test data for articles.  That now considered tiny compared to real life in today's world.

    I do wish MS would make stats a bit more intelligent.  Seems like it should be able to do more than one or a couple of columns based on whatever stats exist.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply