Out of date Statistics

  • I am a developper who at times in my career had to fill the  accidental dba role as well; and I have a disagreement with my dba.

    I need your help either to make me understand my dba point of view or give me some substantiation to my claims.

    We have a complicated setup on which I don't make the decisions, so here it goes:

    • We have a database which is replicated everyday through a job using backup and restore
    • the original database has all the insert and update done
    • my copy is used for reporting
    • I can add indexes in the replication job

    So I have a table with an index that covers all the columns of my join in my stored procedure, and my stored procedure grinds to a halt a few weeks ago. I checked the statistics and sure enough, the "automatic" statistics (_WA_Sys_something) were out of date by 6 months and there has been quite significant inserts in that table since (10 to 20% increase roughly). I update them and my stored procedure works as fast as before. The query plan uses my index and it goes fast enough.

    According to my dba, _WA_Sys_[...] are all but useless because they were created with a SELECT with no covering index and so, if all indexes were properly done, we would'nt need those. So he doesn't want to update them.

    Unlike my dba, I get the impression that the optimizer uses all the data available and if a statistic implies there is a better way than using the current indexes, it will try to do so. So to me it is worth having a maintenance scheduled that updates the statistics, statistics won't just mask the fact that indexes are missing but it will help the optimizer use my indexes (although it might mask missing indexes as well I suppose).

    What do you think?

  • I hate blanket statements like what you say your DBA said.  If he said that, he's wrong.  Building enough indexes to negate the need for ANY such column statistics would like require at least twice as much disk space and probably memory space, as well.  The "system-built" stats need to be properly maintained until he somehow experiences the impossible utopia of having enough indexes to cover all queries.  Of course, it would take a month of Sundays to do inserts and many of the updates, but whatever.  Of course, the stats rebuilds on all of those supporting indexes would probably take even longer. 😀

    As with all else, though, "It Depends".  He might have some esoteric knowledge of the system that we don't (but I can't imagine what it might be unless they used some software that creates stats on every bloody column like my predecessors did and I got rid of both the software and the stats to let them "occur" naturally where needed).  It is unfortunate that some DBAs try to develop "panacea practices" for those things where a panacea doesn't come close to solving the real world problems.  Supposed "Best Practices" index maintenance is another one of those same types of mistakes (but has a much more serious impact with multiple tentacles that many folks don't know about).

    Also, stats dates can be seriously misleading.  For example, should I rebuild stats on a reference table that's NEVER inserted into or updated just because the stats date says that stats were last rebuilt a decade ago?  No.  You should also include the "RowModCounter" in such evaluations.

    It'll be tough to convince your DBA to deviate from what he's doing because he sounds a bit entrenched in a particular opinion so you might just want to make rebuilding stats a part of your restore/copy code.

     

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Thanks for your answer.

    You are probably right, I should include it in my backup/restore routine. I tried the sp_updatestats procedure and it did a fine job in a reasonable time, but I need to study the subject further.

    It feels like a lot of waisted time though, rebuilding the stats everyday on my database when it could be done every other week on the origin database: this is a mature system and except for the odd burst of inserts in an obscure table, the inserts are rather homogenous duringthe year. It is probably why we didn't have more problems with this.

    I didn't find many resources conerning the system built stats, if anyone has interesting links, I'd be glad to learn more on the subject.

  • You, sir, are preaching to the choir.  I wish your DBA understood the actual value of those system generated column stats.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply