Rebuilding and Reorganizing Indexes

  • SwePeso wrote:

    Execute this to see the fillfactor in use. You would expect 50% right?

    DBCC SHOWCONTIG (tbl_user);

    You are wrong. SQL Server uses 100% no matter what.

    What happens if we rebuild the index directly?

    ALTER INDEX ALL ON tbl_user REBUILD;

    DBCC SHOWCONTIG (tbl_user);

    Now the fillfactor is 50% as expected.

    Thanks for the demo code, Peter.  Glad to "see" you here on SSC again.

    To be sure, I believe you've used the term "FillFactor" in a couple of places where you really meant "PageDensity".  If you view the PK for the table in sys.indexes, you'll clearly see that the index was assigned a "50" FillFactor.   I absolutely agree, though, that INSERTs in SQL Server do not observe the FillFactor (except in one rare case) and will try to fill pages to 100% without regard for the FillFactor.  A Rebuild without a WITH(FILLFACTOR=xx) doesn't assign the FillFactor... it simply uses the FillFactor found in sys.indexes.

    Here's the link that explains the "one rare case" that I was talking about, just in case anyone is interested in that particular "SQL Oolie".

    https://qa.sqlservercentral.com/articles/some-t-sql-inserts-do-follow-the-fill-factor-sql-oolie

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • This was removed by the editor as SPAM

  • Jeff Moden wrote:

    SwePeso wrote:

    Execute this to see the fillfactor in use. You would expect 50% right?

    DBCC SHOWCONTIG (tbl_user);

    You are wrong. SQL Server uses 100% no matter what.

    What happens if we rebuild the index directly?

    ALTER INDEX ALL ON tbl_user REBUILD; DBCC SHOWCONTIG (tbl_user); Now the fillfactor is 50% as expected.

    Thanks for the demo code, Peter.  Glad to "see" you here on SSC again.

    To be sure, I believe you've used the term "FillFactor" in a couple of places where you really meant "PageDensity".  If you view the PK for the table in sys.indexes, you'll clearly see that the index was assigned a "50" FillFactor.   I absolutely agree, though, that INSERTs in SQL Server do not observe the FillFactor (except in one rare case) and will try to fill pages to 100% without regard for the FillFactor.  A Rebuild without a WITH(FILLFACTOR=xx) doesn't assign the FillFactor... it simply uses the FillFactor found in sys.indexes.

    Here's the link that explains the "one rare case" that I was talking about, just in case anyone is interested in that particular "SQL Oolie".

    https://qa.sqlservercentral.com/articles/some-t-sql-inserts-do-follow-the-fill-factor-sql-oolie

    I am not sure where this misunderstanding of fill factor has come from...but it seems that a lot of people believe that it should somehow control how full a page can become as data is entered.  That would be useless as it would permanently exclude a percentage of a page from ever being used and force more page splits.

    Unless I am missing the point outlined here...

    The fill factor is only valuable upon index rebuild or reorganize - such that after the index has been rebuilt or reorganized we leave a percentage of the page available for future inserts and updates.  Those future inserts and updates can then be done on existing pages instead of forcing a page split to allow for those changes.

    This is why it is so important to understand the data access patterns for the table in question - how the data is updated or new data inserted and the effects (or possible effects) of that access.  The goal is to allow the table to be accessed with minimal page splits between index rebuilds - reducing the number of page splits associated with that table during normal business operations.

    Rebuilding at 100% fill factor is useless because the index will just fragment as soon as data is inserted/updated.  Rebuilding at 80% is only useful for a highly transactional table - one that has a very high number of inserts/updates or a wide table that only allows a minimal number of rows.

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • Both you and Peter are correct.  Normally, Inserts do not follow the Fill Factor.  They will almost always try to fill pages to as close to 100% full as the row lengths will allow.

    But, as with all else in SQL Server, "It Depends" and there actually is a case where the first Insert (regardless of the number of rows the Inserts has) WILL, in fact, only fill pages to the depth of the Fill Factor.  It's explained in the article I posted above.

    Why would you ever want to do such a thing?  Staging tables for imports where you know you're going to do "ExpAnsive" Updates or, perhaps, loading a partition.  It will save on having to Rebuild and index when you're done and, in the case of the staging table, it will increase the performance of your "ExpAnsive" Updates because it won't suffer from very "ExpEnsive" page splits during the updates.

    If you don't believe it, read the article and try the code.  I've not tried it in 2019 yet but it works in all versions of T-SQL that I had available when I wrote the article which are 2008, 2012, 2016 and, lately, 2017.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • I can see the advantage of being able to pre-populate a table at the defined fill factor - for those rare cases where it is needed.  I have reviewed your article and was aware of it before, but this isn't about that condition.

    At one time - there was a great push to use identity columns as the clustered index...because it would avoid fragmentation and you could just set it to 100% fill factor and never have to re-index that table...ever.  Of course, we know that is only true for a table that receives sequential inserts in serializable transaction isolation - and is never deleted or updated.

    As soon as you insert multiple rows in a single batch and/or the insert goes parallel - you can cause fragmentation.  As soon as you perform an update that cannot be done in-place (ie: expansive updates) you will get fragmentation, and as soon as you delete a row your page density will decrease (not really fragmentation - but related).

    All of this comes back to the point - you need to understand how a table is accessed and used.  The clustered index needs to be defined to support how it will be utilized and not for how it will be stored.  If that means you need a composite clustered index...then that is what you build - and you then determine an appropriate fill factor to reduce how soon the table will suffer page splits...and schedule your index rebuild of that table to support those requirements.

     

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

  • All true except for one thing... DELETES can actually cause logical fragmentation if the deletes empty a page.  SQL Server is smart enough to rewire the double linking to the previous and next pages, which affects the previous logical page, the now empty page (which can now be reused), the next logical page.  Think of it as a "page-unsplit".  It's less expensive than a page-split because no row data is moved but it still ties up 3 pages (and the related pages in the B-Tree) in a system transaction (and all affected indexes) and, since the deleted page may no longer refer to the next physical page, that gets counted as logical fragmentation, as well.

    Shifting gears a bit, if the pages are only partially emptied, REORGANIZE will try to "compress" pages back up to the Fill Factor in a manner that is much more expensive to the log file than any rebuild even at logical fragmentation levels as only 12% (real life example f0r me on a 147 GB clustered index that only suffers "ever-increasing" inserts and deletes and used 227GB of log file).

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • p.s.  The reason why I brought the subject up is because the "condition" we're talking about is that Inserts don't follow the Fill Factor but there is that one exception that will take you totally by surprise if you're not expecting it.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeffrey Williams wrote:

    At one time - there was a great push to use identity columns as the clustered index...because it would avoid fragmentation and you could just set it to 100% fill factor and never have to re-index that table...ever.  Of course, we know that is only true for a table that receives sequential inserts in serializable transaction isolation - and is never deleted or updated.

    I remember those days!  That was a hard lesson.  We designed every table with a clustered identity as the the PK.  The "busy" tables were in the nightly rebuild/fragment/rebuild/fragment cycle.  We kept reducing the fill factor.  Maintenance took forever.  Through a lot of trial and error, we discovered the causes.

    Michael L John
    If you assassinate a DBA, would you pull a trigger?
    To properly post on a forum:
    http://www.sqlservercentral.com/articles/61537/

  • Jeff Moden wrote:

    p.s.  The reason why I brought the subject up is because the "condition" we're talking about is that Inserts don't follow the Fill Factor but there is that one exception that will take you totally by surprise if you're not expecting it.

    Understood - and yes, it can be a surprise if you run into it and are not expecting it.

    Jeffrey Williams
    Problems are opportunities brilliantly disguised as insurmountable obstacles.

    How to post questions to get better answers faster
    Managing Transaction Logs

Viewing 9 posts - 31 through 38 (of 38 total)

You must be logged in to reply to this topic. Login to reply