Rebuild or Reorganize index

  • I am using SQL server 2008.

    The DB is normally used from 9 am-9 pm and the maximum db size is 10 GB.

    Whether rebuild index offline causes any problem, if it is done during midnight (when no one is using db)?

    or reorganize is enough for index maintenance in this case?

  • Either should work fine. And with that small size, I'm tempted to say that it will work fine, even if you do nothing at all, assuming that the database is on SSD. Or for that matter, you have RAM enough to fit the database.

    [font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

  • You shouldn't routinely rebuild or reorganize indexes just because you can.

    Far more important is to insure that you have the best clustered index on every existing table.  Naturally that requires some analysis.

    As part of that, you review all existing indexes as well.  Often many nonclus indexes can be removed once you get the proper clus index.

    If you are on Enterprise Edition, not likely but possible, you should consider compressing the data as well.

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • I'll get quite a bit of argument from a lot of folks but I've found that REORGANIZE frequently does NOT do a very good job at fixing page density problems because it simply doesn't work the way people think it does.  Yep... it'll fix logical order but that won't last long because it doesn't fix page density problems nor does it make any room between the assigned Fill Factor and 100% full.  Rather, it compresses the pages up to the Fill Factor.

    Another way of describing that action is that it removes critical freespace just when it's needed the most and that perpetuates page splits, which is the primary cause of both logical fragmentation and "physical" fragmentation, which is the cause of low page density.

    It can also easily cause extreme usage of the log file.  For example, I had (it's grown since the test) a 146GB Clustered Index that was only 12% logically fragmented, which was restored from a backup on prod to a test box.  When I used REORGANIZE on it, it caused my log file to grow from 20GB to 227GB, which is bigger than the index by a whole lot!  Reseting the test with more restores, a REBUILD of the same index in the FULL Recovery Model "only" consumed about 146GB of log file (as expected) and took about the same amount of time as the REORGANIZE.  Doing the same test but in the BULK LOGGED Recovery Model (if you can use BULK LOGGED), only caused the log file to grow from 20 to 37GB and only took 12 MINUTES instead of the hour and 21 minutes the other two methods took.

    I'll also tell you that you NEVER want to use REORGANIZE on indexes that are based on Random GUIDs, period.

    For me, I'll just never use REORGANIZE unless it's to compress LOBs and it even sucks at that.  In one experiment, it took 10 runs or REORGANIZE to get some out of row LOBs back in decent shape (which doesn't actually need to be done) and never fixed the problems that in row lobs cause because it can't.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Jeff Moden wrote:

    It can also easily cause extreme usage of the log file.  For example, I had (it's grown since the test) a 146GB Clustered Index that was only 12% logically fragmented, which was restored from a backup on prod to a test box.  When I used REORGANIZE on it, it caused my log file to grow from 20GB to 227GB,

    Then you did not have any transaction-log backups going at the same time. With REBUILD (unless you run it as resumable) the entire operation is a single transaction, and you cannot keep down the log file. REORGANIZE works with a lot of small transactions, so there are possibilities to keep the log in check.

    But, yes, depending in which way your index is disorganised, REORGANIZE can mean a lot of operations, as it performs a bubble sort of the file.

    [font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

  • Way back in the day, Brad Macgeehee did a bunch of testing around reorganize and rebuild. What he found is that the majority of the time, reorganize almost didn't do anything good for the index at all, yet, cost about the same in terms of processing time & resources. This is especially true since, even if it does result in a few pages less on the index, the statistics aren't updated at the end of the process.

    Personally, I'd never reorganize any indexes. I would only, ever, rebuild or not. And for rowstore indexes, I'd lean towards not rebuilding at all (based on Jeff's research actually). Columnstore (not an issue here) is different. Rebuilds are needed occasionally to clear out the delta store there.

    If all you're going to do is reorganize, then make sure you also have a plan for statistics maintenance since you won't be getting any stats updates.

    ----------------------------------------------------The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood... Theodore RooseveltThe Scary DBAAuthor of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd EditionProduct Evangelist for Red Gate Software

  • Erland Sommarskog wrote:

    Jeff Moden wrote:

    It can also easily cause extreme usage of the log file.  For example, I had (it's grown since the test) a 146GB Clustered Index that was only 12% logically fragmented, which was restored from a backup on prod to a test box.  When I used REORGANIZE on it, it caused my log file to grow from 20GB to 227GB,

    Then you did not have any transaction-log backups going at the same time. With REBUILD (unless you run it as resumable) the entire operation is a single transaction, and you cannot keep down the log file. REORGANIZE works with a lot of small transactions, so there are possibilities to keep the log in check.

    But, yes, depending in which way your index is disorganised, REORGANIZE can mean a lot of operations, as it performs a bubble sort of the file.

    True enough but my point is that was a measure of how much extra load you're going to have on the backup system and the storage system not to mention that all that logging can cause slowdowns in other areas especially if the log file is being used for some form of replicaiton.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • I've found REORGANIZE is often useful when a lot of rows have been deleted / purged from a table.  In that situation, I don't want the other pages messed with anyway, I just want the now-empty pages removed as efficiently as possible.  I, frankly, try to avoid a full REBUILD in those situations when I can.

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • That was the exact scenario I had with the log file blow out that I described.  It was an ever increasing index that suffered only from DELETEs.  It wasn't an efficient process for that table.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • REORGANIZE is not meant to reduce total / overall I/O, rather it's meant to do it in chunks instead on in bulk, as REBUILD must do it.  You're expecting REORG to do something it was never intended to do.

    I agree it should not be run routinely.  Nor should REBUILD, as we all agree.

    Also, I must say, since most tables are "designed" (pseudo-designed at best) by developers, they are not close to properly normalized, and are thus excessively wide.  Excessively wide tables have all sorts of additional problems with reorgs and/or rebuilds, as one would expect.  That is a factor as well.

    SQL DBA,SQL Server MVP(07, 08, 09) "Money can't buy you happiness." Maybe so, but it can make your unhappiness a LOT more comfortable!

  • ScottPletcher wrote:

    REORGANIZE is not meant to reduce total / overall I/O...

    Yep... Totally agree... that's one of the many reasons why I avoid it like the plague. 😀

    Totally agreed on the rest.

    The exception to the rule for not routinely running REBUILD is if you have indexes that have a Random GUID as a lead column and some NCI's that are part of a table that have a CI that is based on a Random GUID.  For those, you actually need to rebuild when logical fragmentation exceeds just 1%.  And, it has to be a rebuild, not a reorganize.  If you do that and you have the Fill Factor correctly set for the influx of new rows and minor "ExpAnsive" updates, you can literally go for months with absolutely ZERO page splits (not even supposed good ones).

    As with all else in SQL Server, "It Depends" and nothing is a panacea... well except for maybe avoiding the use of Reorganize. 😀

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Another thing, I that I did not observe until today, which is a quite important difference is this: If you have any form or snapshot enabled (and that includes if you have a readable secondary in your AG), and you have a table where rows are frequently updated, there is reason to say away from REBUILD here and go with REORGANIZE. To wit, the 14-byte pointers to the version store which are added on updates (and deletes) are removed if you rebuild the index, so you will get fragmentation and page splits quite directly (unless you accommodate for it with a lower fill factor). With REORGANIZE the pointers are retained.

    [font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

  • Erland Sommarskog wrote:

    Another thing, I that I did not observe until today, which is a quite important difference is this: If you have any form or snapshot enabled (and that includes if you have a readable secondary in your AG), and you have a table where rows are frequently updated, there is reason to say away from REBUILD here and go with REORGANIZE. To wit, the 14-byte pointers to the version store which are added on updates (and deletes) are removed if you rebuild the index, so you will get fragmentation and page splits quite directly (unless you accommodate for it with a lower fill factor). With REORGANIZE the pointers are retained.

    Lordy.  MS didn't do us any favors there.  Thanks for the tip, Erland.

     

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Erland... Before I spend a lot of time looking for it, do you have a link handy that talks about that problem?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • No, I don't have a link. I just played with a reasonably big table. As a starting point I had this output from sp_spaceused.

    name	rows	 reserved	data	        index_size	unused
    Orders 1032105 301952 KB 199632 KB 101792 KB 528 KB

    I ran:

    ALTER DATABASE Northgale SET ALLOW_SNAPSHOT_ISOLATION ON

    Then I did:

    UPDATE Orders

    SET Freight *= 1.2

    The table now grew to over 500 MB. I also ran:

    SELECT * FROM sys.dm_db_index_physical_stats(db_id(), object_id('Orders'), 1, NULL, 'DETAILED')

    Which showed that there were a million in version row count. (Note, you need SQL 2019 for to have sys.dm_db_index_physical_stats to give you information about version records.)

    Then I ran

    ALTER INDEX PK_Orders ON Orders REBUILD

    And sp_spaceused now reported the original size, and sys.dm_db_index_physical_stats reported no version records.

    With REORGANIZE that did not happen.

    I was quite surprised, because I had been taken for granted all these years that once these 14-byte pointers had been added they would remain as long as the database had some form of versioning enabled. But in a way it makes sense. If you take a big table like an Order table, most of the rows just sits there, and the 14-byte pointers would only eat disk space.

    [font="Times New Roman"]Erland Sommarskog, SQL Server MVP, www.sommarskog.se[/font]

Viewing 15 posts - 1 through 15 (of 20 total)

You must be logged in to reply to this topic. Login to reply