Are there issues with ONLINE INDEX REBUILD

  • Hi

    Since now we rebuild indexes only within maintenance frames every 3 to 6 months at system downtime. The database is about 100GB and some of the tables hold up to 300,000,000 rows. For sure this causes not the best performance.

    I asked my DBA why not rebuild online every evening within the backup jobs when the system load is low. He told me that there are issues with online rebuilds and checkpoints which may cause that the transaction log will not be cleared after backup. The only way to fix this is to set the database to simple recovery and back to full.

    Do you know this issue? If yes, is there any workaround?

    Thanks!

    Flo

    Edited: Replaced "shinked" with "cleared"

  • The way hte online rebuild works, as I understand it, is that the system effectively takes a copy of the current index, and uses that for queries while rebuilding the index in temp (write only) and applying any changes to that as it's rebuilt.

    So you need to ensure that Temp has enough room to cater for the index duplication - ideally you should have enough space there so that the temp database doesn't have to grow, obviously. But I don't get the bit about the transaction logs not shrinking after backup being a problem. Are the t-logs shrunk as part of the maintenence plan? If so - why? If you shrink them, then they're probably just going to have to grow again with the associated performance issues in relation to assigning the additional space and fragmentation.

    I'm a DBA.
    I'm not paid to solve problems. I'm paid to prevent them.

  • Hi Andrew

    Thanks for your reply!

    The disk space should not be the problem. The databases (all together) on this server are currently about 500GB and the available storage is 2TB.

    Sorry for the confusion. The problem is not that the T-LOG doesn't shrink but it doesn't become cleared and new transactions will be appended at the end (which affects that it grows and grows). He also says that there are still (until SSE2k5) problems with data which are changed (insert/delete/update) since the rebuild is working. The system has a 24/7 work load but less at night.

    I corrected the initial post and replaced "shrink" with "cleared".

    Greets

    Flo

  • Florian Reischl (5/9/2009)


    He told me that there are issues with online rebuilds and checkpoints which may cause that the transaction log will not be cleared after backup. The only way to fix this is to set the database to simple recovery and back to full.

    Online or offline, an index rebuild is a single transaction. The tran log cannot be cleared while there is an open transaction. To be more specific, when a log is cleared, by a checkpoint in simple recovery or a log backup in full/bulk logged, it will mark as reusable any VLF (virtual log file) that is completely inactive. If there's an open transaction, then the log cannot be truncated past that point those log records are still active and hence the log can only be truncated to the beginning of the VLF that contains the beginning of the oldest open transaction. (that's ignoring things like replication, mirroring, CDC)

    Simple recovery does not change that behaviour. I'm also confused about the DBA's statement about checkpoints and the 'solution' being to switch to simple. In full recovery, a checkpoint does not truncate the log. Only a transaction log backup does that. It's only in simple recovery where a checkpoint truncates the log.

    My guess as to what he's seeing has more to do with bulk operations. Index rebuilds are minimally logged in bulk-logged and simple recovery. That means a lot less impact on the transaction log. Hence, in simple (or in bulk logged) the log will grow a lot less with index rebuilds.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi Gail

    Thanks for your response. The backup would run after the index rebuilds. In his words, the following would be the (usually working) serial scenario:

    1. Online index rebuild(s)

    2. Backup database

    3. Backup transaction logs

    4. Result: t-log files are cleared/reusable and will be used for next transactions.

    DBA:

    Usually this works fine. Sometimes there stay some "dirty bytes" within the t-log after the index rebuild(s) and the t-log backup (after the index rebuilds) will not mark the t-log file as reusable. So the next transactions will be written after the logged index-rebuild(s) and the file becomes up to twice size. This is not 100% reproducable but happens sometimes. Even the next backup on next day doesn't free the space within the t-logs.

    The above paragraph was written with his words. Hope this was understandable.

    Greets

    Flo

  • Not something I've ever heard of, not something I can find any reference to on google.

    Ask him if, next time it happens, he can check a few things:

    log_reuse_wait_descr in sys.databases

    DBCC OpenTran

    the details of the spid returned by DBCC OpenTran (if any)

    If it's not a long-held open transaction, then it sounds like it might be a bug .

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Hi Gail

    I'll tell him tomorrow and keep your tips in mind! That was what I've been looking for.

    Thanks

    Flo

  • We have the facility to rebuild indexes in sql server 2005. However, it is the best practice to do it offline to avoid locks on the tables incurred during index rebuild.

    Thanks

    Vivek

  • viveks.dba (5/11/2009)


    However, it is the best practice to do it offline to avoid locks on the tables incurred during index rebuild.

    You sure about that?

    An offline index rebuild takes locks on the table for the duration of the rebuild. That's why it's called offline, the index is unavailable for the duration of the rebuild.

    An online rebuild takes a short Schema lock at the beginning of the rebuild and a short Schema lock at the end of the rebuild, but does not otherwise take locks on the table.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • Florian Reischl (5/10/2009)


    Hi Gail

    DBA:

    Usually this works fine. Sometimes there stay some "dirty bytes" within the t-log after the index rebuild(s) and the t-log backup (after the index rebuilds) will not mark the t-log file as reusable. So the next transactions will be written after the logged index-rebuild(s) and the file becomes up to twice size. This is not 100% reproducable but happens sometimes. Even the next backup on next day doesn't free the space within the t-logs.

    The above paragraph was written with his words. Hope this was understandable.

    Greets

    Flo

    The DBA's name wouldn't be Nigel by any chance would it? I only say that because I met a DBA once (Nigel) who swore blind that this table, containing over 800 million rows, didn't need a clustered index. He then proceeded to try and blind me with pseudo tech babble that didn't really fly....The above quote sounds very much like this guy.

    Anyway, on to the topic of online index rebuilds... you'll be OK running online index rebuilds. I've run them in various situations and apart from a slightly longer rebuild time I've never encountered any problems with them. And any problems you may encounter with them are surely over-shadowed by the fact that you are only rebuilding indexes every 3 months - now that's a problem in need of addressing.

  • Hi Karl

    Thanks for your feedback. No his name is not Nigel 😉

    I will inform him that there shouldn't be any problems with online index rebuilds.

    Greets

    Flo

  • OK, I just want to ask my question a little more clearly and to the point:

    I have a 500 GB DB. It needs to be reindexed. I want to use online reindexing. Can I...

    1. ...leave the DB in FULL recovery mode?

    2. ...continue to do transaction log backups every 30 min?

    3. ...not worry about it filling up my transaction log and crashing the DB, the reindexing process, and all my users?

    Thanks!

    William

  • wjones21 (8/31/2009)


    1. ...leave the DB in FULL recovery mode?

    You can, but....

    2. ...continue to do transaction log backups every 30 min?

    Yes

    3. ...not worry about it filling up my transaction log and crashing the DB, the reindexing process, and all my users?

    Probably not. Index rebuilds, whether online or offline are fully logged in full recovery.

    Bulk-logged recovery may be an idea. Index rebuilds are bulk-logged in recovery models other than full. Just be sure to take a log backup after switching back to full.

    Gail Shaw
    Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
    SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability

    We walk in the dark places no others will enter
    We stand on the bridge and no one may pass
  • My company is obviously concerned about the 10 hour+ exposure without a backup that an online reindexing process will lead to with the DB in SIMPLE recovery mode. The way I've done it in the past was to set recovery mode to simple and go without a backup for the duration. That's no longer an option.

    So what are my options for performing online reindexing without exposing ourselves to potential data loss for the duration of the process?

    Oh, you cannot perform a trans log backup after you've set the DB to SIMPLE. You need to set the DB back to FULL recovery mode and then do a FULL Backup, then start your transaction log backup schedule again.

    William

  • wjones21 (8/31/2009)


    My company is obviously concerned about the 10 hour+ exposure without a backup

    Why 10+ Hours? Unless your hardware is on the low end and with slow disks it shouldn't take that long!

    We do a weekly online reindex on our 900GB DB and it takes about 2 hours or so, depending on how long we need to pause it because the db mirror queue gets over 10GB (and we wait until it gets below that via 10 minute wait loops). We also change the T-Log backup to every 5 minutes so it doesn't grow to big during this time period.

    _______________________________________________________________________
    For better assistance in answering your questions, click here[/url]

Viewing 15 posts - 1 through 15 (of 26 total)

You must be logged in to reply to this topic. Login to reply