What causes Database Mirror to failover

  • I have a number of databases on a SQL 2008 Server that are mirrored to another SQL Server. The other night, 3 of the databases failed over to the Mirror, while the others stayed on the principal. I didn't see any kind of information in the Logs to indicate why those 3 would have failed over. Any ideas where I could look? And why just 3 databases failed over instead of all of them?

    Thanks

  • database mirroring is done at the database level.

    that said, i have seen many cases where a single instance that has many databases all mirrored start splitting up the way you describe. some of the reasons i have seen;

    1. long running query or hung process eats up to much resouces and the witness sees the lack of response on the principal a reason to failover.

    2. an outside process causing high CPU and disk usage, which again, tricks the witness into causing a failover because it thinks the principal is un-responsive. (think anti virus scan, 3rd party backup, etc..)

    3. network blips. any kind of DNS outage will cause this to occur. (personally have dealt with this many, many times)

    4. index rebuilds. (have seen this a handful of times) it is just a resource issue.

    Since it happens at night, think maintenance plans and other "over night" process.

  • Do you have a witness on those databases?

    The list above is pretty good. I would search through the SQL server error log and look for anything that occurred right before the failover.

  • he has too. the failover could not have occured with out it.

  • Yes, there is a witness. Thank you for this list. I will run through it and see what I find. It usually occurs in the middle of the night, which is when index rebuilds are done, so this might be it.

  • If it is due to the index rebuilds, you can change the length of time mirroring waits for before failing over. The default is 10 seconds. See here for info. That talks about timeouts in terms of clusters failing over, but the same thing might apply in your case. If a process is causing your server to be too busy to respond for 10 seconds, maybe increasing the timeout will help.

    Shaun

  • shaun.stuart (10/1/2010)


    If it is due to the index rebuilds, you can change the length of time mirroring waits for before failing over. The default is 10 seconds. See here for info. That talks about timeouts in terms of clusters failing over, but the same thing might apply in your case. If a process is causing your server to be too busy to respond for 10 seconds, maybe increasing the timeout will help.

    Shaun

    generally not a recommended practice as changing the timeout value also changes that value for a 'real' failover event..

    when index rebuilds got in the way at one of my last assignments, i added T-SQL code directly in front of and after the re-index job. the code basically paused the mirror, re-index occurs, resume the mirror.

    by pausing the session, you stop the flow of logs which will essentially lets the mirror sit and relax for awhile. after you resume, the logs will start to replay, but that is ok. I found that a large replay plus the time to re-index on a paused session is faster than re-indexing on a running mirroring session.

  • Good point - yes that would increase the time required before a real failover occurred.

    Interesting that you assumed he was reindexing the mirrored database. I assumed the opposite - that the reindexing was occurring in some other database and causing general system lag. In re-reading the OP's posts, it's not clear which case he is in.

  • shaun.stuart (10/1/2010)


    Good point - yes that would increase the time required before a real failover occurred.

    Interesting that you assumed he was reindexing the mirrored database. I assumed the opposite - that the reindexing was occurring in some other database and causing general system lag. In re-reading the OP's posts, it's not clear which case he is in.

    its a fairly safe assumption considering maintenance plans run on the instance level 😉

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply