Cluster instance failed over?

  • Hi,

    We had some network-related issues that caused the SQL instances to fail over to the passive node.

    We have a 3-node active/passive/active (a/p/a) cluster setup.

    I moved the SQL instance group from the passive node back to the active node using Cluster Administrator. I have a couple of questions:

    1. Does the SQL Server instance restart every time we move a group from one node to another? I see in the error log that the SQL Server instance restarted. Is this normal behavior?

    2. We have an active/passive/active cluster setup. Node1 should fail over to node2, and node3 should fail over to node2, but node1 should NOT fail over to node3, nor node3 to node1. We lost network communication on node3 and its instance moved to node2; the node1 instance also moved to node2. Later, the node1 instance moved back to node1 from node2, but the node3 instance is still on node2. Why did it not move back to its preferred node, node3?

    Please advise...

    3. We have a backup group called Backup on node3. When network communication was lost, the SQL group moved to node2 but the Backup group did not, and all backups of the SQL instance are failing with the error below:

    BackupDiskFile::OpenMedia: Backup device 'Z:\\FULL\BACKUP\Prod_06-24-2009.bak' failed to open. Operating system error 21(The device is not ready.)

    What should I do to make this Backup group (it contains only the backup drive Z) fail over along with the SQL group whenever a failover is initiated, in order to avoid the backup failures?

    4. Node1 drives are visible from node3. Is this normal? Before the failover, node1 drives were not visible from node3. What should I do so that node1 drives are not visible from node2?

  • 1. Yes, the service is stopped on one node and started on the other.

    2. I don't know why node 1 moved as well if the failure was on node 3, but it is true that once a group fails over, you have to manually fail it back to its original location; it doesn't do this automatically.

    3. I don't understand why you would design it this way. Back up to a UNC path or to the shared SAN drive. I don't see what this achieves other than causing your issue.

    4. When you say drives, I assume you mean SAN drives, where your SQL data and logs are? This is totally normal, as it wouldn't work otherwise. If you mean local drives, then they are just mappings.
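For question 1, one way to confirm from inside SQL Server that the instance restarted on a different node is to check which physical node currently hosts it and roughly when it last started. This is a minimal sketch, assuming SQL Server 2005 or later; the tempdb creation date is used only as an approximation of the startup time, because tempdb is recreated at every start.

```sql
-- Physical node that currently hosts the clustered instance
SELECT SERVERPROPERTY('ComputerNamePhysicalNetBIOS') AS current_node,
       SERVERPROPERTY('IsClustered')                 AS is_clustered;

-- tempdb is recreated on every startup, so its create_date
-- approximates the time of the last instance start
SELECT create_date AS instance_started_at
FROM sys.databases
WHERE name = N'tempdb';

-- Shared drives reported for this clustered instance
SELECT DriveName
FROM sys.dm_io_cluster_shared_drives;
```

Running this before and after moving the group should show `current_node` change and `instance_started_at` jump forward, which matches the restart seen in the error log.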

  • Thanks,

    I don't understand why you would design it this way. Back up to a UNC path or to the shared SAN drive. I don't see what this achieves other than causing your issue.

    We have a backup group in Cluster Administrator, and the only resource inside this group is the backup drive. I am able to move this group to another node manually, but when the SQL group moves (when a failover is initiated), the Backup group does not move. I tried to add the backup drive as a dependency of the SQL Server service so that it can fail over along with the SQL instance when a failover occurs, but I did not find an option to add the backup drive (which is in the Backup group) as a dependency. I opened the properties of SQL Server (ins1) under Resources (in the left pane of Cluster Administrator) -> Dependencies -> Modify. I expected to see the backup drive there to add as a dependency, but it is not listed.

    How can I achieve this?

  • No idea sorry. I wouldn't do this as there is no advantage to having the backup drive available to one resource at a time. What is your thinking here?
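The UNC-path alternative suggested earlier in the thread might look like the sketch below. The share `\\backupserver\sqlbackups` and the file name are hypothetical placeholders, the database name `Prod` is only inferred from the backup file name in the error message, and the SQL Server service account is assumed to have write permission on the share.

```sql
-- Hypothetical share, file name, and database name; adjust to your environment.
BACKUP DATABASE [Prod]
TO DISK = N'\\backupserver\sqlbackups\FULL\Prod_full.bak'
WITH INIT,       -- overwrite any existing backup sets in this file
     CHECKSUM,   -- verify page checksums while backing up
     STATS = 10; -- report progress every 10 percent
```

Because the path does not refer to a clustered disk, it resolves the same way from whichever node is hosting the instance.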

  • So, you want me to add the backup drive to the SQL group too? Is that how you have it configured?

    Right now the server is in production. If I want to add the backup drive to the SQL group and remove the Backup group, what steps do I need to follow?

    Please help me with this.

  • I will elaborate on my scenario:

    1. We have a 3-node active/passive/active cluster setup: Node1 (A), Node2 (P), and Node3 (A).

    2. Node1 should fail over to node2, and node3 should fail over to node2, but node1 should not fail over to node3, nor node3 to node1.

    3. We have a SQL group (which has drives D, E, F, and T for data files, secondary data files, log files, and tempdb respectively).

    4. We have a backup group (which has drive Z for storing the database backups).

    5. We have a backup script like the one below: a backup procedure that backs up the given database to the specified backup path with the given backup type (full, diff, or log). For example:

    EXEC backup_sp 'db_name', 'Z:\Backups\Full\', 'Full' -- let's assume this runs on node 1.

    6. A failover occurred and the SQL group moved to node2 (P), but the Backup group did not, and the backups failed. How can we make this backup group fail over to node2?

    7. SQL group -> Properties -> Failback -> Prevent failback

    -> Allow failback

    Which option should be selected (Prevent failback or Allow failback) according to best practice?

    In our case, we selected Allow failback. Please advise...

  • 1. Does the SQL Server instance restart every time we move a group from one node to another? I see in the error log that the SQL Server instance restarted. Is this normal behavior?

    Yes, it is normal. Each node has its own SQL binaries. When one node fails over, the Service Control Manager first shuts down the instance. In fact, failover happens after shutdown, because once the instance is shut down the IsAlive check fails and initiates the failover.

    2. We have an active/passive/active cluster setup. Node1 should fail over to node2, and node3 should fail over to node2, but node1 should NOT fail over to node3, nor node3 to node1. We lost network communication on node3 and its instance moved to node2; the node1 instance also moved to node2. Later, the node1 instance moved back to node1 from node2, but the node3 instance is still on node2. Why did it not move back to its preferred node, node3?

    Please advise...

    That's not possible. Someone must have moved the SQL group from node 2 to node 1. It cannot happen on its own.

    3. We have a backup group called Backup on node3. When network communication was lost, the SQL group moved to node2 but the Backup group did not, and all backups of the SQL instance are failing with the error below:

    BackupDiskFile::OpenMedia: Backup device 'Z:\\FULL\BACKUP\Prod_06-24-2009.bak' failed to open. Operating system error 21(The device is not ready.)

    Your backup group will not be cluster aware even though you have created it as a cluster resource. Only cluster-aware services, like the MSDTC resource, can fail over automatically.

    4. Node1 drives are visible from node3. Is this normal? Before the failover, node1 drives were not visible from node3. What should I do so that node1 drives are not visible from node2?

    Well, the disks are shared. The disks are only visible on the node where the SQL Server instance is running. Even if a disk is visible on a node where the instance is not running, double-clicking it will give you an access denied error.

    Abhay Chaudhary
    Sr.DBA (MCITP/MCTS :SQL Server 2005/2008 ,OCP 9i)
