Multi-Site Always On Failover Cluster Question for Experts

  • Good Morning All:

    I have a question for SQL experts.

    We are implementing a multi-site Windows Server Failover Cluster (WSFC) to enable Always On between our primary and DR sites. We are not going to use SQL clustered instances, and we are not planning to use shared disks. Each node is running a standalone instance of SQL Server 2012.

    I have successfully configured a 3-node multi-site Windows failover cluster with no shared storage. For quorum, I have defined a File Share Witness (FSW). The FSW has voting rights and is in the DR site. The setup looks like this –

    WSFC –

    • Node A – Site #1 (voting right = 1)

    • Node B – Site #1 (voting right = 1)

    • Node C – Site #2 (voting right = 0)

    • FSW – Site #2 (voting right = 1)

    Again, there are no shared disks in our setup. We are not going to use a SQL clustered instance. We are going to use Always On with these 3 nodes.

    SQL Always On –

    • Node A – Site #1 (Primary Replica)

    • Node B – Site #1 (Readable Secondary)

    • Node C – Site #2 (Readable Secondary)

    Everything, including the availability group, works properly under this setup. However, a failover to site #2 in a DR situation does not work. I know why, but I don't know what needs to be done to fix the problem.

    The following works fine –

    • Automatic failover between nodes A and B (same site – site #1)

    • Forced failover to node C in site #2, provided at least one of the nodes in site #1 is up (non-DR situation), since that keeps the cluster up

    The following is not working –

    • Forced failover to node C in site #2 when both nodes in site #1 are lost (true DR situation) – This is because the cluster is not up at this point.
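    The vote arithmetic behind "the cluster is not up" can be sketched roughly as follows. This is only an illustration of a plain static-majority rule using the vote assignments listed above; the helper name `has_quorum` is made up for the example, and it deliberately ignores any dynamic vote adjustment the cluster may perform.

```python
# Vote assignments copied from the WSFC list in this post.
votes = {"Node A": 1, "Node B": 1, "Node C": 0, "FSW": 1}


def has_quorum(votes, reachable):
    """Static majority rule (an assumption for this sketch):
    the reachable votes must be more than half of all configured votes."""
    total = sum(votes.values())
    present = sum(votes[member] for member in reachable)
    return present > total / 2


# Non-DR case: one node in site #1 is still up.
print(has_quorum(votes, ["Node A", "Node C", "FSW"]))  # prints True

# True DR case: both site #1 nodes are lost, only Node C and the FSW remain.
print(has_quorum(votes, ["Node C", "FSW"]))  # prints False
```

    With node C holding no vote, the DR site contributes only the FSW's single vote out of three, which is why the cluster service on node C will not form a cluster by itself.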

    I know I have to bring the cluster up somehow and I have not been able to do so by restarting the cluster service.

    I tried running the command to start the cluster service.

    Question –

    How can I FORCE the cluster to come up in Site #2 on node C when it has no voting rights?

    I have always worked with an even number of nodes and shared disks in traditional clustering. I am not sure what needs to be done in this scenario with 3 nodes and an FSW.

    Please let me know.

  • If I give voting rights to all 3 nodes and the FSW, then I know the cluster will come up on site #2, but what happens when site #1 is back online? How do you avoid split brain, or will that not be an issue since node C will have a lock on the FSW?

  • Will this work, guys?

    Total number of votes = 4

    • Node A – Site #1 (voting right = 1)

    • Node B – Site #1 (voting right = 1)

    • Node C – Site #2 (voting right = 1)

    • FSW – Site #3 (voting right = 1) - let's say it's in site #3

    Site #1 goes down or loses connection to Site #2 and #3 - we are still operational as quorum is maintained with 2 votes.

    Site #2 goes down or loses connection to Site #1 and #3 - we are still operational as quorum is maintained with 3 votes.

    Site #3 goes down or loses connection to Site #1 and #2 - we are still operational as quorum is maintained with 3 votes.
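    To sanity-check the three scenarios above, here is a small sketch that counts the surviving votes when one whole site is lost. It assumes a simple static-majority rule (more than half of the 4 configured votes) and does not model dynamic quorum, so treat it as arithmetic only, not as a statement of how WSFC will actually behave. The function names are made up for the example.

```python
# Proposed layout from this post: voter -> (site, votes).
voters = {
    "Node A": ("Site #1", 1),
    "Node B": ("Site #1", 1),
    "Node C": ("Site #2", 1),
    "FSW": ("Site #3", 1),
}


def surviving_votes(voters, lost_site):
    """Votes still reachable after an entire site drops out."""
    return sum(v for site, v in voters.values() if site != lost_site)


def keeps_quorum(voters, lost_site):
    """Static majority (an assumption): survivors must exceed half of all votes."""
    total = sum(v for _, v in voters.values())
    return surviving_votes(voters, lost_site) > total / 2


for site in ("Site #1", "Site #2", "Site #3"):
    print(site, surviving_votes(voters, site), keeps_quorum(voters, site))
```

    Under that assumption, losing Site #2 or Site #3 leaves 3 of 4 votes, but losing Site #1 leaves exactly 2 of 4, which is half and not a majority. Whether dynamic quorum changes that outcome for a simultaneous loss of both site #1 nodes is something to verify on the actual cluster.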

  • Libby1981 (10/6/2014)


    Good Morning All:

    I have a question for SQL experts.

    We are implementing a multi-site Windows Server Failover Cluster (WSFC) to enable Always On between our primary and DR sites. We are not going to use SQL clustered instances, and we are not planning to use shared disks. Each node is running a standalone instance of SQL Server 2012.

    Fine, that's a good example of an AlwaysOn Availability Group setup; you are removing the storage single point of failure.

    Libby1981 (10/6/2014)


    I have successfully configured a 3-node multi-site Windows failover cluster with no shared storage. For quorum, I have defined a File Share Witness (FSW). The FSW has voting rights and is in the DR site. The setup looks like this –

    WSFC –

    • Node A – Site #1 (voting right = 1)

    • Node B – Site #1 (voting right = 1)

    • Node C – Site #2 (voting right = 0)

    • FSW – Site #2 (voting right = 1)

    If you have a 3-node cluster, you should not specify a witness of any kind; you should use Node Majority only.

    Looking at the voting above, the new Dynamic Node Weight protocol has automatically removed a vote from Node C to keep the majority within the required number. Remove the FSW; it's not required.

    Libby1981 (10/6/2014)


    Again, there are no shared disks in our setup. We are not going to use a SQL clustered instance. We are going to use Always On with these 3 nodes.

    SQL Always On –

    • Node A – Site #1 (Primary Replica)

    • Node B – Site #1 (Readable Secondary)

    • Node C – Site #2 (Readable Secondary)

    Everything, including the availability group, works properly under this setup. However, a failover to site #2 in a DR situation does not work. I know why, but I don't know what needs to be done to fix the problem.

    The following works fine –

    • Automatic failover between nodes A and B (same site – site #1)

    • Forced failover to node C in site #2, provided at least one of the nodes in site #1 is up (non-DR situation), since that keeps the cluster up

    The following is not working –

    • Forced failover to node C in site #2 when both nodes in site #1 are lost (true DR situation) – This is because the cluster is not up at this point.

    I know I have to bring the cluster up somehow and I have not been able to do so by restarting the cluster service.

    I tried running the command to start the cluster service.

    Question –

    How can I FORCE the cluster to come up in Site #2 on node C when it has no voting rights?

    I have always worked with an even number of nodes and shared disks in traditional clustering. I am not sure what needs to be done in this scenario with 3 nodes and an FSW.

    Please let me know.

    Remove the FSW!!

