The Best High Availability

  • Steve, in the editorial you say "Clustering in SQL Server works well", but then point out that it can't cope with a disk failure.

    What percentage of outages would you ascribe to a server failure, as opposed to a disk failure? To my mind, this is the single most misleading feature I have ever seen in SQL Server.

    Throw away your pocket calculators; visit www.calcResult.com
  • There's a difference between a disk failure and a storage failure. Most shared storage is RAID, often with a spare drive, so I think the disk issues that affect the server are rare.

    Same with corruption. It's often a hardware issue, but it does happen.

    I'd be curious, too, how often a server issue causes a failover. Personally I've rarely had SQL Servers fail, but you could have memory issues, which do happen. I could see someone forcing a failover to "recover" memory on the server.

  • The majority of our failovers were storage related. For a number of years we were running a RAID before moving to a SAN, which has been better. The RAID never actually lost data but DID go down: interface card failure, disk failure (yes, it's not supposed to, but tell that to the equipment; an electrically malfunctioning drive can interfere with the operation of the overall system), and quorum file corruption. The problem is that a shared storage system is a single point of failure, one that can often be avoided with mirroring.

    ...

    -- FORTRAN manual for Xerox Computers --
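The single-point-of-failure argument above can be made concrete with some rough availability arithmetic. This is only a sketch: the 99% figures are made-up placeholders, not measurements from anyone in this thread, failures are assumed to be independent, and failover time and any witness server are ignored. The helper names `series` and `redundant` are hypothetical.

```python
from math import prod


def series(*availabilities: float) -> float:
    """Availability of components that must ALL be up (a chain)."""
    return prod(availabilities)


def redundant(availability: float, copies: int) -> float:
    """Availability when any ONE of several identical copies is enough."""
    return 1 - (1 - availability) ** copies


# Hypothetical placeholder figures, not taken from the thread.
server = 0.99
storage = 0.99

# Two-node cluster: redundant servers, but one shared storage system.
cluster = series(redundant(server, 2), storage)

# Mirroring: each partner has its own server AND its own storage.
mirror = redundant(series(server, storage), 2)

print(f"cluster with shared storage: {cluster:.6f}")
print(f"mirrored pair:               {mirror:.6f}")
```

Under these assumptions the cluster can never be more available than its shared storage, however many nodes are added, while the mirrored pair has no single component that takes both copies down.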

  • Steve Jones - Editor (4/30/2009)


    ... I think the disk issues that affect the server are rare.

    I agree, but we are talking about comparative numbers. In my experience, the shared external storage infrastructure (or even internal, non-shared storage) is more likely to suffer a failure than the server itself. One always has to try to consider every possibility, but the more likely a scenario is, the more weight has to be given to mitigating it. I can't see how you would set up a system that was able to survive a storage outage without it also being able to survive a server outage, so what's the point?

    Throw away your pocket calculators; visit www.calcResult.com
  • It's a good point. I would be interested to know if there are some surveys anywhere.

    May try to run one here.

  • thirumalai (4/29/2009)


    SQL 2005 database mirroring (synchronous) with log shipping sounds like a good HA solution when compared to clustering. Database mirroring is for HA and log shipping is for DR (Disaster Recovery).

    While testing our application in a clustered environment, we found that failover took around 4 minutes. From the technical people's perspective, 4 minutes is not acceptable and does not count as HA; they added that failover should not exceed 20 seconds. Clustering also requires a shared disk. The various types of shared storage available on the market, such as SAN and NAS, are very expensive. So we dropped the clustering solution and went for database mirroring (synchronous) with log shipping.

    In SQL 2005 database mirroring (synchronous), the transaction gets committed on both the primary database server and the secondary database server simultaneously. Hence, on failure of the primary database server, our application will connect to the secondary database server immediately, without any manual intervention, since the primary and secondary databases are in sync at all times. This failover took around 20 seconds.

    We also use mirroring as our primary HA/DR solution here. I've not used clustering, or read much about it. What does it give you that mirroring doesn't?

    The Redneck DBA

  • Mirroring is by database, doesn't move logins, jobs, etc. Clustering is by instance.

  • Ah, good point. In our case we care more about not losing data than we do a couple of min. of downtime, so mirroring works out swell.

    The Redneck DBA

  • Hello Everyone,

    Has anyone looked at third-party software for HA like Double Take, XOSoft or Sonasafe? We are in the initial stages of HA and it seems that there is confusion all around, even with SQL contractors.

    We are looking at a 2 stage approach. 1) HA within the building then 2) HA to the DR site.

    Comments?

    Rudy

  • We used DoubleTake a few years ago, but found it to be buggy. The version we were using, at least, was basically glorified file replication of the database files at the byte level (not the transaction level), so if something hiccuped in the process it sometimes couldn't make things right again, and the result was a corrupt, useless remote set of database files.

    But that was several versions of DoubleTake ago, so it may have improved since then.

    We are currently using DB mirroring (asynchronous) from 20 offices all over the country (including the east and west coasts, and north and south) to our corporate office in KC, and it's keeping up great.

    The Redneck DBA

  • Oops, wrong thread.

    ...

    -- FORTRAN manual for Xerox Computers --

  • I've supported log shipping and mirroring as HA solutions in the past. At my current position, we implement clustering and I'm a huge fan. The failover is quick, painless, and automatic. Disk failure? Implement disk mirroring with a RAID configuration and make sure someone monitors the hardware.

    If your situation is really, really complex, then hire an HA consultant to cover your asterisk! 😀
