Disaster Recovery

  • Recently I was involved with an off-site D/R. I spent almost 24 sleepless hours there just doing simple restores from backup that my 12-year-old son could do, too. This made me start to wonder why a DBA needs to be at the D/R site for this boring job when you could be more productive in the office or relaxing at home. I would really like to hear some professional opinions. Here is what I thought:

    If the S/A can restore everything, including the registry (with the entries for SQL) and the SQL binaries, why can't the DBA just write some scripts to recover master (with the -m option) and then the user databases? Is there any loophole that I missed?
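As a rough illustration of the kind of script the question has in mind, here is a minimal sketch in Python that only generates the T-SQL text for someone at the D/R site to review and run. The function names, database names, and backup paths are all hypothetical, and it assumes plain full backups on disk; a real plan would also need to cover file relocation (WITH MOVE), log backups, and the other system databases.

```python
def tsql_string(value):
    """Quote a value as a T-SQL Unicode string literal."""
    return "N'" + value.replace("'", "''") + "'"


def tsql_name(name):
    """Quote a database name as a bracketed T-SQL identifier."""
    return "[" + name.replace("]", "]]") + "]"


def restore_statement(database, backup_file):
    """Build one RESTORE DATABASE statement for a full disk backup."""
    return (
        f"RESTORE DATABASE {tsql_name(database)} "
        f"FROM DISK = {tsql_string(backup_file)} "
        "WITH REPLACE, RECOVERY;"
    )


def build_restore_script(user_backups, master_backup):
    """Return the restore script as text: master first, then user databases.

    user_backups maps database name -> backup file path (hypothetical values).
    """
    lines = [
        "-- Step 1: run while SQL Server is started in single-user mode (-m).",
        restore_statement("master", master_backup),
        "-- Restoring master stops the instance; restart it normally before step 2.",
        "-- Step 2: user databases.",
    ]
    # Sort by name so the generated script is the same on every run.
    for database in sorted(user_backups):
        lines.append(restore_statement(database, user_backups[database]))
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical inventory; in practice this would come from the D/R documentation.
    backups = {
        "Sales": r"D:\restore\Sales.bak",
        "Payroll": r"D:\restore\Payroll.bak",
    }
    print(build_restore_script(backups, r"D:\restore\master.bak"))
```

Generating the script as text rather than executing it keeps a human in the loop: whoever runs the recovery can read every statement before anything touches the server.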

  • You are most likely correct. However, that is provided that you have good documentation and that it is up to date. I take documentation and disaster recovery very seriously. Any time anything changes in any of our many SQL Server environments, I make sure my disaster recovery documentation has been updated and stored off-site. I rely on this documentation not only for disaster recovery but also as a record of how things were installed, when databases were created, how and where connection pools come from, etc. This document will serve others in the event I should ever leave the company. I have screenshots of installing SQL Server, service packs, and security patches. Then there is a master list of all prod and test servers, what databases are on them, some output of basic scripts showing database paths, any odd things about the installation, etc. This is also helpful for servers that I don't touch much but that we attempt to upgrade a year down the road. I try to look at the documentation with the thought that the person reading it is not me but someone who has to figure out how to recover a prod SQL Server and get it all up and running.

  • The big loopholes are the ones you won't think of. Having the DBA handy is the important thing, since unforeseen things might happen.

    Other than that, DR could be as simple as having the SA restore the system and you restore the DBs.

  • I think the answer depends on the type of environment you have and what type of disaster you have. When all we need is a simple restore from tape, our system administrators do that, including database restores. Our DBAs tell them specifically which backup to use and which database to restore to.

    I try to do as much of my after-hours work from home as I can. However, if your server is off the network, you can't reach it from home. If the restore will take hours, you can leave and come back, or "tag team" it. I try to avoid 24 hours in an uncomfortable setting; it makes for bad decisions when you are tired or uncomfortable.

    When I helped design our disaster recovery plan, I created as many scripts as possible. This allows us to set up a new server as quickly as possible, with as little thinking as possible. As Steve Jones said, "unforeseen things might happen" - I try to preserve my brain power for that!

    Danette Riviello

  • Another thing that one may forget about: if there IS a disaster, who has the key logons and passwords for connection pools to the databases? Are they recorded somewhere, or do just one or two key people know them? I have as many as I can documented and vaulted off-site as well.

  • There are basically two kinds of D/R: one where the original DBA is available (H/W crash, server crash ...) and one where only a new DBA is available (natural disaster, terror attack). My thinking is about the latter (otherwise, why would you need to go off-site?). For that situation, I agree that documentation is the most important thing for anyone who will do the job. But doesn't it make more sense to script out everything to the extent that a person with adequate Windows knowledge can do the recovery?

    I worked in Unix environments for more than 10 years, and I can't recall a DBA ever really being needed at an off-site D/R test. The system admin recovered everything to an operable state, so why can't we do the same in the Windows world?
