How do YOU recover system DBs after a server restore?

  • larry Hennig (3/13/2009)


    LOL! I just realized "Ten Centuries" is a descriptor, not a handle. DOH!:D

    It now seems clear that I need to be prepared for a REBUILD or re-install in the event that I do not have copies of system DB files for the required build, and that means keeping track of build levels and install options for each instance (and to think our relatively undisciplined server group thought they could manage SQL Server without any DBAs involvement!).

    😛 .. Same here but I manage to get by looking after about 100 instances ;-). Well actually here its more "SQL is easy, anyone can do it!".

    One complication in doing this is that we have some app+database servers that vendors have direct access to, and they have applied patches without our consent (I do not have the hammer needed to stop this, so I have to live with it).

    Over my dead body? No 3rd party vendor gets access to SQL Servers directly; I don't care how good they are. Hell I don't even let Microsoft touch my servers when I call them for troubleshoot; they can watch and direct. I might give them control if its something difficult to explain for a bit... but they can NEVER DO anything unsuperised.

    As far as keeping the copies current ...

    I am going to look into using SyncBak or something like it to update the copies of the system DB files whenever the build level changes.

    I plan to code a job (or buy a tool) to alert us to changes in build numbers so we can keep a close eye on the situation and will have forensic evidence if something breaks as a result. This can also help me keep the reinstallation documentation up to date.

    Our WSUS updates are applied once a month. Each server will be forced to reboot at that time regardless of whether WSUS demands a reboot, so I am thinking of using a boot-time script to update the copies.

    I am confident my fellow DBAs will remember to refresh the copies when we manually apply updates, so it should be feasible to ensure the copies are kept current, and that it will be a rare situation when I need to re-install SQL Server to restore a server.

    I just needed to know whether or not that effort would be a waste of time, and aside from opinions on whether or not a re-install is less work than the above, I think it is clear that the effort is worthwhile.

    Thanks for the input!

    I don't like idea of backing up mdf/ldf files, creates too much confusion ;-). I like my bak files. You can try this with your current set ...

    ** NOTE: I only did it once .. and I used it to recovery login information only *_* ....

    Take your master.bak file restore it on another server with different dbname... and then detach the files.

    ---> untested step <--- Copy the files to your server and rename the file to master.mdf and master.ldf and see if your server reconizes it...

    Thanks.

    [font="Arial"]---

    Mohit K. Gupta, MCITP: Database Administrator (2005), My Blog, Twitter: @SQLCAN[/url].
    Microsoft FTE - SQL Server PFE

    * Some time its the search that counts, not the finding...
    * I didn't think so, but if I was wrong, I was wrong. I'd rather do something, and make a mistake than be frightened and be doing nothing. :smooooth:[/font]

    How to ask for help .. Read Best Practices here[/url].

  • That's a nice little trick to have in one's back pocket!

    I know a similar trick you can apply to a log shipping target database to verify it's integrity, create local backups and make it available as a DSS database:

    Create a new database with the same logical filenames. Stop SQL server. Copy the files of the log shipping target over the files of the new database. Restart SQL Server. Recover the new database (roll back uncommitted transactions). Now you can run an integrity check, create a full backup, and use it as a read-only database without fear of interrupting the log shipping process.

    I'd like to say "over my dead body", but that's what my two predecessors said, and I like having a defined benefits pension plan that is backed by a group of utility companies - especially in today's economic climate.

    Oops, I've gone off-topic. Thanks for the tip!

  • At my workplace, I have made our SQL Server DRs fairly trivial by scripting the DBMS installs, having standardized backup and restore procedures and a proc to generate all necessary file structures. The process basically goes like this:

    new server with OS installed provided by server folks along with our backup directories and all db backup files restored from Tivoli.

    run vbscript which installs SQL instance based on an ini file

    - this ini file contains all necessary info for install

    - this script also applies the appropriate SP and hotfixes based on ini file

    - this script creates a standard DBA admin databases and a whole mess of custom sp_dba* procs we put in every server

    - after this we have an completely operational instance without the user databases

    run vbscript which restores master and msdb databases

    - this script also runs an stored proc which will read all the data file locations from master and create all the necessary directory structures for the user database files

    run a stored proc which generates all the restore statements for the user databases based on the last full backup, diff backup and Tx logs taken.(getting this info from msdb) This proc also reaches out to the filesystem to ensure that the proper backup files all exist.

    run the generated restore SQL.

    Using this process, we are able to rebuild and restore a quite a few instances in a relatively short period of time.

    We do annual DR tests using our second data center and in last year's test, it took over two days to rebuild and restore 9 instances of Sybase on AIX on one RS6000 and about 12 hours to do over 30 instances of SQL across 8 intel boxes. And this was with 3 DBAs doing the Sybase rebuilds and 1 DBA doing the SQL.

    The keys to the speed of the SQL DR are driving the entire install process and instance recover process via vbscript and having very standardized backup routines.

    As well, the first instance that I recover is our DBA production instance where we have a database that contains all the necessary information about each server, instance, database in our shop. This data is maintained automatically using a .NET app that once a day polls every known server for any changes and then enumerates all servers on the network to see if new ones exist. With this database being the first one up, I can then confirm that the ini files for each instance is correct before installing the instance.

    The most important thing about doing a DR is being prepared to do it in the first place. We put in quite a bit of work building up the necessary tools on the SQL side and then getting all the existing instances in line with our standards, but it has definitely been worth it.

    As of last year, it takes longer for the server folks to setup the OS's on the servers than it does for us to get all the instances restored to their points of recovery. The server folks are working on pretty much eliminating having to install OSs by using vmware and a product called PlateSpin which will keep the offline vmware images synced with the production servers. They did a proof of concept last year with a few boxes and in those cases, all I had to do was restore the databases.

    FYI. We have a DBA team of nine people, but during DR tests we go with only four to simulate what would more than likely be the case in an emergency DR.

    The downside of getting DR efficient is now the business has almost doubled the number of systems designated as "critical" and is even considering adding the dev and test boxes in. That has the potential of including over 150 instances of SQL and about 35 instances of Sybase. (sure hope they don't go that far.)

    Ian Dundas
    Senior IT Analyst - Database
    Manitoba Public Insurance Corp.

  • OK, this is untested but I believe it should work. I will try to find the time to test it soon but feel free to have a go yourself.

    The reason rebuild says reapply all your patches is because the rebuild just copies the RTM version of mssqlsystemresource (henceforth called resource) database off the install disk. For every upgrade this database is merely copied over with a new one containing updated system objects, so your resource database is out of date after the rebuild.

    So rather than reapply fixes, which you likely cannot because the binaries you have recovered already reflect these fixes, copy your saved up to date resource database files over the one created by the rebuild and then go straight to the restore of the master,model,msdb system databases.

    This means you have to have an up to date copy of the resource database, so as this database cannot be backed up in the conventional manner and only changes when an upgrade is done, take a copy of these files at every upgrade. Make sure these files are saved away by your offsiting software used to rebuild your DR servers. ALL system databases should be backed up so this should be part of any backup strategy anyway.

    This is a reason to go to your windows admins and tell them just using WSUS in a conventional manner is not good enough for SQL boxes.

    Also Larry (or grasshopper if you prefer 🙂 ) I am sorry to hear you work for a company that gets rid of DBAs who are trying to protect their data and favour outside vendors over their own staff. Perhaps you can give management this as a reason for disallowing vendors patching your servers , but its your career so far be it from me to advise on that!

    ---------------------------------------------------------------------

  • Ian,

    Thanks for your feedback. I think you offer a concrete case for the type of standardization I am hoping to achieve. (One of the first things I did was script the SS2005 SP2 installation). My biggest hurdle in moving forward is that we do not (yet) have a large enough inventory of instances to make the up-front scripting effort a clear pay-off. May I quote those numbers to our management?

    George,

    That's a good idea. If I remember correctly, Mssqlsystemresource is backed up by TSM, so I can preserve it locally just before doing the rebuild. I am also going to talk to my boss about not using WSUS for SQL Server updates.

    My thanks to both of you!

  • I have tested copying back a saved away latest copy of the resources database to bring your instance up to date with patches and it works. I had some fun and games with rebuilddatabase but more of that later.

    To 'corrupt' my instance (SQL 2005 SP3) and mimic the sort of situation you would be in at a DR where the binaries are ready to go but system databases files are missing, I stopped the instance and moved the master ,model and msdb files elsewhere.

    I then ran the reinstall command:

    start /wait setup.exe /qd instancename= MSSQLSERVER reinstall=SQL_Engine rebuilddatabase=1 sapwd=somethingsecure

    Interesting thing here, as I had neglected to move the resource database files out of the way, the reinstall did NOT overwrite the existing resource database, so having done the reinstall SQL was already at SP3 and there was no prompt to to upgrade the instance!

    So if you have a filecopy of the resource database saved away you could COPY it to the correct location before you run reinstall and you are good to go for the master database restore as soon as reinstall finishes, or you can run reinstall, giving you the RTM version of resource database, then overwrite that with your saved, up to date copy.

    I did have some problems with the reinstall command, which is all the more reason not to depend on it but have file copies of all your system database files ready to copy into place in the event of DR. running the re-install I was hitting error 1706:

    'Try the installation again using a valid copy of the installation package 'SqlRun_SQL.msi'.

    turns out that as I install from a mapped network drive I cannot guarantee I am installing from the exact location as the original install (drive letter could differ) and this throws out the reinstall. Fix was found in this msdn thread:

    http://social.msdn.microsoft.com/Forums/en-US/sqlsetupandupgrade/thread/ba4f3005-f7cb-46a8-b2ee-acdc483e6fcc/

    so I had to run this command to reset the source location

    start /wait microsoft\SQL\2005\CD1\setup.exe /qd instancename= MSSQLSERVER reinstall=SQL_Engine REINSTALLMODE=v

    where microsoft\SQL\2005\CD1 is the current location of the install media in my case, and then re-run the reinstall rebuilddatabase.

    This shows a recurring weakness in MS documentation in that it does not always tell the full story or reflect real life situations, in this case that the reinstall should be run from the exact same location as the original install, and that the reinstall only really works well if you are at RTM. All this is info that should be held in the same place, you should not be expected to go and research these problems after you hit them!

    So to sum up, IMHO relying totally on reinstall rebuilddatabase is not without risk, and you should always have file copies of all the system databases available to you, and indeed just copying these back to the correct location is the fastest way to recover from lost or corrupted system databases.

    ---------------------------------------------------------------------

  • So if you have a filecopy of the resource database saved away

    ...

    a recurring weakness in MS documentation in that it does not always tell the full story or reflect real life situations, in this case that the reinstall should be run from the exact same location as the original install, and that the reinstall only really works well if you are at RTM.

    ...

    always have file copies of all the system databases available to you, and indeed just copying these back to the correct location is the fastest way to recover from lost or corrupted system databases.

    That's valuable information about the resource database and rebuild option. Thanks for doing those tests!

    I started looking at SyncBack as a means of creating the copies of the system databases without stopping SQL Server, but the free version is not licensed for commercial use and we have a purchase freeze in place, so I am looking for another option and am reading up on VSS and VSSWriter to see whether it would be difficult to write a small utility in C# or VBScript. I tried some searches in hope that someone else has done this, but did not find anything (I might not have done a good job with the search terms). I'll post anything I find that is useful.

  • start /wait setup.exe /qd instancename= MSSQLSERVER reinstall=SQL_Engine rebuilddatabase=1 sapwd=somethingsecure

    What is the rebuilddatabase?

  • look up running setup from the command line in BOL, but basically you are telling the setup program you want to replace the system database files (In the case of master being corrupt or lost and thus preventing SQL from even starting)

    ---------------------------------------------------------------------

  • Thanks that the new command in 2005 for rebuilding master file........

  • yep, its the replacement for rebuildm.exe in SQL2000.

    I prefer not to be totally reliant on either of them!

    ---------------------------------------------------------------------

  • Interesting thing here, as I had neglected to move the resource database files out of the way, the reinstall did NOT overwrite the existing resource database, so having done the reinstall SQL was already at SP3 and there was no prompt to to upgrade the instance!

    Is this the mssqlsystemresource.ldf and mssqlsystemresource.mdf files.

    So if you have a filecopy of the resource database saved away you could COPY it to the correct location before you run reinstall and you are good to go for the master database restore as soon as reinstall finishes, or you can run reinstall, giving you the RTM version of resource database, then overwrite that with your saved, up to date copy.

    Got a little lost of your statement here.

    All the document is if you lost the master files because of no BACKUPS.

    If you have the backups and just moving them to another SQL Server - do you have this documented with the command prompts.

    I am going to try both exercises....we on a Cluster so i assume this is the same steps.

  • TRACEY (3/21/2009)


    Interesting thing here, as I had neglected to move the resource database files out of the way, the reinstall did NOT overwrite the existing resource database, so having done the reinstall SQL was already at SP3 and there was no prompt to to upgrade the instance!

    Is this the mssqlsystemresource.ldf and mssqlsystemresource.mdf files.

    yes

    So if you have a filecopy of the resource database saved away you could COPY it to the correct location before you run reinstall and you are good to go for the master database restore as soon as reinstall finishes, or you can run reinstall, giving you the RTM version of resource database, then overwrite that with your saved, up to date copy.

    Got a little lost of your statement here.

    just saying if you are doing a reinstall with the rebuilddatabase option you have the option of either putting your resource database files in place before or afterwards, either will get you to the same situation in the end.

    All the document is if you lost the master files because of no BACKUPS.

    nope, its for when system database files are corrupted or lost so you cannot even stsrt SQL in order to be able to run a restore of you backups.

    If you have the backups and just moving them to another SQL Server - do you have this documented with the command prompts.

    I am going to try both exercises....we on a Cluster so i assume this is the same steps.

    If you have backups and SQL will start then start SQL in single user mode

    net start mssqlserver /m

    and restore master from cammand line

    SQLCMD -Sservername -E -Q"restore database master from disk = 'directory\master.bak' with replace"

    SQL will close then restart in normal way

    you will then need to update sysservers if server has a different name. check BOL , this may vary for a cluster. PLEASE DO NOT practice on your production server!

    edited SQLcmd statement

    ---------------------------------------------------------------------

  • Thank you for the clarification.

    Don't practice on production oh thats not fun 😀

    No very good advice never pratice on production.

    I will play around on my own SQL Server just to get a fill for it -if the worst was to happen. I have a lot documented but never actually tested it out.

    I will try building SQL on a difference server with separate paths too.......

    Cheers

  • tracey - previous post edited (SQLCMD)

    ---------------------------------------------------------------------

Viewing 15 posts - 16 through 30 (of 49 total)

You must be logged in to reply to this topic. Login to reply