Blog Post

The Importance of Verifying Database Backups Regularly

,

Yesterday, as I was pulling into my driveway, my front left tire gave. Thankfully, I was going really slow and I was right at my house. I did the requisite email to my boss that I was going to be in late and the check to see what time the tire place was open, but I left changing the spare tire until this morning because it was cold (for where I live) and it was dark (almost 9 PM). This morning when I got up, I pulled out the spare and it was... flat. The last time I had pulled out the spare was over 10 years ago (I've been very thankful my car has stayed in good running order for quite a while now). I had never checked it since. Brilliant.

Thankfully, it inflated just fine (thank you, air compressor). But it got me to thinking about back in November, when my mother-in-law's tire gave out. She had used her spare more recently than I had used mine... it had only been a couple of years. And it, too, was flat. That same air compressor (in my wife's mini-van) came in handy that day, too. And you know, a few months before that, my wife had a flat. However, her spare was inflated. The van was relatively new, but we had never checked the spare tire on that car, either.

In case you were wondering, you should check your spare tire's air pressure every time you check your regular tires. Oh, you have one of those cars that keeps track of air pressure for you? That would be a problem. But if you're like most of the rest of us, we're not checking our tires as often as we should, either. We're supposed to be checking 'em once a month. And the recommendation is to use a gauge you know is accurate. Stay away from the gauges at the gas station.

So what does this have to do with database backups? Backups are like spare tires. We don't think about the spare tire. We assume it'll be fine when we'll need it. It never occurs to most of us to ensure it doesn't go flat. But what if we're on the side of the road and we go to pull it out and it's flat (which is where my wife lost her tire)? Then we're stuck. Database backups are often the same way. The job indicates the backup was successfully taken. And we consider it good. We might use WITH CHECKSUM as a BACKUP option in SQL Server 2005 or follow the backup command with RESTORE VERIFYONLY in any of the versions of SQL Server, but this doesn't prove 100% that the backup is good. I have seen a backup pass RESTORE VERIFYONLY yet fail to restore correctly. Maybe that was a once in a lifetime event, but unless I check every backup, how do I know?

The obvious answer is I don't. And truth be told, few of us have time to check every single backup. But we should be in a habit to regularly check backups on a server set aside for the purpose of restoring those backups and verifying they work. Through scripting, a lot of it can be automated. After all, before Microsoft released SQL Server 2000, folks had figured out how to log ship via script on SQL Server 7. Some carried this over to SQL Server 2000 because they didn't want to buy the price for Enterprise when Standard and their own scripts worked just fine. Now, if you're dealing with VLDBs, simply carting over the backup and doing the restore may not be so simple. Perhaps there are other means you've got of duplicating the data such as mirroring or replication. But at some point, those duplicate sources should have some reconciliation checks run against the primary source.

What's the cost if we don't? For the DBA, it may be a trip to the unemployment line. In the case where technical services are provided on a contractual basis, it may be a termination of said contract either when the contract expires (not renewing) or sooner by citing breach of contract. I've seen both cases recently with some colleagues who have picked up work as a result. Customers have come to them because the contracted DBAs could not restore data based on the SLA when a hardware failure occurred. In one case, the last good backup went back not hours or days but weeks. There were backups that were newer, but they were all corrupt. My colleagues understand that since both customers have recently experienced data loss, they won't be hesitant to pull the plug should the new service provider make the same mistake.

The moral of the story? Check your tires and your database backups regularly.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating