RAID 5 drive faliure, confusion

  • Hi, am new to SQL server. We recently had a drive failure on on of our SQL servers. SQLdump kept on putting out a backup file, each day, but this file had the same name. Our backup process kept on copying the file and so didn't generate an error. By the time we realised what was happening, the server had died. I was wondering if anyone might be able to shine some light on what exactly happened? Apart from the obvious 🙂

    Thanks.

  • I have nothing to add as to the cause but I'm going to suggest you not rely on the backup process you described. You can use the Maintenance Plan wizard to walk you through setting up backups for your databases.

  • Disagree. Maintenance plans are better than nothing, but T-SQL backup scripts are much better. They are portable and easily reproducable. Plus maitnenance plans have had a lot of bugs over the years (of course, so can a poorly written T-SQL procedure).

    What was the backup command being used to back up the database?

    1 of 2 things happened:

    1) the backup was being created with the INIT option which means it over-wrote the backup file each time and the file only contains the most recent backup.

    2) the backup was being created with the NOINIT option (this is the default if neither option is specified) which means that each backup was appended to the existing backup file. In this case, the backup file will contain many subsequent backups.

    In order to see which backups are contained in a backup file, use the RESTORE HEADERONLY command. This will list all of the backups in the file and tell you the backup file number (POSITION column) of each. To restore a specific backup in the file, you need to identify the file number of the backup you want using the RESTORE HEADERONLY command and then use the FILE option in the restore command.

    For example the following code will create 2 backups of the Adventureworks database to the same file and then restore the 2nd backup in the file:

    Backup Database Adventureworks

    To Disk='c:\bak\Adventureworks_test.bak'

    With NoInit;

    Go

    Backup Database Adventureworks

    To Disk='c:\bak\Adventureworks_test.bak'

    With NoInit;

    Go

    Restore HeaderOnly

    From Disk='c:\bak\Adventureworks_test.bak';

    Go

    Restore Database Adventureworks

    From Disk='c:\bak\Adventureworks_test.bak'

    With File = 2,

    Replace;


    My blog: SQL Soldier[/url]
    SQL Server Best Practices:
    SQL Server Best Practices
    Twitter: @SQLSoldier
    My book: Pro SQL Server 2008 Mirroring[/url]
    Microsoft Certified Master: SQL Server, Data Platform MVP
    Database Engineer at BlueMountain Capital Management[/url]

  • jamie.worsley (3/6/2012)


    Hi, am new to SQL server. We recently had a drive failure on on of our SQL servers. SQLdump kept on putting out a backup file, each day, but this file had the same name. Our backup process kept on copying the file and so didn't generate an error. By the time we realised what was happening, the server had died. I was wondering if anyone might be able to shine some light on what exactly happened? Apart from the obvious 🙂

    Thanks.

    How did the server finally die? I'm trying to figure out what the backup and drive failure of a RAID 5 set have to do with each other?

  • Thanks for the answers. Another HD died, raid 5 can't work after 2 drive failures. The sql error had been happening since xmas this year, not sure if the two problems where related.

  • jamie.worsley (4/4/2012)


    Thanks for the answers. Another HD died, raid 5 can't work after 2 drive failures. The sql error had been happening since xmas this year, not sure if the two problems where related.

    You are correct, A second hard drive failure will crash a RAID 5 system. There are other issues here other than a backup failing, which at this point has nothing to do with your HDD failures. I would be more concerned at this point why the failed drive wasn't replaced shortly after the failure. Had it been replace, the RAID 5 could have rebuilt the lost drive, probably before the second drive failed.

    Now, about the backups, what error message(s) were you getting from it?

  • No error message, thats why we didn't notice. It seems we wheren't monitoring the back ups close enough. Just making sure that the date on the file was up to date, not that the date in the filename was matching the date the file was created.

  • Please realize that the backup failure and HDD failures are separate issues, they have NOTHING to do with each other. Yes, someone failed to notice a problem with the backups, but someone else failed notice a hardware failure.

  • Yes I understand now, what you mean.

Viewing 9 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply