Raid-5 Concept questions

  • I understand Raid 5 in general (striping several disks and one of the disks holds the "parity" info).

    Say I have a server which has several disks in a raid 5 configuration along with a "hot spare".

    Question 1.  What is a "hot spare"?

    Question 2.  If I lose a disk in the raid 5 setup, does SQL Server suffer down time while the "broken" disk is replaced?

    Question 3.  Is it the DBA's job to replace/fix the raid 5 array problem?

    TIA

    GaryA

     

  • Here's my take on your questions:

    1)  A hot spare is a drive that sits in the array doing absolutely nothing.  It is there simply to jump in when a drive fails.  This is handled at the hardware level by the RAID subsystem.

    2) Well, if there's a hot spare, there's no real downtime while the "broken" disk is rebuilt.  If there isn't a hot spare, there may be downtime while the server is shut down and a physical disk is replaced.  Many servers can handle online "hot-swap" of a drive, with no downtime at all.  There will be a performance hit while the replacement drive is rebuilt.

    3) Absolutely not, it's the server admin's job.  Now, it is possible that the same person wears both hats.  If you have no server admin, it's probably best to call the hardware vendor for a service call. 

    Hope that helps!

  • Thank you James!

    You shouldn't experience any downtime while the hot spare is rebuilt, but there could be some performance issues during that time.

    The IO subsystem will be very busy rebuilding the drive, so you MAY suffer some performance degradation, which could be nothing more than some timeouts etc.


    KlK

  • Here's how RAID5 and HotSpares work (fairly simplistic view).

    Raid 5 is an n+1 configuration. Meaning if you have 4 18GB disks in a RAID 5 set, you have 3x18GB space usable with 18GB worth of Parity information. This parity information is spread out over all four disks, so that if any one disk fails, the other three disks know what was on the failed disk, and you don't experience any downtime. (Some much older implementations of RAID 5 sets used a specific disk for Parity, but I would be surprised if you see one of those.)
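
    The arithmetic and the parity idea above can be sketched in a few lines of Python. This is only an illustration, assuming simple XOR parity over one stripe; the function names `usable_capacity_gb` and `xor_blocks` are made up for the example and have nothing to do with any real controller's firmware.

```python
def usable_capacity_gb(disk_count, disk_size_gb):
    """Usable space in a RAID 5 set: one disk's worth goes to parity."""
    if disk_count < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (disk_count - 1) * disk_size_gb


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


# The 4 x 18GB example from the post: 3 x 18GB usable.
print(usable_capacity_gb(4, 18))  # 54

# One stripe: three data blocks plus their parity block.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data_blocks)

# Pretend the disk holding the second block died: XOR of the survivors
# (remaining data plus parity) gives the missing block back.
recovered = xor_blocks([data_blocks[0], data_blocks[2], parity])
print(recovered == data_blocks[1])  # True
```

    Because XOR is its own inverse, the same routine both computes the parity and recovers a lost block, which is why any single disk can go missing without losing data.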

    If a disk fails, you need to replace it. Otherwise, if an additional disk in your RAID set fails, the volume will be unavailable (the drive letter will disappear from Windows) and have to be restored from tape, etc. When you replace the disk, the RAID controller looks at the original three disks and rebuilds the failed drive, complete with Parity info. There is no downtime, just reduced performance while the info is copied, etc.
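
    As a rough picture of what the controller does during that rebuild, here is a toy sketch in Python. It assumes XOR parity and a simple "parity position rotates by one disk per stripe" layout; real controllers use their own layouts, and `build_array` and `rebuild_disk` are hypothetical names for this example only.

```python
def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            result[i] ^= byte
    return bytes(result)


def build_array(data_stripes, disk_count):
    """Lay stripes across the disks, rotating the parity position per stripe."""
    disks = [[] for _ in range(disk_count)]
    for stripe_no, chunks in enumerate(data_stripes):
        parity_disk = stripe_no % disk_count  # assumed rotation scheme
        parity = xor_blocks(list(chunks))
        remaining = iter(chunks)
        for disk_no in range(disk_count):
            block = parity if disk_no == parity_disk else next(remaining)
            disks[disk_no].append(block)
    return disks


def rebuild_disk(disks, failed):
    """Recreate every block of the failed disk from the surviving disks."""
    survivors = [d for i, d in enumerate(disks) if i != failed]
    return [xor_blocks(stripe) for stripe in zip(*survivors)]


# Four disks, so each stripe carries three data chunks plus one parity chunk.
stripes = [
    (b"a1", b"b1", b"c1"),
    (b"a2", b"b2", b"c2"),
    (b"a3", b"b3", b"c3"),
    (b"a4", b"b4", b"c4"),
]
disks = build_array(stripes, disk_count=4)

original = disks[2]
disks[2] = None  # the disk dies
rebuilt = rebuild_disk(disks, failed=2)
print(rebuilt == original)  # True
```

    Note that the rebuild has to read every stripe on every surviving disk, which is where the reduced performance during a rebuild comes from.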

    A HotSpare disk is just a disk sitting around waiting to automatically replace the original failed disk. You still need to go and replace the original failed disk, then put the HotSpare back in a spareset (waiting to sub in again).

    If everything is set up correctly and you catch the failed disk quickly, you should never have downtime - that's the whole purpose of RAID (well, Raid 1 through Raid 1+0 anyway). Raid 0 is a stripeset with no parity info. One failed disk and you are SOL.

    Hope this helps,

    Chris

     

    Raid 5 is nice because it allows you to lose one disk without losing data. Lose two disks and you lose everything, UNLESS you have a hot spare. Then you can lose an extra disk for each hot spare and not lose data, provided the rebuild onto the spare finishes before the next disk fails.

    I have two 14 drive arrays (73GB each drive). I have one hot spare. I can lose two drives and not have a problem. If I lose three, my backups had better be good.
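
    The "one extra failure per hot spare" idea can be modelled with a tiny state machine. This is a deliberately simplified sketch, not how any particular controller works: the class `Raid5Array` and its methods are invented for the example, and it assumes a spare only protects you once its rebuild has finished.

```python
class Raid5Array:
    """Toy model of one RAID 5 set with a pool of hot spares."""

    def __init__(self, hot_spares):
        self.hot_spares = hot_spares
        self.missing = 0        # members whose data is currently not rebuilt
        self.data_lost = False

    def fail_disk(self):
        """A member disk dies. A second missing member means data loss."""
        self.missing += 1
        if self.missing > 1:
            self.data_lost = True

    def finish_rebuild(self):
        """A hot spare has been fully rebuilt and takes the failed disk's place."""
        if self.data_lost or self.missing == 0 or self.hot_spares == 0:
            return False
        self.hot_spares -= 1
        self.missing -= 1
        return True

    @property
    def status(self):
        if self.data_lost:
            return "failed"
        return "degraded" if self.missing else "optimal"


# One spare, failures spaced out: the second failure is survivable.
spaced = Raid5Array(hot_spares=1)
spaced.fail_disk()
spaced.finish_rebuild()
spaced.fail_disk()
print(spaced.status)  # degraded

# A third failure with no spare left: time to reach for the backups.
spaced.fail_disk()
print(spaced.status)  # failed

# One spare, but the second disk dies before the rebuild completes.
unlucky = Raid5Array(hot_spares=1)
unlucky.fail_disk()
unlucky.fail_disk()
print(unlucky.status)  # failed
```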

    As already said, hot spare means you can swap out drives while the power is on.

    So, I lose a drive and the hot spare takes over. It gets rebuilt from the data and 'parity' on the rest of the drives.

    Now I replace the broken drive, while the power is still on and everything is running. Depending upon the software, the new drive will either become the hot spare OR the hot spare will copy its information over to the new drive and revert back to a hot spare.

    I've used both setups.

    Is this a DBA job? Well that depends. I work VERY closely with my system admin. When the drives fail, I am usually the one that goes ahead and calls tech support to order new ones and I'm the one that does the installation. It's not 'really' my job, but I can help out the SA by doing it.

    -SQLBill

  • Hello,

         Not to pick nits, but Hot Swap means you can swap out drives with the power on (there's hot swap PCI, Memory, Power Supplies, Hard Drives, etc.) At least on Compaq (now HP) hardware, anything with a red handle on it means that it is hot swap.

        Hot Spare is a drive already in the enclosure ready to 'spare' in and replace a bad/failed drive automatically. Different RAID controllers handle this in different ways. Some of the lower end controllers require you to associate a hot spare with a particular raidset. The higher end stuff (I've mainly worked with Compaq HSG, EMA, and EVA Raid systems) allows you to set up a hot spare(s) for the entire enclosure. You set them up to spare in prioritized by Size or Performance of the drives.

     

    Chris
