Storage Array - Disks & controllers

  • I am in the process of redesigning our disk system. We have a roughly 300 GB database on RAID 5 drives with long disk wait times. Management has given the go-ahead to get a SAN and configure it with RAID 10. I remember reading somewhere about how to calculate the approximate disk I/O generated by the application and convert that into the number of disks you need in the array to sustain that I/O. The method takes into account the RPM, the reads/writes, and the I/O in MB/sec that each disk and each controller can do, and from that you decide the number of disks and controllers required. Does anyone else remember seeing that article, or has anyone followed this approach?

    My current approach: on average we have a disk queue of ~30, so we need about 15 drives, and if we do RAID 10, 30 drives (regardless of drive size; extra free space is fine).
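
    In case it helps frame the question, here is the back-of-the-envelope arithmetic I am using, as a quick Python sketch (the threshold of roughly 2 outstanding I/Os per healthy spindle is my own assumption for the rule of thumb):

        # Rough drive count from average disk queue length (perfmon).
        avg_disk_queue = 30        # observed average disk queue length
        queue_per_spindle = 2      # assumed comfortable queue depth per drive

        spindles = avg_disk_queue / queue_per_spindle   # ~15 drives of throughput
        raid10_drives = spindles * 2                    # RAID 10 mirrors every drive

        print(f"spindles needed: {spindles:.0f}, RAID 10 drives: {raid10_drives:.0f}")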

    All inputs appreciated.

  • This is the right approach to determining how many disks to configure. The correct number of drives has little to do with total capacity when optimizing a server for OLTP databases.

    As you've already got this DB up and running, you don't need to approximate the I/O; you can get the actual physical I/O from perfmon.
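
    If it helps, here is one way to pull those averages out of a perfmon counter log once it has been saved or relogged to CSV (just a sketch in Python; "disk_log.csv" is a placeholder name, and it assumes you logged the PhysicalDisk Disk Reads/sec and Disk Writes/sec counters):

        import csv

        def average_counter(header, rows, name_fragment):
            # Average every column whose counter path contains name_fragment,
            # skipping blank samples.
            cols = [i for i, h in enumerate(header) if name_fragment in h]
            vals = [float(row[i]) for row in rows for i in cols if row[i].strip()]
            return sum(vals) / len(vals) if vals else 0.0

        with open("disk_log.csv", newline="") as f:   # placeholder file name
            reader = csv.reader(f)
            header = next(reader)                     # first column is the PDH timestamp
            rows = [row for row in reader if len(row) == len(header)]

        reads = average_counter(header, rows, "Disk Reads/sec")
        writes = average_counter(header, rows, "Disk Writes/sec")
        ratio = reads / writes if writes else float("inf")
        print(f"avg reads/sec {reads:.1f}, avg writes/sec {writes:.1f}, "
              f"read:write ~{ratio:.1f}:1")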

    It usually pays to get the fastest drives (now 15K rpm, soon 20K rpm), as then you can use fewer drives and therefore fewer enclosures, controllers, etc.

    For a 300GB database, I'd probably use direct-attach rather than a SAN unless you're planning to cluster. A SAN will cost more but won't be any faster, and you run the risk of having others decide to share the drive pool, which will hurt your performance, control, and manageability. SANs are often cost-justified by making the case for sharing the resource, but the performance numbers being touted by the vendor are for dedicated use.

    Using the average disk queue length is a good rule of thumb. Remember, though, that reads on a RAID 10 will use all the disks in a properly designed system that supports split seeks, and writes are not forced to be sequential as they are with RAID 5. Faster drives can support longer queues without bottlenecking.

    There are a lot of parameters, but if you want the actual formulas, I can dig them up. I now spec mostly from experience with analysis of perfmon logs.

    --Jonathan (HP Master Accredited System Engineer - SQL Server)

    http://www.education.hp.com/proliant-mase-sql.htm




  • Thanks for the information. The reason I am asking for a SAN is that, besides one or two main servers, we have about a dozen servers, each with a few hundred GB. We would like to consolidate those into a few servers (two, maybe three) and aggregate the storage in one location. Having to deal with versions, patches, hot fixes, performance issues, and monitoring on a dozen servers has been a real pain. We have also always had persistent disk performance problems, since most of the transactions are DSS-type, involving hundreds of thousands of records at a time. Servers proliferated when hardware was a limitation, but now that 8-way servers are mainstream, there is no reason to have this many.

    A few other concerns: should I carve separate arrays for each server (separate for DB, logs, tempdb) on the SAN, or make a few arrays and slice and share them among the servers? I am leaning towards the first approach (independent arrays per server), since some of our processes load and read data into and from multiple servers at the same time, and I am concerned that having them share an array may hinder performance.

    Would you please share the parameters and actual formulas? I would like to do due diligence and do my best to get it right the first time.

    Thanks.

  • If you're determined to use a SAN, then get the sizing utility from the SAN vendor.

    Unless the databases are not subject to concurrent access, put OLTP and DSS databases on separate physical servers and separate physical disk arrays. If you will have two DSS servers with TBs of data and one OLTP server with a single 300GB database, I recommend using the SAN for the two DSS servers and sticking with direct-attached storage (DAS) for the OLTP data. That way your OLTP data will not be thrashing through the storage array cache. Carve the DSS servers' data volumes from a single RAID 5 stripeset. The performance gain from the larger number of spindles typically outweighs any disk contention among the databases. It can make sense to put the DSS servers' tempdbs on separate RAID 10 arrays; check this activity on your existing servers. Set the storage array cache to 100% read except when loading, when you'll reconfigure it to 100% write.

    With the OLTP server using DAS, it will have dedicated cache that will be closer (on a PCI-X controller) than with a SAN. Use two controllers, one for the RAID 1 log disks (with 100% write cache) and one for the RAID 10 data array (with, typically, 50% read/50% write cache). If you have multiple OLTP databases on this server, use a separate RAID 1 array for each log, but keep the data on the same RAID 10 array.

    To estimate the minimum number of disks for your database, you will need to know the ratio of reads to writes, the number of I/O requests per transaction, and the number of transactions per second desired. You also need to know the number of random I/Os per second your drives can support.

    Let's try an example. You monitor your OLTP server and find the ratio between reads and writes averages 2:1. You then determine the I/Os per transaction by monitoring perfmon while you apply a known transaction load to the database and find that each transaction takes 3 physical I/Os (and you verify the read/write ratio). RAID 10 uses one disk I/O per read and two per write (RAID 5 uses four per write). You're using the latest 15K rpm drives, which can handle 125 random 8KB I/Os per second. You're told the system needs to support up to 1000 transactions per second. This means you need enough disks to comfortably handle 1000*(2*1+1*2)=4000 I/Os per second. 4000/125=32 drives. The smallest drive available is 36GB, so this array will have 576GB available for your 300GB database, which should be enough.
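
    Here is that arithmetic spelled out so you can plug in your own numbers (a sketch only, using the figures from the example above):

        # Minimum RAID 10 spindle count from the example's measurements.
        reads_per_txn = 2       # 3 physical I/Os per transaction at a 2:1 read/write ratio
        writes_per_txn = 1
        txn_per_sec = 1000      # required transaction rate
        ios_per_drive = 125     # random 8KB I/Os per second for a 15K rpm drive
        drive_gb = 36           # smallest drive available

        # RAID 10 costs one disk I/O per read and two per write
        # (RAID 5 would cost four per write).
        disk_ios_per_txn = reads_per_txn * 1 + writes_per_txn * 2
        total_disk_ios = txn_per_sec * disk_ios_per_txn      # 4000 I/Os per second
        drives = -(-total_disk_ios // ios_per_drive)         # ceiling division -> 32 drives
        usable_gb = drives * drive_gb // 2                   # RAID 10 halves raw capacity

        print(f"drives: {drives}, usable space: {usable_gb} GB")   # 32 drives, 576 GB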

    Buy 32 36GB drives and a couple of hot spares. Slide them into four 14-drive hot-plug cages. Connect each cage to one channel of your four-channel intelligent array controller with the maximum amount of battery-backed cache memory installed. Slide two 73GB drives into the server's internal hot-plug cage connected to another intelligent array controller. This is the log file array. Use another RAID 1 array for the OS, executables, and swap file.

    If you don't believe me about using DAS rather than a SAN, here's a quote from EMC's White Paper "Storage Technology Choices for Microsoft SQL Server 2000 with EMC CLARiiON Storage Systems":

    quote:

    Reducing disk space waste and consolidating storage provisioning and management come at a price, however. The obvious costs of Fibre Channel switches or directors are typically offset by the savings mentioned earlier. The hidden cost of sharing storage among SQL Server databases is the care that must be taken to ensure adequate performance is available to every database server. As more servers compete for storage resources, contention between requests grows and latencies may increase. The storage manager must be vigilant in the provisioning of database storage.


    --Jonathan




  • Thanks for the analysis and the article. It has been useful.
