Validating storage performance

  • I've had a few problems recently on new and existing SQL Servers where performance doesn't "appear" to be up to scratch for the quality of infrastructure it's based on. I get this impression from the general responsiveness of the server, the time it takes to do simple file copies, backups and restores, and the response times reported by Windows Resource Monitor and SQL Server Activity Monitor (anything between 100ms and 500ms, which is really bad).

    But this is all a bit vague, and to engage the storage or server engineers I need to provide absolute and reliable figures that demonstrate there is definitely a problem with the infrastructure.

    I've spent a long time digging around for good advice on this, but I keep coming across comments that tools like SQLIOSIM only test the reliability of your storage, not its performance. Does anyone have advice on the best tools or methods (cost is not a problem) that would let me ask a storage engineer "What response time and peak throughput should this server get on a good day?", and when they say "anything under 15ms, and anything around 400 to 500Mb/s", run a series of tests that confirm or deny it?

    You can query sys.dm_os_waiting_tasks on a periodic basis to see what waits are going on, and document the results over time to determine whether there are a lot of waits on disk resources (something along the lines of the sketch below). Also, if you are gathering perfmon data you can use the PAL utility to show how often you are going over the commonly recommended thresholds for things like CPU, disk, network, etc.
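
    Roughly this kind of snapshot, for example (just a sketch; the wait types filtered on are an assumption about the usual disk-related suspects, and you would want an Agent job to log the output to a table):

      -- Sessions currently waiting on disk-related resources.
      -- Capture this on a schedule and keep the results so you can see
      -- how often I/O waits stack up over time.
      SELECT  GETDATE()               AS sample_time,
              wt.session_id,
              wt.wait_type,
              wt.wait_duration_ms,
              wt.resource_description
      FROM    sys.dm_os_waiting_tasks AS wt
      WHERE   wt.wait_type LIKE N'PAGEIOLATCH%'
         OR   wt.wait_type IN (N'IO_COMPLETION', N'WRITELOG');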

    Also, SQLIOSIM is for reproducing common OLTP activity against a storage subsystem, whereas SQLIO is for benchmarking performance. Have you looked into SQLIO?

    Performance Analysis of Logs (PAL) Tool

    SQL Server Best Practices Article

    Joie Andrew
    "Since 1982"

  • Thanks - a colleague's just pointed out the difference between SQLIOSIM and SQLIO, which I hadn't previously appreciated. I'll look into that PAL tool too.

  • Just remember, SQLIO is a stress test tool. If you're talking about validating your production system, you will be applying a load to your disks that is very likely to impact the performance of your servers. Be cautious. If you're worried about those systems and don't want to impact them directly, then standard investigation might be a better approach. As a starting point, take a look at sys.dm_os_wait_stats to see what is causing the system to run slow (something like the query below). If that points at disk, usually through PAGEIOLATCH waits of various types, then you do have a problem there and you can pursue it further with the system admins.
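
    For example, just a rough sketch (sorting by total wait time and keeping every wait type are judgment calls; you may want to exclude the benign system waits):

      -- Cumulative wait stats since the instance last started (or since the
      -- stats were cleared). Disk pressure usually shows up as PAGEIOLATCH_*,
      -- IO_COMPLETION or WRITELOG near the top of this list.
      SELECT  wait_type,
              waiting_tasks_count,
              wait_time_ms,
              signal_wait_time_ms,
              1.0 * wait_time_ms / NULLIF(waiting_tasks_count, 0) AS avg_wait_ms
      FROM    sys.dm_os_wait_stats
      WHERE   wait_time_ms > 0
      ORDER BY wait_time_ms DESC;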

    ----------------------------------------------------
    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood..." Theodore Roosevelt
    The Scary DBA
    Author of: SQL Server 2017 Query Performance Tuning, 5th Edition and SQL Server Execution Plans, 3rd Edition
    Product Evangelist for Red Gate Software

  • Thanks, but in this particular case (I have a few servers on different sites whose storage performance I'm worried about!) I have a new server that's being prepared for production. Consequently it's under no load unless I generate that load with a stress test tool, so the DMVs are empty of useful data.

    I'm expected to hand the server over fit for purpose, and I'm not convinced it is: some simple tasks (backup/restore being one I can easily replicate) take longer on this server than they do on another running off the same SAN.
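
    One way to put a number on that comparison (just a sketch, assuming native backups recorded in msdb on both servers, full backups only) is to work out the throughput from the backup history and compare the two boxes:

      -- Rough backup throughput (MB/s) per full backup, from the history in msdb.
      -- Run the same query on both servers and compare like-for-like backups.
      SELECT  bs.database_name,
              bs.backup_start_date,
              bs.backup_finish_date,
              bs.backup_size / 1048576.0 AS backup_size_mb,
              (bs.backup_size / 1048576.0)
                / NULLIF(DATEDIFF(SECOND, bs.backup_start_date, bs.backup_finish_date), 0)
                AS throughput_mb_per_sec
      FROM    msdb.dbo.backupset AS bs
      WHERE   bs.type = 'D'   -- full database backups
      ORDER BY bs.backup_finish_date DESC;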

    So somehow I'll need to find a window to stress test the server when the other servers sharing that storage won't mind. My biggest problem at the moment is that the only way I can currently put the server under load is to run a backup and a restore. SQLIO should give me another option. I just need the most reliable and repeatable method of running a test that gives meaningful output and can be compared against another server, for example per-file latency figures along the lines of the sketch below.
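
    As one repeatable comparison (just a sketch; the join to sys.master_files and the sort order are my own choices), the cumulative file stats can be turned into average read/write latency per database file, which is the kind of figure a storage engineer can respond to:

      -- Average I/O latency per database file since the instance started.
      -- Run the same workload (e.g. the test backup/restore) on each server
      -- and compare these figures side by side.
      SELECT  DB_NAME(vfs.database_id)                             AS database_name,
              mf.physical_name,
              vfs.num_of_reads,
              vfs.num_of_writes,
              vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_latency_ms,
              vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
      FROM    sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
      JOIN    sys.master_files AS mf
              ON  mf.database_id = vfs.database_id
              AND mf.file_id     = vfs.file_id
      ORDER BY avg_read_latency_ms DESC;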

