The Importance of IOPs in Shared Storage Environments

We recently helped a client determine what to do with an aging Fiber Channel infrastructure.  Cisco had announced end-of-life status for the existing FC switches, and the replacement switches had a list price of $144,000!  Even with their education discount, the replacement hardware would have been well north of $100,000.  All of this just to access 4.5TB of 8-year-old Fiber Channel disk.

After examining the typical use patterns for the Fiber Channel disk (MSSQL Server, MS Exchange, VMware server shared storage), I suggested that it might be less expensive to commoditize the transport mechanism by purchasing new iSCSI storage rather than replacing the FC switches.  This solution would both eliminate the need to spend $100,000 on a pair of Fiber Channel switches and allow the customer to avoid spending approximately $12,000 per year to maintain the aging Fiber Channel disk itself.  “Perhaps a set of Dell MD3000i’s would meet your needs?” I suggested.

If that were the end of the decision, this would be a fairly boring blog post.  BUT, the on-site DBA (who maintains the MSSQL infrastructure that connects to the FC disk storage) was concerned about performance on “commodity” iSCSI storage.  Would the 15K SAS drives in an MD3000i, and its embedded iSCSI controllers, be able to keep up with his blazingly fast database cluster?  “I dunno.  Let’s buy a set and find out.”  Yeah, that definitely wouldn’t pass muster.  Time to break out that math degree and work some calculations.

Disk Performance, Good! (or, an IOP is not a place to buy pancakes)

When analyzing application performance for I/O-intensive applications like databases, mail servers, and virtualization environments, it is important to consider the IOPs rating for your storage environment.  Simply put, an IOP is an I/O Operation, and it is generally measured in I/O Operations per second (IOPs).  Calculating the IOPs rating for a single drive is quite simple as long as you have the right data:

Drive:  Dell Branded 15K 146G SAS
Average Latency:  2ms (0.002 sec)
Average Seek Time:  4ms (0.004 sec)

IOPs = 1/(0.002 + 0.004) = ~167 IOPs
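If you would rather let the computer do the arithmetic, here is a quick back-of-the-napkin sketch in Python (purely illustrative; the helper name is my own invention):

# Rough per-drive estimate: IOPs ~= 1 / (average latency + average seek time).
# Illustrative only; real-world numbers also depend on workload and queue depth.
def drive_iops(avg_latency_sec, avg_seek_sec):
    return 1.0 / (avg_latency_sec + avg_seek_sec)

print(round(drive_iops(0.002, 0.004)))  # -> 167 for the 15K SAS drive above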

Okay, great…but our array’s first shelf has 15 drives in it (14 usable; you did configure a hot spare, right?).  Well, if we configured our array as JBOD (RAID0), then we could simply multiply our IOPs per drive by the number of drives.  In this case:

Array IOPs in JBOD = 167 IOPs per drive x 14 drives = 2,338 "Raw" IOPs per shelf

The MD3000i can be expanded with two additional shelves full of drives (MD1000’s).  This adds another 28 usable drives, bringing a fully populated array (using the above drives) to a total of ~7,000 IOPs.
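The same napkin math scales out to the full array.  Here is a quick sketch, again just illustrative, using the 14-usable-drives-per-shelf assumption from above:

per_drive_iops = 167          # the 15K SAS drive calculated earlier
usable_drives_per_shelf = 14  # 15 drives minus one hot spare

one_shelf = per_drive_iops * usable_drives_per_shelf       # 2,338 "raw" IOPs
full_array = per_drive_iops * usable_drives_per_shelf * 3  # two MD1000 expansion shelves added
print(one_shelf, full_array)  # -> 2338 7014 (roughly 7,000 IOPs fully populated)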

It Never RAIDs But It Pours

But our astute reader may observe that only the most foolhardy sysadmin would ever consider storing critical data on JBOD disk.  We need some sort of RAID protection.  But how do we factor in the performance overhead of RAID?  To answer this question, we need a basic understanding of how the various RAID levels work.  Regardless of which RAID level we are using, a read operation requires only 1 I/O operation.  A single write operation, on the other hand, can require significantly more drive operations depending on the RAID level (see the list below, and the small lookup sketch that follows it).  So, as always, we need to balance our need for performance against our need for storage capacity.

RAID 0:  1 read, 1 write
RAID 1 (or RAID 10):  1 read, 2 writes
RAID 5:  1 read, 4 writes
RAID 6:  1 read, 6 writes
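For the calculations that follow, it is handy to keep those write penalties in a small lookup table.  A sketch in Python (the variable name is my own):

# Per-write I/O penalty by RAID level, matching the list above (reads always cost 1 I/O).
RAID_WRITE_PENALTY = {
    "RAID0": 1,
    "RAID1/10": 2,
    "RAID5": 4,
    "RAID6": 6,
}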

So how many IOP thingies do I need?

To determine our application’s needed “raw” IOPs rating, we need to have a good understanding of our application’s read/write distribution.  Then we can use this formula:

(Total Workload IOPs * Percentage of workload that is read operations) + (Total Workload IOPs * Percentage of workload that is write operations * RAID write penalty) = Needed “Raw” IOPs

For a typical MS Exchange implementation, Microsoft recommends that you plan for two reads for every one write (roughly 66% read, 33% write).  And for an “average light user,” Microsoft estimates roughly 500 IOPs per 1,000 mailboxes.  Based on a RAID6 scenario (the worst-case write penalty from the list above), our disk needs to perform as follows:

(500 * .66) + (500 * .33 * 6) = 1320 IOPs per 1000 mailboxes
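Or, as a little Python sketch that ties the formula and the RAID penalties together (the function name and the 66/33 split are just this post’s assumptions):

def raw_iops(workload_iops, read_pct, write_pct, write_penalty):
    # Reads cost 1 I/O each; writes cost write_penalty I/Os each (see the RAID list above).
    return (workload_iops * read_pct) + (workload_iops * write_pct * write_penalty)

print(round(raw_iops(500, 0.66, 0.33, 6)))  # -> 1320 IOPs per 1,000 mailboxes on RAID6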

This may come as quite a shock to some sysadmins.  An application that only requires 500 IOPs at the system level will need nearly three times that many IOPs at the storage level.

Final Storage Recommendation

So, will a set of MD3000i/MD1000 combos be able to replace our client’s venerable Fiber Channel installation?  Well, from an IOPs perspective the performance is essentially the same at the drive level:  the 15K SAS drives from Dell work out to 167 IOPs per drive, while the legacy 15K FC drives from IBM are rated at 174 IOPs.  When the numbers are this close, it is practically a wash.  It all comes down to the RAID level that is configured on the new array.  And since the iSCSI array can be accessed using the existing Ethernet data network, the client can avoid shelling out $100,000 for a new set of Fiber Channel switches.  So, yes, it would seem that iSCSI wins again.

3 Responses to “The Importance of IOPs in Shared Storage Environments”

  1. Scott Lowe says:

    (Full disclosure: I work for EMC Corporation.)

    Good article on the importance of IOPS and how to calculate them. This is a part of storage sizing that is often overlooked, especially in virtualized environments. However, your discussion seems less to be a comparison of Fibre Channel vs. iSCSI as an access protocol and more a comparison of array vendors and drive types. I tend to tell customers to focus less on the access protocol (FC vs. FCoE vs. iSCSI) and instead focus on getting the right multiprotocol array on the back end. Thanks!

  2. jfiske says:

    Scott, thanks for the kind words. For most of my clients (higher-ed and non-profits) a FC SAN is far too costly to justify, now that iSCSI technologies have matured. Even a fully-meshed Ethernet environment that is physically separate from a client’s data network can be accomplished at 10% of the cost of a comparable FC transport environment. No need for a multiprotocol array when all of your commodity equipment can already speak Ethernet/iSCSI out of the box.
