RE: Storage array advice anyone?

  • From: DENNIS WILLIAMS <DWILLIAMS@xxxxxxxxxxxxx>
  • To: mzito@xxxxxxxxxxx, Oracle-L <oracle-l@xxxxxxxxxxxxx>
  • Date: Thu, 16 Dec 2004 15:50:46 -0600

Matthew - I've felt there is a need for a book on this topic for several
years. Judging by your posts on this list, you are just the person to write
it. I'm looking forward to your book.

Dennis Williams
Lifetouch, Inc.

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Matthew Zito
Sent: Thursday, December 16, 2004 12:54 AM
To: Oracle-L
Subject: Re: Storage array advice anyone?

This is the sort of issue that comes up often on oracle-l - in a 
nutshell, "Is raid 5 acceptable for database workloads".  There's a lot 
of great writing that has been done on the subject, but the tragedy is 
that a lot of it is very very old, and could stand a rewrite.  I cover 
some of these topics in my forthcoming storage book from O'Reilly 
(*plug* *plug*), but my overall opinion is that if you can afford the 
utilization penalty for RAID-10, then you should take it every time.

However, RAID-F is not the terrible thing it's made out to be.  The 
things that, in my opinion, have significantly changed the landscape 
for parity-protected RAID levels:

-RAID-6 (aka RAID-DP) - basically adding an extra parity disk for very 
large RAID-5 sets, so that it takes three disk failures before data 
loss occurs (that is, data loss occurs on the third disk that fails)
-Virtualization/abstraction of storage objects - when the LUN you are 
sending I/Os to is comprised of chunks from 50 different spindles from 
10 different RAID-5 groups, the performance is excellent.  Another 
example of this is HSM allowing for infrequently used blocks to be 
"paged" out to RAID-5/6 devices with the high-performance blocks 
remaining on RAID-10.  Yet another example is "third-mirror" or BCV or 
Shadow Copy (whatever the vendor term of the week is) for a 
point-in-time copy of your database - but that addresses the 
recoverability issue, not so much the availability issue
-Predictive failure analysis - basically, most drives soft fail and 
throw errors before they hard fail (head crash, etc.).  More modern 
disk arrays will preemptively bring in a hot spare to replace a drive 
that has had more than a certain number of errors.  Reconstruction 
occurs directly from the dying disk until it is not responding properly 
-Hardware/ASIC based parity checksumming - performance improvement, 
plain and simple, due to pipelining and parallelization of parity 
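
To make the parity idea behind these bullets concrete, here is a minimal 
Python sketch of single-parity (RAID-5 style) protection.  It is an 
illustration only, not any vendor's implementation: the function names 
and the four-byte "blocks" are made up for the example, and the second 
parity in RAID-6/RAID-DP uses different math that is not shown here.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized blocks (hypothetical helper)."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must be the same length")
    # XOR the bytes that sit at the same offset in every block.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving_blocks, parity):
    """Recover the one missing data block from the survivors plus parity."""
    return xor_parity(list(surviving_blocks) + [parity])


# Three data "disks" and one parity "disk" in a single stripe.
data = [b"ORCL", b"DATA", b"FILE"]
parity = xor_parity(data)

# Pretend the second disk died; rebuild it from the other two plus parity.
recovered = rebuild_missing([data[0], data[2]], parity)
assert recovered == data[1]
```

The same XOR that rebuilds a lost block is what has to be recomputed on 
every write, which is the work the hardware parity engines take on.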

There are really two arguments that seem to come up against RAID-F:

-Performance  - RAID-F is slow
-Availability - RAID-F is inherently less reliable

The performance argument simply doesn't stand up as an absolute 
anymore.  There are three reasons for that - RAID-5 implementations have 
gotten better, newer technologies like the ones I list above remove a 
lot of the shortcomings of RAID-5, and storage in general has gotten 
faster.  Many databases that I have seen were very carefully tuned for 
the specific array, best practices, logs and indexes and datafiles all 
on separate disks, etc., and would have been just as fast had they 
thrown everything onto one big volume and let the array sort it out.  
In fact, I see many organizations creating many small storage objects 
for various performance-driven purposes, when they were getting carved 
out of the same RAID group, rendering any benefit imaginary at best.  
RAID-5 may not always be as fast as RAID-10, but often it doesn't need 
to be.  Look at it this way - we'd all like to be running our databases 
on the biggest iron possible to improve performance, but we're forced 
to deal with the servers that are acceptable from a budgetary and 
management perspective.  The same is true of storage.

The availability argument is true, though the above techniques have 
mitigated it over time.  The key issue, though, is this - what is 
the desired/required availability for the database?  We would 
all like to have a 24/7 database that's as reliable as possible, but we 
all make decisions about where to cut corners for availability.  Many 
organizations trust their storage arrays to be redundant from an 
operating perspective, when most of the truly damaging outages I've 
seen in my time working with storage were due to array failure having 
nothing to do with any RAID group configuration.  For example, in most 
fibre channel arrays today, yanking an active disk drive has a 
reasonable probability of taking down the entire fibre channel loop, 
killing off up to 128 drives at once.  Yet very few organizations 
mirror across storage arrays online (though many mirror them remotely 
for DR).

The general argument FOR RAID-10 seems to be, "It's better, and it 
doesn't really cost THAT much more".  The fact remains, it does cost 
more, and can cost a great deal more than a RAID-F configuration, 
depending on group sizing.  For example, 140 disks in two different 
configurations - 10 14-drive RAID-10 sets and 10 14-drive RAID-6 sets 
(two reasonable standard configurations provided by EMC Clariion and 
Netapp NearStore).  With 73GB drives, RAID-10 nets you just a shade 
over 5TB, while RAID-6 gets you 8.7TB.
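
For anyone who wants to check the arithmetic or plug in their own group 
sizes, here is a rough Python sketch.  It assumes decimal units (1TB = 
1000GB), no hot spares, and no formatting overhead; the function name 
and parameters are mine, not anything a vendor tool provides.

```python
def usable_gb(groups, drives_per_group, drive_gb, redundancy_drives_per_group):
    """Raw usable capacity in GB, ignoring spares and formatting overhead."""
    data_drives = drives_per_group - redundancy_drives_per_group
    return groups * data_drives * drive_gb


# 140 x 73GB drives, carved into ten 14-drive sets either way.
raid10 = usable_gb(10, 14, 73, 7)  # half of each set holds mirror copies
raid6 = usable_gb(10, 14, 73, 2)   # two drives' worth of parity per set

print(f"RAID-10: {raid10 / 1000:.2f} TB")  # prints "RAID-10: 5.11 TB"
print(f"RAID-6:  {raid6 / 1000:.2f} TB")   # prints "RAID-6:  8.76 TB"
```

Real arrays will come in somewhat lower once spares and overhead are 
taken out, but the ratio between the two layouts is the point.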

The ideal way to look at things is from the business perspective - is 
the improved reliability for RAID-10 important enough _for this 
application_ that it is worth the increased cost?  Vet your vendor 
heavily - if necessary, hire someone impartial to come in and explain 
to you exactly what the gotchas are going to be with the products 
you'll be buying.  Then figure out what your exposure is going to be - 
if you run RAID-10, will you be buying another disk array in a year?  
If you run RAID-5 and you lose two disks, how long will it take to 

Again, I'm not defending RAID-F as being as good as RAID-10.   I'm 
simply saying that immediately disregarding RAID-5/F as a waste of time 
based on old information and preconceptions is like disregarding Linux 
based on the way it was back in 1998.  Times change, and keeping costs 
down is something that, imho, not enough technology people think about.

Thanks much,

Matthew Zito
GridApp Systems
Email: mzito@xxxxxxxxxxx
Cell: 646-220-3551
Phone: 212-358-8211 x 359

On Dec 15, 2004, at 8:36 PM, Joel Garry wrote:
>> On Tue, 14 Dec 2004 10:47:20 +0000, chris@xxxxxxxxxxxxxxxxxxxxx
>>> My experience is that with either RAID 5 or 10 you have to be
>>> unbelievably unlucky to lose data providing disks are replaced
>>> when they fail and not left for a few days or even more. You are
>>> talking extremely remote. It might be an idea to get someone to
>>> do the maths and work out the probabilities.
>> I, for one, have been that unlucky on at least one occasion
> Me too.  No one was listening to the standby machine 350 miles away,
> going <little fly voice> Help me!  Help me!</little fly voice>
> Also, more often, seen what Cary points out, failures happen in
> clusters or dominos.
> Joel Garry
> --

