Re: RAID Reliability Calculations

From: Paul Drake <bdbafh@xxxxxxxxx>
To: joelgarry@xxxxxxxxxxxxxxx
Date: Fri, 7 Jan 2005 13:41:36 -0500

Joel,

solar flares.
seriously.
During one week of solar flares, 2 large client sites lost their air
conditioning systems (that were not fault tolerant). This was the only
time in 8 years that I ever heard of such a thing occurring.
Both sites lost hard drives due to excessive heat.
One site continued to lose hard drives at a higher rate than normal
afterwards. Those drives likely sustained damage that was not fatal at
the time, but cost them serious hit points.

fear solar flares.
have redundant cooling units in place.
monitor server room conditions and have alarms sent to pagers.
have systems set to shut down if temperatures rise above thresholds.
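
a rough sketch of the last two points, in Python. The threshold values,
the Thresholds class and the action_for function are all made up for
illustration; reading the actual sensor, paging someone and shutting
hosts down are site-specific and left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Both values are assumptions for illustration, not recommendations.
    alarm_c: float = 30.0     # page someone above this temperature
    shutdown_c: float = 40.0  # shut systems down above this temperature


def action_for(temp_c: float, limits: Thresholds = Thresholds()) -> str:
    """Return 'ok', 'alarm' or 'shutdown' for one temperature reading."""
    if temp_c > limits.shutdown_c:
        return "shutdown"
    if temp_c > limits.alarm_c:
        return "alarm"
    return "ok"


if __name__ == "__main__":
    # Hypothetical readings in degrees Celsius.
    for reading in (22.5, 31.0, 41.2):
        print(reading, action_for(reading))
```

the shutdown check comes first so that a reading above both thresholds
triggers a shutdown rather than just another page.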

Paul

On Fri, 7 Jan 2005 10:19:52 -0800, Joel Garry <joelgarry@xxxxxxxxxxxxxxx> wrote:
> chris@xxxxxxxxxxxxxxxxxxxxx wrote:
> 
> > 10 TB of data, 72 GB drives with 1,400,000 hrs MTTF
> 
> First, thanks for doing this, I think such exercises are useful and
> informative.
> 
> But just a small MTTF rant:
> 
> I think the manufacturers' MTTF figures are statistically abused.  I
> think they throw out some failures that happen, and have unrealistic
> physical testing environments compared to the real world.
> 
> By the calculations, I should maybe have seen one multiple disk failure
> in my career, yet I sometimes see several per year.
> 
> If anyone can shed some light on these notions, feel free.
> 
> Joel Garry
> http://www.garry.to
> 
> --
> //www.freelists.org/webpage/oracle-l
>
