Re: RAC in NAS

  • From: Mladen Gogala <gogala@xxxxxxxxxxxxx>
  • To: Matthew Zito <mzito@xxxxxxxxxxx>
  • Date: Thu, 27 Jul 2006 02:13:30 -0400

On 07/26/2006 11:44:48 PM, Matthew Zito wrote:
> 
> Not to start a holy war on this, but it *IS* possible to have very
> successful horizontal scaling using RAC.  One of our appliance customers
> has scaled to a 10-node cluster linearly, saving them over $800,000 just
> in infrastructure costs, even counting the additional cost of the RAC
> licenses and our software.  We have several other customers that have
> scaled to 4-node clusters easily and saved *only* $100k-200k.
> 
> For sure, RAC is not a magical solution, but there are a number of good
> reasons to consider it, such as HA, reduced infrastructure cost, and
> incremental scalability.  There's increased complexity surrounding it as
> well, which is why there's folks like my company and Kevin Closson's,
> Polyserve, whose value proposition is reducing that complexity.

Matt, I couldn't agree more: it is possible to scale a system horizontally.
This, however, requires careful thinking, preparation and a willingness to
pay for the services of GridApp or Polyserve. Unfortunately, this is not
what I see. What I see is an attitude like this: I have a SUN 6000 or HP
9000/L that is getting old. SUN and HP are expensive, so I'll be smart,
order 4 dual AMD64 boxes running RH Linux with NetApp or Clariion storage,
and have the same power as one of those new and shiny HP 9000 boxes or a
P5 595. To save even more money, the company goes with ASM, so it isn't
forced to buy an expensive external clustering solution like VCS,
SUNCluster, HP ServiceGuard or IBM CSM/GPFS. GPFS is, of course, IBM's
famous General Protection Fault System :). Of course, the system is
configured without the necessary
redundancy and using off-the-shelf components, like Gigabit Ethernet. The
first thing to notice is that I/O is typically rather slow and that, despite
cache fusion, RAC still needs to sync some blocks by writing them to disk.
Second, the DLM tends to be rather CPU and memory intensive, and those PC
boxes (as opposed to the Mac boxes in the commercials now running) have
rather limited bus capacity, with very few fancy features like separate
cache lines, complex bus arbitration logic, write-back caches on each CPU,
a fully associative TLB, time signal buffering or memory attached I/O
adapters. It is precisely the fact that those boxes haven't got all the
bells and whistles of a SuperDome or P5 595 that makes them so cheap.
When that happens, one has to cope with bus saturation, storage array
hiccups and slowdowns and, last but not least, waits for the global locks.
That type of scaling amounts to self-crucifixion, a very popular custom in
the Philippines, especially around Easter. What people do not understand is
that HA solutions have never been cheap. Going cheap on the critical
component that is supposed to fully support the company's business amounts
to shooting yourself in the foot. Twice. Hiring an expert company means
that the company is smart and isn't falling for the cheapo stuff, the Yugo
of HA solutions.

-- 
Mladen Gogala
http://www.mgogala.com

--
//www.freelists.org/webpage/oracle-l

