Re: 9i RAC or 10g RAC ?

  • From: "Don Granaman" <granaman@xxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Wed, 12 May 2004 13:42:24 -0700

I'll chime in...

Cary, Zhu, and some others are correct.  If you are looking for
"performance", don't use RAC.  The cost of everything, including just
starting the instance, is greater.  If you want basic availability, don't
use RAC.  It is more complex and has more "moving parts" that can go wrong.
Please refer to Mogens' paper "You Probably Don't Need RAC" at
http://www.miracleas.dk (via Writings From Mogens).

My experience with RAC on RedHat AS 2.1 (and now ES 3.0) is similar to
Zhu's.  The behavior of 9.0.x was almost comical.  It seemed sometimes that
a single stray neutrino could crash everything even near a cluster
component.  Even with 9.2.0.4 we had a few additional hardware (?) issues
that were "interesting".  One was a driver (for the EMC Clariion) that would
occasionally cause a node to "lose its LUNs".  Even though they were still
accessible from the other node, its instance too would die a horrible death.
Then after reboot, instance recovery would take ~45 minutes - even when
there was little to do.  There would be little or no I/O, very little CPU
utilization, but the instance would sit there for 43 minutes or so, then
realize that it was supposed to be doing recovery, suddenly come to life,
and complete in 70 seconds or so.  Support finally created an (as yet
unpublished) bug on it.  However, after several driver updates (my boss is a
VP, the SA, and was a sr systems engineer at EMC for years - and has a few
connections...), all this silliness finally disappeared, as well as some
other issues (e.g. 12170 two-task layer errors) that were evidently tied to
the driver.  I got to close three long-standing TARs!  [GD: "What a long,
strange trip it's been..."]

After six months, multiple OS patches, multiple QLogic driver updates, a few
Oracle patches and workarounds, and some application process partitioning,
the system is now fairly stable (with 9.2.0.4).  Not as stable as on 8.1.7.4
exclusive though.  The interconnect speed, as mentioned previously, is
important, but beware that Linux (RedHat on Dell at least) has some limiting
issues (multiple/redundancy/crashworthiness, speed, etc.) compared to more
"mature" cluster implementations (e.g. Sun PDB, HP, etc.).  I haven't really
had any significant problems with it, but
I don't have a "randomly splatter processes over an array of
nodes/instances" implementation.

What I like best about RAC is that I can often exile ill-behaved processes
(e.g. "What's a bind variable?", LIO pigs, cache-trashers, and their ilk) to
node(s) not running more critical and well-behaved code.  You do get
multiple redo threads and some other things that may help in certain
situations.  To quote one true OPS/RAC expert (name withheld to protect an
Oracle employee from "political incorrectness" charges): "I like RAC, I like
Linux, but I don't like RAC on Linux - at least not yet".

I have no experience with 10g yet, but if you want "cheap", I find the
concept of Oracle One RAC on a large cluster of 2-CPU Linux machines both
intriguing and scary.  Yep - with 10g, RAC is available for SE; EE is not
required!  At the serious risk of sounding like a dinosaur/heretic, if you
can intelligently partition your application processes between
nodes/instances, it might work well on a (Oraclely speaking) shoestring
budget.
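
A minimal sketch of what partitioning application processes between
nodes/instances could look like, in Python.  Everything here is hypothetical
(the instance names PROD1/PROD2, the workload names, the helper functions);
it only illustrates the idea of pinning each workload class to one instance
and sending anything unclassified to the non-critical node.  It is not how
any particular site or Oracle feature does it.

```python
# Hypothetical instance names, for illustration only.
CRITICAL_INSTANCE = "PROD1"    # well-behaved, business-critical code
QUARANTINE_INSTANCE = "PROD2"  # LIO pigs, cache-trashers, ad hoc work

# Hypothetical static map of workload class -> instance.
ROUTES = {
    "order_entry": CRITICAL_INSTANCE,
    "billing": CRITICAL_INSTANCE,
    "adhoc_reports": QUARANTINE_INSTANCE,
    "legacy_batch": QUARANTINE_INSTANCE,
}


def instance_for(workload, default=QUARANTINE_INSTANCE):
    """Return the instance a workload should connect to.

    Unknown workloads fall back to the quarantine instance so they
    cannot disturb the critical node.
    """
    return ROUTES.get(workload, default)


def group_by_instance(routes):
    """Invert the route map: instance -> sorted list of workloads."""
    groups = {}
    for workload, instance in sorted(routes.items()):
        groups.setdefault(instance, []).append(workload)
    return groups


if __name__ == "__main__":
    for instance, workloads in sorted(group_by_instance(ROUTES).items()):
        print(f"{instance}: {', '.join(workloads)}")
```

In practice the map would live in whatever decides which connect alias each
application uses; the point is only that every process has one "home"
instance instead of being splattered randomly across the cluster.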

One thing that has always been the bane of Oracle on Linux is ruinStaller.
The compatibility matrix for Linux is like "where's Waldo?"   If you have
the right version of Linux, the right version of Oracle, the right
glibc/compatibility, the right JRE/JDK ("M-O-U-S-E"), all the particular
environment quirks for the combination (e.g. LD_ASSUME_KERNEL, ad nauseam),
and have sacrificed enough chickens, it works - usually.  The 10g installer
*appears* to be a vast improvement in this respect.  The 9.2.0.5 patchset
even uses it.
Even better, you can get an RPM for a 10g "lite" client install!

General advice:  Get the fastest, baddest CPUs you can (at least for the
2-CPU nodes model) - RAC has additional overhead.  Then there is "krefilld"
(sp? - don't currently have a Linux shell session)...  Stuff the boxes with
memory - the first issue we encountered was memory, even with significantly
more than on the old 8.1.7.4 exclusive server.

-Don Granaman
(verbose) OraSaurus

----- Original Message ----- 
From: "Singh, Ratnesh (GEI, GEFA, Contractor)" <Ratnesh.Singh@xxxxxx>
To: <oracle-l@xxxxxxxxxxxxx>
Sent: Tuesday, May 11, 2004 5:13 PM
Subject: RE: 9i RAC or 10g RAC ?


> Please advise if i am wrong, but I'm hoping that moving to RAC would
> help improve performance because :
>
> Our Solaris production box, which is 12 cpu 1200 mhz,24 gb ram is cpu
> bound at peak working hours, and its lease ends in 3 months.
> Management does not want to spend money on a bigger Solaris box. So we
> decided to purchase new linux boxes.
>
> Since 12+ cpu linux boxes are very expensive, we have decided to go for
> 4cpu linux boxes featuring 2.2ghz Opteron cpu's on 9i or 10g rac, and a
> new san with faster disks.
>
> rac is being used primarily because i want to link together these 3 or 4
> linux boxes.
> The combination of these factors, faster cpu's , faster disks and bigger
> ram should provide better performance, i hope ?
>
> any advice is appreciated.
>
> thanks & regards
> ratnesh
>
>
[...snipped...]
> -----------------------------------------------------------------


