RE: Your experience - positive or negative - with Oracle on VMWare...

  • From: "Matthew Zito" <mzito@xxxxxxxxxxx>
  • To: "Robert Freeman" <robertgfreeman@xxxxxxxxx>, "Allen, Brandon" <Brandon.Allen@xxxxxxxxxxx>, "Oracle-L Freelists" <oracle-l@xxxxxxxxxxxxx>
  • Date: Thu, 21 May 2009 14:09:43 -0400

Well, we do and we don't, but I can weigh in on a few things here.  As
I've said before, we've got hundreds of VMs on VMWare running
Oracle on Windows 2k3, standalone Oracle on RHEL 4 and 5 and SLES 9 and
10, and Oracle RAC on the same platforms.  Our workload is odd: since all
we do is automation, we do a lot of patching, upgrading, migrations, RAC
cluster creation, etc.  We don't run a normal OLTP or DW-type workload.

What we see, though, is that while single-instance Oracle is very
stable, Oracle RAC does not cope well with VMWare.  There are a couple of
VMWare papers on this subject, but effectively, you can get clock drift
as part of a VM's normal operation.  There are two ways to keep the clock
in sync - the vmware-tools package can do it for you, or you can run NTP.

However, under heavy load on the physical server, whether from CPU
starvation or from maxing out I/O, the clock on your VM can drift
enough that CRS freaks out and reboots one of the nodes, or worse,
vmware-tools can yank the clock back into sync on one node and
cause the reboot that way.
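
One rough way to watch for the kind of clock step described above from
inside a guest is to compare the wall clock against the monotonic clock,
which is not affected by system clock updates.  The sketch below is only
an illustration and does not come from the VMWare papers: the function
names, the 1-second sampling interval and the 0.5-second threshold are
all assumptions, and it says nothing about what CRS itself will tolerate.

```python
import time


def find_clock_steps(samples, threshold=0.5):
    """Return (index, jump_seconds) for each suspicious wall-clock jump.

    samples is a list of (wall, mono) pairs taken at the same moments.
    Between two samples the wall clock and the monotonic clock should
    advance by about the same amount; a large difference suggests the
    wall clock was stepped (for example by a time-sync correction).
    The 0.5 s threshold is an arbitrary assumption for illustration.
    """
    steps = []
    for i in range(1, len(samples)):
        wall_delta = samples[i][0] - samples[i - 1][0]
        mono_delta = samples[i][1] - samples[i - 1][1]
        jump = wall_delta - mono_delta
        if abs(jump) > threshold:
            steps.append((i, jump))
    return steps


def collect_samples(count=60, interval=1.0):
    """Sample both clocks `count` times, `interval` seconds apart."""
    samples = []
    for _ in range(count):
        samples.append((time.time(), time.monotonic()))
        time.sleep(interval)
    return samples
```

Calling find_clock_steps(collect_samples()) on each node during a load
test would list any intervals where the wall clock moved noticeably more
or less than real elapsed time.  A negative jump means the clock was
pulled backwards.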

As far as I/O throughput being faster, I know some folks at a large bank
that really beat the heck out of all the major VM platforms and
saw the same increased I/O throughput on a VM vs. a physical
machine with naive (non-real-world) workloads.  After much back-and-forth
with Linux engineering and the virtualization vendor, they
came to the conclusion that the hypervisor was coalescing hardware
interrupts at high rates much more efficiently than the OS.  When they
switched to actual application workloads, they found a slight
performance hit, but the overall consolidation value made it worth the
decrease in performance, at least for tier-2/3 applications.

VMWare/EMC themselves are pitching VMWare HA as a solution for replacing
RAC or traditional HA solutions for Oracle, even in an unsupported
config.  However, be aware that VMWare HA *only* handles physical
hardware failure, or VM OS crash.  All of the other failure scenarios
that a "real" clustering solution like VCS, MC/ServiceGuard, RAC, etc.
would catch - crashed Oracle instance, down listener, out-of-memory
errors, etc. - will not be caught by VMWare HA.
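
To make the gap concrete, here is a minimal sketch of the sort of
application-level probe that VMWare HA does not perform.  It only checks
whether something is accepting TCP connections on the listener port.  The
function name is hypothetical, and 1521 is assumed only because it is the
usual default listener port; a real cluster agent would also check the
instance itself, for example by logging in and running a query.

```python
import socket


def listener_is_reachable(host, port=1521, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds.

    This only shows that something is accepting connections on the
    port. It does not prove that the Oracle instance behind the
    listener is open or healthy.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A monitoring loop could call this every few seconds and trigger a restart
or failover when it returns False several times in a row, which is
roughly the class of check the clustering products above build in.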

Thanks,
Matt 

-----Original Message-----
From: Robert Freeman [mailto:robertgfreeman@xxxxxxxxx] 
Sent: Thursday, May 21, 2009 1:51 PM
To: Allen, Brandon; Matthew Zito; Oracle-L Freelists
Subject: Re: Your experience - positive or negative - with Oracle on
VMWare...


Very interesting...

I'd love to see the results of a VM that's got some heavy IO/Concurrency
load. Anyone got something like that out there? :-)

RF


 Robert G. Freeman
Oracle ACE
Author:
OCP: Oracle Database 11g Administrator Certified Professional Study
Guide (Sybex)
Oracle Database 11g New Features (Oracle Press)
Portable DBA: Oracle  (Oracle Press)
Oracle Database 10g New Features (Oracle Press)
Oracle9i RMAN Backup and Recovery (Oracle Press)
Oracle9i New Features (Oracle Press)
Other various titles out of print now...
Blog: http://robertgfreeman.blogspot.com 
The LDS Church is looking for DBA's. You do have to be a Church member
in good standing. A lot of kind people write me, concerned I may be
breaking the law by saying you have to be a Church member. It's legal I
promise! :-)
http://pages.sssnet.com/messndal/church/parachurch.pdf



----- Original Message ----
From: "Allen, Brandon" <Brandon.Allen@xxxxxxxxxxx>
To: "robertgfreeman@xxxxxxxxx" <robertgfreeman@xxxxxxxxx>; Matthew Zito
<mzito@xxxxxxxxxxx>; Oracle-L Freelists <oracle-l@xxxxxxxxxxxxx>
Sent: Thursday, May 21, 2009 11:41:56 AM
Subject: RE: Your experience - positive or negative - with Oracle on
VMWare...

Regarding performance, I tested Linux on bare metal vs. Linux on VMWare,
because I was skeptical at first after hearing many horror stories about
I/O performance on VMWare.  I noticed a decrease of up to 10% in the
number of IOPS I'd get when running ORION on VMWare vs. bare metal in
some tests, but in other tests running multiple concurrent executions of
ORION to max out the system the VM would actually outperform the bare
metal.  I didn't have enough time to test as thoroughly as I'd like to
find an explanation for the variations.  I would've liked to try running
the tests on raw devices instead of filesystems to eliminate the caching
effect, but didn't have time to do that and our production system is on
filesystems, so it wasn't really relevant.  I think it's a given that
there will be some CPU and memory overhead for the VM layer; however,
it's probably under 5%, and we didn't even bother testing the limits of
CPU since we weren't planning on pushing the CPU on the box that hard -
it's very rarely a bottleneck with the speed of CPUs these days.  We're
currently running 17 small databases
(about 10GB each) across three Dell 2950s, each with two quad-core Xeons
(X5450@xxxxxxx) & 32GB RAM, fiber attached to a CX3-40 disk array with
RAID 5 groups striped across 5 disks - and performance has been great.
It's not exactly a "large" environment, but there are a total of about
250 concurrent users running an Oracle Forms ERP application and you can
see here from "sar -u" that the boxes are pretty much sitting idle:

08:30:01 AM       CPU     %user     %nice   %system   %iowait    %steal     %idle
08:40:01 AM       all      3.60      0.00      1.64      0.22      0.00     94.54
08:50:01 AM       all      2.71      0.00      1.33      0.37      0.00     95.59
09:00:01 AM       all      3.81      0.01      1.42      0.24      0.00     94.52
09:10:01 AM       all      4.91      0.00      1.60      0.27      0.00     93.21
09:20:01 AM       all      2.67      0.00      1.30      0.16      0.00     95.88
09:30:01 AM       all      3.07      0.00      1.47      0.23      0.00     95.23
09:40:01 AM       all      4.94      0.00      2.12      0.31      0.00     92.63
09:50:01 AM       all      1.64      0.00      1.16      0.12      0.00     97.07
10:00:01 AM       all      1.40      0.01      1.10      0.14      0.00     97.35
10:10:01 AM       all      2.53      0.00      1.64      0.54      0.00     95.29
10:20:01 AM       all      3.13      0.00      1.40      0.34      0.00     95.13
10:30:01 AM       all      2.31      0.00      1.35      0.25      0.00     96.09
Average:          all      2.47      0.00      1.22      0.26      0.00     96.05
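
If you want to summarize output like this rather than eyeball it, a small
script can pull out the %idle column.  The sketch below assumes the
12-hour "sar -u" layout with each interval on one line and %idle as the
last field; the function names and the 90% threshold are made up for
illustration, and the sample rows are copied from the listing.

```python
def parse_sar_cpu(text):
    """Parse `sar -u` data lines into (timestamp, idle_percent) pairs.

    Assumes lines like:
        08:40:01 AM  all  3.60  0.00  1.64  0.22  0.00  94.54
    Header, "Average:" and blank lines are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Expect: time, AM/PM, "all", then six numeric columns.
        if len(fields) != 9 or fields[2] != "all":
            continue
        rows.append((f"{fields[0]} {fields[1]}", float(fields[-1])))
    return rows


def busy_intervals(rows, min_idle=90.0):
    """Return the rows whose %idle fell below min_idle."""
    return [row for row in rows if row[1] < min_idle]


SAMPLE = """\
08:30:01 AM       CPU     %user     %nice   %system   %iowait    %steal     %idle
08:40:01 AM       all      3.60      0.00      1.64      0.22      0.00     94.54
09:40:01 AM       all      4.94      0.00      2.12      0.31      0.00     92.63
Average:          all      2.47      0.00      1.22      0.26      0.00     96.05
"""

rows = parse_sar_cpu(SAMPLE)
print(rows)
print(busy_intervals(rows))
```

With these rows nothing drops below the assumed 90% threshold, which
matches the "pretty much sitting idle" reading.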


-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of Robert Freeman

I'm interested in performance with respect to IO, memory utilization,
CPU and the like



--
//www.freelists.org/webpage/oracle-l

