Re: Oracle RAC on VM

  • From: Markus Michalewicz <markus.michalewicz@xxxxxxxxxx>
  • To: kevin.lidh@xxxxxxxxx
  • Date: Mon, 25 Aug 2014 16:02:05 -0700

Dear all, Kevin,

Regarding "Oracle RAC on VM", many good comments have already been made.

Please let me add the following thoughts and questions on the original post and on some of the comments already made:

To me, using Oracle RAC (or an Oracle Database for that matter) on either physical servers or in a VM environment is an orthogonal question, if you consider VMs to be a different way of providing a server. From the perspective of a DB that runs on the server, the fact that this server may be a virtual server is typically transparent - at least, it should be.

Along these lines of thinking, consider the "virtual server" a server with certain capabilities. People often discuss "RAC" vs "virtualization", whereas in my thinking it should be "RAC and virtualization", depending on what factors are considered and to what degree. Let me give an example: A typical VM environment presents each VM as a separate node to the database and, assuming that you use a per-node Oracle DB home deployment (whether this is a single instance or RAC does not matter in this case), you still need to consider patching this home and, in the end, the DB. No VM provides a solution in this regard, as VMs do not optimize application-specific operations within the VM (this is part of the transparency paradigm). Of course, certain VM features (such as snapshotting, which other solutions provide, too - see here: http://www.oracle.com/technetwork/database/database-technologies/cloud-storage/oop-patching-of-acfs-shared-oh-1885763.pdf) can be used to work around certain restrictions or improve the situation, but that is probably a workaround at best. With Oracle 12c you also get "Rapid Home Provisioning", which provides a more efficient approach to initial deployments and patching, especially in VM or Cloud environments: https://blogs.oracle.com/dbcloudcoverage/entry/rapid_home_provisioning_simplifies_oracle

Coming back to considering the "virtual server" a server with certain capabilities, a lot of points have been made for HA and scalability, in which case "HA" often seems to exclude "maintenance related" downtime (for which I made a case before) and "meeting SLAs". For the latter, VMs do not provide a compelling solution for databases, either, as most VM solutions are limited to the ad hoc capabilities they provide to react to (sudden) changes in demand. I completely understand that demand changes over time can be addressed by online VM relocation features, which holds true for other solutions, too; e.g. Oracle RAC One Node, which provides Online Database Relocation. Last but not least, any per-node impact (such as memory leaks inherent to the server) would affect the virtual server as much as a physical one in most cases. Whether a virtual server or a physical one is used just determines how one can react to it - at least at first glance.

As the Oracle Database (single instance or RAC) was designed to run on physical hardware, it includes quite a few features that improve HA as part of its design. In VM environments, and for some failures, alternative VM-based features can be used and, if set up correctly, those VM-based features can very well be used in conjunction with functionality that the database provides. However, there will be certain requirements on running a database that VMs cannot tackle at present. These requirements will determine which database solutions to use (in addition) as much as they would on a physical machine. Seen that way, and unless one optimizes specifically, the Oracle Database (with or without RAC) itself provides most of its value on physical machines (simply by not having to purchase additional solutions). This holds especially true for Oracle RAC, which comes with a stack that provides a lot of "free of charge" features that can even be used outside of RAC, simply with the Oracle Database. More details can be found in this presentation: http://www.slideshare.net/MarkusMichalewicz/oracle-database-12c-with-real-application-clusters-rac-high-availability-ha-best-practices (recording available here: http://www.oracleracsig.org/pls/apex/f?p=105:206:1092883219471001::NO)

If you consider factors such as performance (in general or w.r.t. "servers closer to the storage"), then the same paradigm applies; consider the "virtual server" a server with certain capabilities. In simple terms, there is either an inherent performance penalty in VM environments / on those servers or there isn't. Personally, I think, you can get the best performance on physical environments for many reasons and if you are optimizing for high performance environments, in which time-based performance analysis is part of the design, then I recommend you use physical machines. If, however, the nature of the databases to run in VMs is not as such or there are other strong reasons to use VM, then make sure that you get the best out of those environments. For Oracle RAC on Oracle VM, use this paper (long overdue for an update, I know, ... coming soon): http://www.oracle.com/technetwork/database/options/clustering/oracle-rac-in-oracle-vm-environment-131948.pdf

Looking at Oracle as an example, all Engineered Systems are cluster based (at least 2 nodes) and use a physical deployment model primarily. The Oracle Database Appliance (ODA) also comes with a virtual deployment model as part of the "solutions in a box" idea. The big difference between this solution and a generic VM solution, however, is the "engineered part". One of the challenges I have seen in VM environments is combining arbitrary hardware with a VM implementation. You would want to make sure that the combination of hardware and the respective VM solution as well as its deployment architecture has been thought through with the use cases (e.g. whether single instance, RAC, or generic applications will be used within the VMs) in mind. The paper linked above tries to convey this message, as virtualization can never overcome physical limitations (the "weakest link" idea).

This leads me to my concluding statement, or rather question: A lot of times I hear people quoting the Oracle support policy for not running an Oracle product in VMware environments. While I understand the concern, I know that customers are running Oracle products (Oracle Databases / RAC) in these environments. I would be very interested to read about cases (SR numbers) where this policy has been used to decline support, as only certain service requests would require reproducing them on physical hardware. Those performance or stability related questions, however, lead to the policy. Oracle simply cannot test all combinations of VM solutions and hardware configurations. Hence, Oracle VM is the only certified VM solution (x86) for Oracle RAC, as this eliminates at least one variable here.

Sorry for the long email, but I thought, I would share my thoughts. Hope it helps. Thanks,
      Markus


On 8/22/2014 10:05 PM, FreeLists Mailing List Manager wrote:
oracle-l Digest Fri, 22 Aug 2014        Volume: 11  Issue: 236

In This Issue:
                Re: Oracle RAC on VM
                Re: Oracle RAC on VM
                Re: Oracle RAC on VM
                Re: Oracle RAC on VM
                Re: Oracle RAC on VM
                Re: DCD dead connection detection in 12c
                Re: Oracle RAC on VM
                Re: Oracle RAC on VM
                RE: Oracle RAC on VM
                RE: Oracle RAC on VM
                Calculate additional space needed to reduce percent used on
                Re: Calculate additional space needed to reduce percent used
                Re: Oracle RAC on VM

----------------------------------------------------------------------

Date: Fri, 22 Aug 2014 10:01:24 +0200
From: przemolicc@xxxxxxxxx
Subject: Re: Oracle RAC on VM

Pro VM:
- snapshots before any patching, upgrade - very, very handy feature :-)
- performance: we migrated some Oracle-s (EBS) to VMware and nobody is
complaining (it is not very loaded though ...)

Regards
P.

From: "Kevin Lidh" <kevin.lidh@xxxxxxxxx>
To: oracle-l@xxxxxxxxxxxxx;
Sent: 0:16 Friday 2014-08-22
Subject: Oracle RAC on VM

As a DBA, I never wanted to work on Oracle on VMware but it seems to be
the trend.  Now that I’m a manager, I’m looking to propose moving to RAC
for HA and also back to physical machines.  Since this goes against the
strategic direction of our organization, I’m sure I’ll be asked why we
can’t do RAC on VMs.  I have my personal opinions about this but I was
wondering what the broader audience of experts believe.

Factors I’m considering are:

1)      Servers closer to the storage for performance.  In
virtualization, you have an intermediary processing your requests and
responses.

2)      Access to all resources licensed.  We keep a certain percentage
of our hosts free to handle the load in case one in the cluster fails.
With RAC, you have access to all the resources all the time.  And since you
have to pay for it all anyway, I see that as a good thing.

3)      Performance in general.  I don’t have any evidence but I can’t
believe that another layer between my OS calls and the hardware could be as
fast as not having that layer.

Thanks in advance,

Kevin

------------------------------

Date: Fri, 22 Aug 2014 04:27:58 -0500
Subject: Re: Oracle RAC on VM
From: Justin Mungal <justin@xxxxxxx>

If you open an SR and Oracle thinks it's the hypervisor, they will tell you
to reproduce the issue in a non-virtualized environment in order to
continue getting support. This has never happened to me, but we don't have
that many virtualized systems running Oracle.

On Thu, Aug 21, 2014 at 5:14 PM, Kevin Lidh <kevin.lidh@xxxxxxxxx> wrote:

As a DBA, I never wanted to work on Oracle on VMware but it seems to be
the trend.  Now that I’m a manager, I’m looking to propose moving to RAC
for HA and also back to physical machines.  Since this goes against the
strategic direction of our organization, I’m sure I’ll be asked why we
can’t do RAC on VMs.  I have my personal opinions about this but I was
wondering what the broader audience of experts believe.



Factors I’m considering are:

1)      Servers closer to the storage for performance.  In
virtualization, you have an intermediary processing your requests and
responses.

2)      Access to all resources licensed.  We keep a certain percentage
of our hosts free to handle the load in case one in the cluster fails.
With RAC, you have access to all the resources all the time.  And since you
have to pay for it all anyway, I see that as a good thing.

3)      Performance in general.  I don’t have any evidence but I can’t
believe that another layer between my OS calls and the hardware could be as
fast as not having that layer.



Thanks in advance,



Kevin


------------------------------

Date: Fri, 22 Aug 2014 08:44:14 -0600
From: Tim Gorman <tim@xxxxxxxxx>
Subject: Re: Oracle RAC on VM

Kevin,
I had to endure debates 20 years ago when some people felt that RAID-0
striping in storage consumed more processing than it provided benefit.
Around the same time, debate about setting the Oracle parameter
TIMED_STATISTICS was rife.  I argued that capturing timings within the
V$ views more than compensated for the additional processing it took
by getting rid of meaningless ratios and providing the best
tuning metric of all:  time.  Up to the late 1990s, I frequently
encountered production systems with TIMED_STATISTICS=FALSE set by DBAs
who felt they were wringing every last cycle from the database, when in
fact they turned the database into a black box which could not be
optimized scientifically. Before that, in the 1980s, I recall a debate
with a VAX VMS sysadmin who asserted that compiling a program with
certain optimizations enabled consumed more system resources than the
program saved over the program's lifetime.  He also argued that
compiling a program more than 10 times during the course of development
was negative ROI as well, that developers should think long and
carefully before compiling.  Almost 30 years later, I'm still coming up
with retorts that I wish I had thought of back then.

So I believe that the argument that the hypervisor, as an intermediary,
saps performance is likewise missing the most important
points and is disastrously incorrect.  You have to break eggs to make an
omelette.  As long as you're not using any of the 43 remaining Faberge
eggs or those from an endangered species, the omelette is worth it.

There is a misconception that virtual machines exist only in a
free-for-all environment involving a cloud of virtuals on a small number
of overworked physicals, resulting in unpredictable performance and
behavior.  This can be true, and frequently is true.

Virtual machines used in production are often configured with vCPU and
vRAM reservations, so that the resources are reserved for their use only
and are never allocated to other virtual machines.  Setting VM
hyperthreading to "none" is a way to prevent dilution of CPU as well.
And of course, virtual machines can always be configured 1:1 on physical
servers if desired.

Six years ago I was hired by a public transportation agency to assist
them implementing Oracle RAC and DataGuard.  It turned out that none of
their applications consumed more than half the resources of their
servers, so there was no need for the scalability of RAC, and so I
talked them out of that.  Then, with VMware VMotion replicating entire
virtual machines and not just the database (as with DataGuard) leaving
the necessity to replicate file-systems with rsync, it was easy to
convince them to use VMotion for fault-tolerance and high-availability.
As they had only one data center at the time, business continuance and
disaster recovery were not in scope.  This allowed the government agency
to operate their Oracle databases as straight standalone
non-RAC non-DataGuard instances, applying the KISS principle where it
was needed most.

So, although it was a cushy government contract in a downtown location
likely to last for years, I talked myself out of that contract within 6
months.  My prime contractor was surprised and a bit disappointed, and
Oracle sales was angry enough to spit nails, but it was the right course
for them and I'd do the same thing again.

However, sometimes folks can't be saved;  they later bought an Exadata,
so Oracle got their licenses back in the end.  :-)

Hope this helps...

-Tim



On 8/22/14, 2:01, przemolicc@xxxxxxxxx wrote:
Pro VM:
- snapshots before any patching, upgrade - very, very handy feature :-)
- performance: we migrated some Oracle-s (EBS) to VMware and nobody is
complaining (it is not very loaded though ...)

Regards
P.

From: "Kevin Lidh" <kevin.lidh@xxxxxxxxx>
To: oracle-l@xxxxxxxxxxxxx;
Sent: 0:16 Friday 2014-08-22
Subject: Oracle RAC on VM

     As a DBA, I never wanted to work on Oracle on VMware but it seems
     to be the trend.  Now that I’m a manager, I’m looking to propose
     moving to RAC for HA and also back to physical machines.  Since
     this goes against the strategic direction of our organization, I’m
     sure I’ll be asked why we can’t do RAC on VMs.  I have my personal
     opinions about this but I was wondering what the broader audience
     of experts believe.

     Factors I’m considering are:

     1)Servers closer to the storage for performance.  In
     virtualization, you have an intermediary processing your requests
     and responses.

     2)Access to all resources licensed.  We keep a certain percentage
     of our hosts free to handle the load in case one in the cluster
     fails.  With RAC, you have access to all the resources all the
     time.  And since you have to pay for it all anyway, I see that as
     a good thing.

     3)Performance in general.  I don’t have any evidence but I can’t
     believe that another layer between my OS calls and the hardware
     could be as fast as not having that layer.

     Thanks in advance,

     Kevin






------------------------------

Date: Fri, 22 Aug 2014 08:07:49 -0700
Subject: Re: Oracle RAC on VM
From: John Piwowar <jpiwowar@xxxxxxxxx>

I hear this argument often, and when I do, I encourage people to consider:
1) if you open an SR and Oracle thinks it's a hardware or OS problem, they
will likely direct you to the HW/OS vendor. No reason to expect anything
different with hypervisor problems.

2) If *Oracle* is your HW, OS, or hypervisor vendor in a situation where
one of these components of your stack is suspected, you can expect your SR to
be moved to an appropriate group.  It's not "one throat to choke," it's a
hydra. ;-)

I'm a fan of virtualization in principle, but like any
platform/infrastructure decision, it's not a one-size-fits-all solution.
The licensing issue, already discussed in the thread, is a legit concern
with VMware, but people also find a way to either live with it or work
around it. Performance *could* be an issue, but you really don't know that
until you run some tests with your workloads and can quantify the
differences. Then you (as an organization, not Kevin ;-) get to decide if
those differences are significant enough to impose the operational overhead
of introducing an exception to your strategic direction.  Your best bet is
to be prepared for a data-driven discussion.  :)
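
A minimal sketch of what such a comparison could look like, assuming you
have already collected per-request response times (in milliseconds) for the
same workload on a physical host and on a VM. The function names and the
sample figures are hypothetical and only illustrate the idea; they are not
measurements from anyone on this thread.

```python
import statistics


def summarize(samples_ms):
    """Return the median and 95th-percentile latency of timings given in ms."""
    ordered = sorted(samples_ms)
    # Nearest-rank 95th percentile, computed with integer arithmetic:
    # rank = ceil(0.95 * n), but never below 1.
    rank = max(1, (95 * len(ordered) + 99) // 100)
    return {"median": statistics.median(ordered), "p95": ordered[rank - 1]}


def relative_overhead(physical_ms, virtual_ms):
    """Percent change of the VM run relative to the physical run, per metric."""
    phys = summarize(physical_ms)
    virt = summarize(virtual_ms)
    return {key: 100.0 * (virt[key] - phys[key]) / phys[key] for key in phys}


if __name__ == "__main__":
    # Hypothetical timings for one workload, run once on each platform.
    physical = [10.0, 11.0, 10.5, 12.0, 10.2]
    virtual = [10.4, 11.6, 11.0, 12.9, 10.6]
    print(relative_overhead(physical, virtual))
```

Looking at the median and a high percentile together matters here, because
a noisy shared host tends to show up in the tail before it moves the median.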

This is coming off sounding a bit lecture-y and jerky, and I apologize. I
blame email, lack of coffee, and typing with thumbs. ;) Good luck with the
decision/exploration.

On Friday, August 22, 2014, Justin Mungal <justin@xxxxxxx> wrote:

If you open an SR and Oracle thinks it's the hypervisor, they will tell
you to reproduce the issue in a non-virtualized environment in order to
continue getting support. This has never happened to me, but we don't have
that many virtualized systems running Oracle.


On Thu, Aug 21, 2014 at 5:14 PM, Kevin Lidh <kevin.lidh@xxxxxxxxx> wrote:

As a DBA, I never wanted to work on Oracle on VMware but it seems to be
the trend.  Now that I’m a manager, I’m looking to propose moving to RAC
for HA and also back to physical machines.  Since this goes against the
strategic direction of our organization, I’m sure I’ll be asked why we
can’t do RAC on VMs.  I have my personal opinions about this but I was
wondering what the broader audience of experts believe.



Factors I’m considering are:

1)      Servers closer to the storage for performance.  In
virtualization, you have an intermediary processing your requests and
responses.

2)      Access to all resources licensed.  We keep a certain percentage
of our hosts free to handle the load in case one in the cluster fails.
With RAC, you have access to all the resources all the time.  And since you
have to pay for it all anyway, I see that as a good thing.

3)      Performance in general.  I don’t have any evidence but I can’t
believe that another layer between my OS calls and the hardware could be as
fast as not having that layer.



Thanks in advance,



Kevin




--
//www.freelists.org/webpage/oracle-l

