RE: Higher CPU Utilisation on failover node under same workload

  • From: "Mark W. Farnham" <mwf@xxxxxxxx>
  • To: "'Osborne, Chris'" <Chris.Osborne@xxxxxxxxx>, <fuzzy.graybeard@xxxxxxxxx>, <oracle-l@xxxxxxxxxxxxx>
  • Date: Fri, 7 Nov 2014 05:12:39 -0500

One NULL hypothesis down then.

 

Okay, so the next up would be an asymmetry in the way the storage is mounted
or used by the two different nodes, or a difference in some parameter
settings that are read from storage local to each node.

 

I'm thinking of something like the network pathways to the storage, or
various OS-level system parameters that might enable less CPU-intensive
reads on one node than on the other.

If some big stuff is being dragged into the buffer cache on one system but
is read with direct path reads into the PGA on the other, that could change
CPU utilization.

 

Apart from Oracle, are there any parameter files regarding the storage that
live on local storage on the nodes? Not knowing your storage complex at all,
that is a fishing expedition. Since the whole storage is moved, I *think* the
difference must be rooted either in something like the wiring to the storage
or in a parameter file stored locally.
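
One way to make that fishing expedition more mechanical is to diff the
node-local settings directly. The following is only a minimal Python sketch,
assuming the settings from each node have already been dumped to simple
"name = value" text. The function names and the setting names in the example
are hypothetical placeholders, not real settings from any particular storage
stack.

```python
def parse_settings(text):
    """Parse 'name = value' lines into a dict, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


def diff_settings(node_a, node_b):
    """Return {name: (value_on_a, value_on_b)} for every setting that differs.

    A setting missing on one node is reported with None on that side.
    """
    diffs = {}
    for name in sorted(set(node_a) | set(node_b)):
        a, b = node_a.get(name), node_b.get(name)
        if a != b:
            diffs[name] = (a, b)
    return diffs


if __name__ == "__main__":
    # Hypothetical dumps; real input would come from each node's local files.
    a = parse_settings("io_policy = round-robin\nqueue_depth = 32\n")
    b = parse_settings("io_policy = round-robin\nqueue_depth = 8\nreadahead = 0\n")
    for name, (va, vb) in diff_settings(a, b).items():
        print(f"{name}: node a={va!r} node b={vb!r}")
```

Anything this reports is only a candidate: a differing setting still has to
be shown to affect the CPU cost of the I/O path.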

 

Since it is the whole storage being remounted, that rules out a lot of
possibilities such as parameter file differences in Oracle. Even the double
underbar stuff will be identical because it is the same file (unless your
parameter files are on local storage).

 

JL put up an evolving laundry list of "Nothing changed, so why is it
different?" issues on his blog and solicited feedback from the group a while
back. That is worth a look. Applying the thought "Why guess when you can
know?" is difficult when the scope of the change is the whole system. Is it
possible to narrow the differential CPU burn to something more specific?
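
As a rough illustration of narrowing the difference, the Python sketch below
ranks categories by how much more CPU one node burned than the other. It
assumes you can already collect CPU seconds per category (for example per
statement or per process) over comparable intervals on each node. How those
numbers are gathered is left open, and the labels used are hypothetical.

```python
def rank_cpu_deltas(node_a, node_b, top=5):
    """Rank categories by how much more CPU node_b burned than node_a.

    node_a, node_b: dicts mapping a category label (for example a statement
    id or a process name) to CPU seconds over comparable intervals.
    Returns (label, cpu_a, cpu_b, delta) tuples, largest delta first.
    """
    rows = []
    for label in set(node_a) | set(node_b):
        cpu_a = node_a.get(label, 0.0)
        cpu_b = node_b.get(label, 0.0)
        rows.append((label, cpu_a, cpu_b, cpu_b - cpu_a))
    # Largest extra burn first; ties are broken by label for stable output.
    rows.sort(key=lambda r: (-r[3], r[0]))
    return rows[:top]
```

If a handful of categories account for most of the delta, the problem is
narrowed; if the extra CPU is spread evenly across everything, that points
back toward something system-wide such as the I/O path.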

 

Good luck, and thanks for the kind words even though my previous suggestion
was not your solution,

 

mwf

 

From: Osborne, Chris [mailto:Chris.Osborne@xxxxxxxxx] 
Sent: Friday, November 07, 2014 4:11 AM
To: Mark W. Farnham; fuzzy.graybeard@xxxxxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: RE: Higher CPU Utilisation on failover node under same workload

 

Hi Mark,

 

Thanks for that. I had considered that, but when we shut the instance down
on the failover node, and start up again on the other node we do not see the
problem. 

If this was an issue of 'warming up' the DB, we would see it regardless of
which direction we were moving. Additionally, we don't see the issue being
alleviated when the system has been up for a few days. 

As I said it's a two node VCS cluster so when we fail over it's a database
shutdown on one node, take the storage offline on that node, bring the
storage up on the other node and start the instance. This is true regardless
of whether we're going from node a to node b, or vice versa. 

 

I do like your suggestion of a 'failover startup kit' in general,
though.

 

Chris

 

 

 

 

Christopher Osborne

Lead Technical Specialist, Performance Engineering

British Sky Broadcasting

Email:chris.osborne@xxxxxxxxx

Desk:  +44 1506 325069  |  Mobile:  +44 7720 308941

Please note new Mobile number. 

 


 

From: Mark W. Farnham [mailto:mwf@xxxxxxxx] 
Sent: 06 November 2014 16:49
To: Osborne, Chris; fuzzy.graybeard@xxxxxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: RE: Higher CPU Utilisation on failover node under same workload

 

I would say the NULL hypothesis is that the failover node never reaches
steady state compared to the normal production workload.

 

As such all the caching of sql, packages, and procedures that takes place in
the shared pool, java in the java pool, and data in the buffer cache is
burning CPU above the normal workload, EVEN if no extra user transaction
load is taking place.

 

A partial cure for this is included in my unwritten (likely never to be
written) book about planning for business continuation.

 

The synopsis relevant to you is: Mine your shared pool for stored procedures
and all the read-only queries. (Planning for updating rows using canonical
special values should only be attempted at the postgraduate level). Mine
your buffer cache for slowly changing objects. Build yourself a failover
startup kit that runs those procedures as soon as you start the database but
before you turn the users loose. Do all the SYS and SYSTEM owned packages
and procedures first. Remember to do the double (or more) hit and to
implement something to avoid direct read for things you want to stay in the
buffer cache.
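
The ordering part of such a kit could be sketched roughly as below. This is
only an illustrative Python outline under stated assumptions: the (owner,
statement) items are assumed to have been mined beforehand, `execute` is a
hypothetical caller-supplied callable that runs one statement against the
database, and the "double hit" is interpreted here as running each item
twice. Avoiding direct read is not addressed by this sketch.

```python
PRIORITY_OWNERS = ("SYS", "SYSTEM")


def build_warmup_plan(items, hits=2):
    """Order warm-up items and repeat each one `hits` times.

    items: iterable of (owner, statement) pairs mined beforehand.
    SYS- and SYSTEM-owned items come first; otherwise the original order is
    kept, because sorted() is stable.
    """
    ordered = sorted(
        items, key=lambda item: item[0].upper() not in PRIORITY_OWNERS
    )
    plan = []
    for owner, statement in ordered:
        plan.extend([(owner, statement)] * hits)
    return plan


def run_warmup(plan, execute):
    """Run each planned statement through `execute` and collect failures."""
    failures = []
    for owner, statement in plan:
        try:
            execute(statement)
        except Exception as exc:
            # Keep going: one bad item should not stop the whole warm-up.
            failures.append((owner, statement, str(exc)))
    return failures
```

The returned failure list is meant to be reviewed before the users are turned
loose, since a statement that no longer runs is not warming anything.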

 

Hmm. I wonder if I can make a 45 minute presentation out of that.

 

mwf

 

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Osborne, Chris
Sent: Thursday, November 06, 2014 10:40 AM
To: fuzzy.graybeard@xxxxxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: RE: Higher CPU Utilisation on failover node under same workload

 

Immediate. And yes it completes successfully. 

 

Chris

 

 

 


 

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Hans Forbrich
Sent: 06 November 2014 14:53
To: oracle-l@xxxxxxxxxxxxx
Subject: Re: Higher CPU Utilisation on failover node under same workload

 

On 06/11/2014 3:13 AM, Osborne, Chris wrote:

> When we fail over, it's a database shutdown,

Just to confirm - what 'kind' of shutdown are you using, and if a 'clean'
one, does it complete?

/Hans

Information in this email including any attachments may be privileged,
confidential and is intended exclusively for the addressee. The views
expressed may not be official policy, but the personal views of the
originator. If you have received it in error, please notify the sender by
return e-mail and delete it from your system. You should not reproduce,
distribute, store, retransmit, use or disclose its contents to anyone.
Please note we reserve the right to monitor all e-mail communication through
our internal and external networks. SKY and the SKY marks are trademarks of
British Sky Broadcasting Group plc and Sky International AG and are used
under licence. British Sky Broadcasting Limited (Registration No. 2906991),
Sky-In-Home Service Limited (Registration No. 2067075) and Sky Subscribers
Services Limited (Registration No. 2340150) are direct or indirect
subsidiaries of British Sky Broadcasting Group plc (Registration No.
2247735). All of the companies mentioned in this paragraph are incorporated
in England and Wales and share the same registered office at Grant Way,
Isleworth, Middlesex TW7 5QD. 

