RE: Higher CPU Utilisation on failover node under same workload

  • From: Iggy Fernandez <iggy_fernandez@xxxxxxxxxxx>
  • To: "Chris.Osborne@xxxxxxxxx" <chris.osborne@xxxxxxxxx>, "oracle-l@xxxxxxxxxxxxx" <oracle-l@xxxxxxxxxxxxx>
  • Date: Thu, 6 Nov 2014 07:39:39 -0800

To definitively prove or disprove the hypothesis that the primary and standby 
have the same configuration, you could create RDA collections on both nodes and 
compare them using the diff option.
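
Alternatively, a lighter-weight cross-check (not the RDA diff itself, just a 
sketch you would run on each node and then diff the spool files) is to dump the 
non-default parameters from V$SYSTEM_PARAMETER:

-- Sketch only: run on each node as the same user; change the spool file name
-- per node, then diff params_nodeA.lst against params_nodeB.lst at the OS level.
set pagesize 0 linesize 200 trimspool on
spool params_nodeA.lst
select name || '=' || value
from   v$system_parameter
where  isdefault = 'FALSE'
order  by name;
spool off
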
Iggy


From: Chris.Osborne@xxxxxxxxx
To: iggy_fernandez@xxxxxxxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: RE: Higher CPU Utilisation on failover node under same workload
Date: Thu, 6 Nov 2014 14:43:11 +0000

Hi Iggy,

 
There’s a definite pattern where we only see CPU time > number of cores when we 
are on the other server.

I only included a single AWR report for brevity, but I’ve got loads of examples 
going back for a wee while, and we only see the issue when we are on the 
failover node.

We definitely do have large variations in load across the day, but it’s 
predictable when we’ll be busy and when we are not, as the application has been 
here for a while and the batch schedule (including daytime batch) is fairly 
well understood.

 
As I said, this is a bit of a head-scratcher for me, as I can’t see much going 
on differently other than some pieces of CPU-bound SQL taking longer to execute, 
and the SYS CPU time being higher.
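
One way to quantify the user vs SYS split over time is to trend it from AWR. 
This is just a sketch against DBA_HIST_OSSTAT (assumes Diagnostics Pack 
licensing; the OSSTAT counters are cumulative centiseconds, so it takes deltas 
between snapshots):

with os as (
  select o.snap_id,
         s.end_interval_time,
         o.stat_name,
         -- deltas between consecutive snapshots; the first row after an
         -- instance restart will be NULL or negative and can be ignored
         o.value - lag(o.value) over (partition by o.stat_name, o.instance_number
                                      order by o.snap_id) as delta
  from   dba_hist_osstat o
         join dba_hist_snapshot s
           on  s.snap_id         = o.snap_id
           and s.dbid            = o.dbid
           and s.instance_number = o.instance_number
  where  o.stat_name in ('USER_TIME', 'SYS_TIME', 'BUSY_TIME', 'IDLE_TIME')
)
select snap_id,
       end_interval_time,
       round(100 * max(case when stat_name = 'USER_TIME' then delta end)
             / nullif(  max(case when stat_name = 'BUSY_TIME' then delta end)
                      + max(case when stat_name = 'IDLE_TIME' then delta end), 0), 1) as pct_user,
       round(100 * max(case when stat_name = 'SYS_TIME'  then delta end)
             / nullif(  max(case when stat_name = 'BUSY_TIME' then delta end)
                      + max(case when stat_name = 'IDLE_TIME' then delta end), 0), 1) as pct_sys
from   os
group  by snap_id, end_interval_time
order  by snap_id;
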
 
Regards,

 
Chris
Christopher Osborne
Lead Technical Specialist, Performance Engineering
British Sky Broadcasting
Email: chris.osborne@xxxxxxxxx
Desk: +44 1506 325069 | Mobile: +44 7720 308941
Please note new Mobile number.

From: Iggy Fernandez [mailto:iggy_fernandez@xxxxxxxxxxx]
Sent: 06 November 2014 14:32
To: Osborne, Chris; oracle-l@xxxxxxxxxxxxx
Subject: RE: Higher CPU Utilisation on failover node under same workload

Hi, Chris,

My gut reaction is that you almost certainly have large variations over time on 
your production system too, so I am not surprised that there was a significant 
difference when you compared one sample from the primary with one sample from 
the standby (after switchover). You can write queries against the AWR tables to 
print the workload over an extended period of time. I would be extremely 
surprised if you did not see equal or greater variation on the primary over a 
period of time.
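
For example, something along these lines (a minimal sketch against 
DBA_HIST_SYS_TIME_MODEL and DBA_HIST_SNAPSHOT; assumes Diagnostics Pack 
licensing) prints DB time and DB CPU per snapshot so you can see the shape of 
the workload:

with tm as (
  select t.snap_id,
         s.end_interval_time,
         t.stat_name,
         -- values are cumulative microseconds, so take deltas between snapshots
         t.value - lag(t.value) over (partition by t.stat_name, t.instance_number
                                      order by t.snap_id) as delta_us
  from   dba_hist_sys_time_model t
         join dba_hist_snapshot s
           on  s.snap_id         = t.snap_id
           and s.dbid            = t.dbid
           and s.instance_number = t.instance_number
  where  t.stat_name in ('DB time', 'DB CPU')
)
select snap_id,
       end_interval_time,
       round(max(case when stat_name = 'DB time' then delta_us end) / 1e6, 1) as db_time_s,
       round(max(case when stat_name = 'DB CPU'  then delta_us end) / 1e6, 1) as db_cpu_s
from   tm
group  by snap_id, end_interval_time
order  by snap_id;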


 


Iggy


 


 

> From: Chris.Osborne@xxxxxxxxx

> To: oracle-l@xxxxxxxxxxxxx

> Subject: Higher CPU Utilisation on failover node under same workload

> Date: Wed, 5 Nov 2014 13:20:10 +0000

> 

> Hi all,

> 

> This is my first post.

> 

> I have an ongoing issue where I am seeing much increased CPU utilisation when 
> a database is running on the failover node, compared to when it is running on 
> the primary node.

> When we perform OS patching we fail over to the DR site while the primary 
> site is being patched.

> Both hosts are the same spec and config, and the database is configured 
> identically on both hosts too.

> 

> AWR Diff reports show that the workload is very similar.

> 

> The 2nd period is where we see the problem.

> 

> 

> Top Timed Events, 1st period:
> 
> Event                           Wait Class     Waits       Time(s)   Avg Time(ms)  %DB time
> ------------------------------  -------------  ----------  --------  ------------  --------
> db file sequential read         User I/O        4,178,196  23,392.4           5.6      61.7
> CPU time                        N/A                         10,138.9           N/A      26.8
> read by other session           User I/O          325,114   1,866.3           5.7       4.9
> db file parallel read           User I/O          177,766   1,419.6           8.0       3.7
> enq: TX - row lock contention   Application         1,220   1,321.2        1083.0       3.5
> 
> Top Timed Events, 2nd period:
> 
> Event                           Wait Class     Waits       Time(s)   Avg Time(ms)  %DB time
> ------------------------------  -------------  ----------  --------  ------------  --------
> CPU time                        N/A                         38,985.7           N/A      59.8
> db file sequential read         User I/O        4,581,489  23,083.5           5.0      35.4
> db file parallel read           User I/O          219,007   1,670.0           7.6       2.6
> read by other session           User I/O          246,088   1,307.7           5.3       2.0
> enq: TX - row lock contention   Application           651     618.6         950.2       0.9

> 

> 

> Host Configuration Comparison

> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

>                                              1st          2nd         Diff     %Diff
> -----------------------------------  -----------  -----------  -----------  --------
> Number of CPUs:                              256          256            0       0.0
> Number of CPU Cores:                          32           32            0       0.0
> Number of CPU Sockets:                         4            4            0       0.0
> Physical Memory:                         261632M      261632M           0M       0.0
> Load at Start Snapshot:                    32.61        57.98        25.37      77.8
> Load at End Snapshot:                      33.13        63.91        30.78      92.9
> %User Time:                                 6.03         4.95        -1.07     -17.9
> %System Time:                               4.82        15.19        10.37     215.1
> %Idle Time:                                89.15        79.86        -9.29     -10.4
> %IO Wait Time:                                 0            0            0       0.0

> Cache Sizes

> 

> I know that we have a problem with the size of the connection pools on this 
> database, and the fact that they are dynamic also concerns me. This issue is 
> being worked on.

> My first thought is that the fact that the single-block read time is 10% faster 
> could mean that more of the sessions are runnable at any given point, slowing 
> us down through context switching, but this may be a stretch...

> I am seeing the host report more CPU time spent in SYS rather than user time, 
> though.
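> 
> One way I could sanity-check the runnable-sessions theory might be to count 
> ON CPU sessions per ASH sample and compare that against the 32 cores. A rough 
> sketch (Diagnostics Pack assumed):
> 
> -- average and peak number of sessions sampled ON CPU per AWR snapshot
> select snap_id,
>        round(avg(on_cpu), 1) as avg_sessions_on_cpu,
>        max(on_cpu)           as max_sessions_on_cpu
> from  (select snap_id, sample_id, count(*) as on_cpu
>        from   dba_hist_active_sess_history
>        where  session_state = 'ON CPU'
>        group  by snap_id, sample_id)
> group  by snap_id
> order  by snap_id;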

> 

> Any advice/pointers would be gratefully received.

> 

> Cheers

> 

> Chris

> 

> 

> Christopher Osborne
