RE: RAC interconnect packet size??

  • From: "Bobak, Mark" <Mark.Bobak@xxxxxxxxxxxx>
  • To: Tanel Poder <tanel@xxxxxxxxxx>, 'Greg Rahn' <greg@xxxxxxxxxxxxxxxxxx>
  • Date: Wed, 22 Apr 2009 10:04:50 -0400

Hi Tanel,

I hear what you're saying, but, because the LMS processes were by far the 
biggest CPU hogs, I was thinking that the overhead of breaking down and 
reassembling packets was the primary cause of CPU starvation.  
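
For a rough sense of the fragmentation cost, here is a minimal Python sketch of how many IP fragments one block transfer needs at each MTU. It assumes an 8 KB block sent as a single UDP datagram over IPv4 with no IP options, and it ignores any Oracle protocol header, so the figures are illustrative only. `ip_fragments` is a hypothetical helper, not anything from Oracle.

```python
import math

IP_HEADER = 20   # bytes, IPv4 header without options (assumption)
UDP_HEADER = 8   # bytes

def ip_fragments(payload_bytes, mtu):
    """Number of IP fragments needed to carry one UDP datagram."""
    datagram = payload_bytes + UDP_HEADER        # what IP has to carry
    if datagram + IP_HEADER <= mtu:
        return 1                                 # fits in a single frame
    # Every fragment except the last carries a multiple of 8 data bytes.
    per_fragment = (mtu - IP_HEADER) // 8 * 8
    return math.ceil(datagram / per_fragment)

# Assumed 8 KB database block; substitute your own block size.
for mtu in (1500, 9000):
    print(mtu, ip_fragments(8192, mtu))
```

Under these assumptions an 8 KB block becomes six fragments at MTU 1500 and travels as a single frame at MTU 9000.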

As I said, we're currently in "wait and see" mode, hoping that we've seen the 
last of these events.  Obviously, if I see more CPU starvation, I'll have to 
re-think the root cause.  But, as I mentioned before, enabling jumbo frames is 
the "right" thing to do, and there's really no downside, so....

Anyhow, we'll see what happens.

-Mark

-----Original Message-----
From: Tanel Poder [mailto:tanel@xxxxxxxxxx] 
Sent: Wednesday, April 22, 2009 7:33 AM
To: Bobak, Mark; 'Greg Rahn'
Cc: TESTAJ3@xxxxxxxxxxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: RE: RAC interconnect packet size??

I haven't read the whole thread, but if you're troubleshooting high waits
during CPU starvation there's one thing to remember. When you have
serious CPU starvation with large CPU runqueues (long scheduling latency)
then whatever increased waits you see may just be symptoms of CPU starvation
(because of scheduling latency it takes longer for an Oracle process to get
onto CPU to execute the "wait end" function). 

Of course, if your LMS processes normally don't use as much CPU you might
have something there. Otherwise just see what all the sessions that are
trying to get onto CPU are doing (from ASH or some other form of v$session
history) and how that differs from a normal situation, e.g. whether the exec
plan of the prevalent SQL_ID executed at the time is the same as usual.
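
One way to do that comparison, as a minimal Python sketch: take ASH-style samples from a normal window and from the problem window, and see which SQL_IDs changed their share of ON CPU samples the most. The input format (pairs of `sql_id` and `session_state`) and both function names are assumptions for illustration; how the samples get extracted from ASH is left out.

```python
from collections import Counter

def cpu_share_by_sql(samples):
    """samples: iterable of (sql_id, session_state) pairs from ASH-style data.

    Returns each SQL_ID's fraction of the samples that were ON CPU.
    """
    on_cpu = Counter(sql_id for sql_id, state in samples if state == "ON CPU")
    total = sum(on_cpu.values())
    if total == 0:
        return {}
    return {sql_id: n / total for sql_id, n in on_cpu.items()}

def biggest_shifts(baseline, problem, top=5):
    """SQL_IDs whose ON CPU share moved most between the two windows."""
    base = cpu_share_by_sql(baseline)
    prob = cpu_share_by_sql(problem)
    deltas = {s: prob.get(s, 0.0) - base.get(s, 0.0)
              for s in set(base) | set(prob)}
    # Largest absolute change first, whether up or down.
    return sorted(deltas.items(), key=lambda kv: abs(kv[1]), reverse=True)[:top]
```

A SQL_ID that shows up near the top with a large positive shift is the one whose exec plan is worth checking first.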

Tanel.

> -----Original Message-----
> From: oracle-l-bounce@xxxxxxxxxxxxx 
> [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of Bobak, Mark
> Sent: 22 April 2009 14:26
> To: Greg Rahn
> Cc: TESTAJ3@xxxxxxxxxxxxxx; oracle-l@xxxxxxxxxxxxx
> Subject: RE: RAC interconnect packet size??
> 
> Hi Greg,
> 
> I agree.  Allow me to describe what we were seeing:
> 
>  - CPU spikes w/ run queues going into the 20s, very low or 
> no %wait for I/O, 0% idle
>  - Looking at V$SESSION_WAIT, lots of waits on gc wait events
>  - up to four LMS processes, burning CPU like crazy
> 
> (all this on a three node RAC of DL-585s, 4 dual core CPUs per node)
> 
> The above seemed to be consistent with a system w/ a busy 
> interconnect and no jumbo frames configured.
> 
> Only time will tell whether enabling jumbo frames actually 
> solved the problem.
> 
> One other thing, assuming that all your hardware (all NICs 
> and interconnect switches) supports a jumbo frame 
> configuration, there should really be no downside to enabling them.
> 
> -Mark



--
//www.freelists.org/webpage/oracle-l

