Re: RAC interconnect packet size??

  • From: Riyaj Shamsudeen <riyaj.shamsudeen@xxxxxxxxx>
  • To: Mark.Bobak@xxxxxxxxxxxx
  • Date: Wed, 22 Apr 2009 11:16:13 -0500

AFAIK, the LMS processes don't break up the packets. As Greg pointed out, you
would see increased kernel-mode CPU usage if a lot of packet assembly were
going on. CPU usage on the network router or port would be higher too.
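
To put rough numbers on the packet assembly work, here is a minimal sketch of how many IPv4 packets one UDP datagram is split into at a given MTU. The 8192-byte payload is an assumption (an 8K block with no allowance for Oracle's own message header), and ip_fragments is an illustrative name, not anything from Oracle or the OS:

```python
import math

IPV4_HEADER = 20  # bytes, assuming no IP options
UDP_HEADER = 8    # bytes

def ip_fragments(udp_payload: int, mtu: int) -> int:
    """Approximate number of IPv4 packets needed for one UDP datagram."""
    datagram = udp_payload + UDP_HEADER   # what IP has to carry
    room = mtu - IPV4_HEADER              # IP payload room in one packet
    if datagram <= room:
        return 1                          # fits, no fragmentation
    # Every fragment except the last carries a multiple of 8 bytes,
    # so split into chunks of the largest such size.
    per_fragment = room // 8 * 8
    return math.ceil(datagram / per_fragment)

# Assumed 8K block payload at a standard and a jumbo MTU.
for mtu in (1500, 9000):
    print(mtu, ip_fragments(8192, mtu))
```

Under those assumptions the transfer is split into six packets at MTU 1500 and goes out as a single packet at MTU 9000. The splitting and reassembly happen in the IP stack (kernel mode), not in LMS.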

  LMS processes tend to use more CPU when global cache traffic is higher.
The performance of the LMS processes is critical for decent GC performance,
which is exactly why LMS processes run at RT priority from 10gR2 onwards.
Further to the point, higher CPU usage can affect LGWR performance, which in
turn has a negative effect on GC performance.

  If the CPU usage is mostly from LMS, then I guess you might have to reduce
GC traffic. Even though Cache Fusion is supposed to eliminate much of the
need for application workload partitioning, my experience is that it is still
needed. It isn't as bad as in the olden days, but some amount of workload
partitioning is still needed.


 Another area is to make sure that there are no gc blocks lost, etc.
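
One way to check is to compare two snapshots of the "gc blocks lost" statistic against the blocks received over the same interval. A minimal sketch, assuming the values were already pulled from v$sysstat into plain dicts; gc_lost_summary is a hypothetical helper and the sample numbers are made up:

```python
RECEIVED = ("gc cr blocks received", "gc current blocks received")

def gc_lost_summary(before: dict, after: dict) -> dict:
    """Delta of lost vs. received global cache blocks between snapshots."""
    lost = after["gc blocks lost"] - before["gc blocks lost"]
    received = sum(after[name] - before[name] for name in RECEIVED)
    # Guard against an idle interval with nothing received.
    lost_pct = 100.0 * lost / received if received else 0.0
    return {"lost": lost, "received": received, "lost_pct": lost_pct}

# Made-up snapshot values, for illustration only.
snap1 = {"gc blocks lost": 10,
         "gc cr blocks received": 1000,
         "gc current blocks received": 2000}
snap2 = {"gc blocks lost": 15,
         "gc cr blocks received": 1600,
         "gc current blocks received": 2400}

print(gc_lost_summary(snap1, snap2))
```

A lost count that keeps climbing between snapshots points at the interconnect or network stack rather than at LMS.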

 Of course, some of these concepts are documented here:


http://orainternals.files.wordpress.com/2008/03/battle-of-the-nodes-rac-performance-myths_riyaj_doc.pdf

http://orainternals.files.wordpress.com/2008/03/battle-of-the-nodes-rac-performance-myths_riyaj_ppt.pdf


Cheers

Riyaj Shamsudeen
Principal DBA,
Ora!nternals -  http://www.orainternals.com
Specialists in Performance, Recovery and EBS11i
Blog: http://orainternals.wordpress.com


On Wed, Apr 22, 2009 at 9:04 AM, Bobak, Mark <Mark.Bobak@xxxxxxxxxxxx> wrote:

> Hi Tanel,
>
> I hear what you're saying, but, because the LMS processes were by far the
> biggest CPU hogs, I was thinking that the overhead of breaking down and
> reassembling packets was the primary cause of CPU starvation.
>
> As I said, we're currently in "wait and see" mode, hoping that we've seen
> the last of these events.  Obviously, if I see more CPU starvation, I'll
> have to re-think the root cause.  But, as I mentioned before, enabling jumbo
> frames is the "right" thing to do, and there's really no downside, so....
>
> Anyhow, we'll see what happens.
>
> -Mark
>
> -----Original Message-----
> From: Tanel Poder [mailto:tanel@xxxxxxxxxx]
> Sent: Wednesday, April 22, 2009 7:33 AM
> To: Bobak, Mark; 'Greg Rahn'
> Cc: TESTAJ3@xxxxxxxxxxxxxx; oracle-l@xxxxxxxxxxxxx
> Subject: RE: RAC interconnect packet size??
>
> I haven't read the whole thread, but if you're troubleshooting high waits
> while having CPU starvation, there's one thing to remember. When you have
> serious CPU starvation with large CPU run queues (long scheduling latency),
> then whatever increased waits you see may just be symptoms of CPU starvation
> (because of scheduling latency, it takes longer for an Oracle process to get
> onto CPU to execute the "wait end" function).
>
> Of course, if your LMS processes normally don't use as much CPU, you might
> have something there. Otherwise, just see what all these sessions that are
> trying to be on CPU are doing (from ASH or some other form of v$session
> history) and how it differs from a normal situation: for example, whether
> the exec plan of the prevalent SQL_ID executed then is the same as usual.
>
> Tanel.
>
> > -----Original Message-----
> > From: oracle-l-bounce@xxxxxxxxxxxxx
> > [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of Bobak, Mark
> > Sent: 22 April 2009 14:26
> > To: Greg Rahn
> > Cc: TESTAJ3@xxxxxxxxxxxxxx; oracle-l@xxxxxxxxxxxxx
> > Subject: RE: RAC interconnect packet size??
> >
> > Hi Greg,
> >
> > I agree.  Allow me to describe what we were seeing:
> >
> >  - CPU spikes w/ run queues going into the 20s, very low or
> > no %wait for I/O, 0% idle
> >  - Looking at V$SESSION_WAIT, lots of waits on gc wait events
> >  - up to four LMS processes, burning CPU like crazy
> >
> > (all this on a three node RAC of DL-585s, 4 dual core CPUs per node)
> >
> > The above seemed to be consistent with a system w/ a busy
> > interconnect and no jumbo frames configured.
> >
> > Only time will tell whether enabling jumbo frames actually
> > solved the problem.
> >
> > One other thing, assuming that all your hardware (all NICs
> > and interconnect switches) supports a jumbo frame
> > configuration, there should really be no downside to enabling them.
> >
> > -Mark
