RE: cpu average load

  • From: <Paula_Stankus@xxxxxxxxxxxxxxx>
  • To: <niall.litchfield@xxxxxxxxx>, <cary.millsap@xxxxxxxxxx>
  • Date: Fri, 3 Dec 2004 23:05:28 -0500

Cary,

I am sorry it took a while for me to answer this question.  I have been
in implementations lately.  I truly apologize.

Cary, what you find simple and implement frequently will take me a while
to catch on to - I am still reading your book as I can.

The question started when my management asked me to deploy certain big
brother monitors on the system - it appealed to them for various reasons
which is a completely different discussion.

Now we are getting warning messages regarding cpu average load.  I am
thinking of upping the thresholds on these warnings since no one has
complained about performance and frankly getting these messages
interferes with our ability to monitor "true" problems on the system -
but then again users sometimes live with bad performance and the
information never gets passed along - IMHO.

I don't mean to take anyone's time in answering an abstract question.  I
just wanted a general understanding of what this was measuring, what
impact it could have before I went ahead and changed it.  I was looking
for a good place to start to gain a better understanding of what this
measures.  For example, I commonly look at TOP to see how much CPU a
process is using.  It is very easy to tell from that which processes are
consuming the most CPU on the system, how much CPU (approximately) and
for how long.  When I get error messages from OEM regarding CPU I can
run TOP and trace it directly back to a particular process many times.
Then I can proceed with more in-depth tracing.  However, if I am getting
warnings, errors and e-mails about average CPU load then I am not
completely clear what that is measuring.
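
For reference, here is a minimal sketch of one way to look at the number
behind these warnings, assuming a Unix-like host with Python available.  On
Unix systems the load average is generally a damped average of the number
of processes running or waiting to run over the last 1, 5 and 15 minutes
(Linux also counts processes in uninterruptible sleep, typically waiting on
I/O), so it is usually read relative to the number of CPUs.  The helper
name load_per_cpu is made up for illustration and is not part of any
monitoring product.

```python
import os


def load_per_cpu(load_averages, cpu_count):
    """Divide each load average by the CPU count.

    A result near or above 1.0 suggests that, on average, there were at
    least as many runnable processes as CPUs over that interval.
    """
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return tuple(load / cpu_count for load in load_averages)


if __name__ == "__main__":
    # os.getloadavg() returns the 1-, 5- and 15-minute load averages (Unix only).
    one, five, fifteen = load_per_cpu(os.getloadavg(), os.cpu_count() or 1)
    print(f"load per CPU: 1m={one:.2f} 5m={five:.2f} 15m={fifteen:.2f}")
```

Whether a given value is a problem still depends on the workload; the
sketch only puts the raw figure on a per-CPU scale.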

In my simple mind I think that looking at overall resource utilization
on a box is a good place to start if you are seeing things slowing down
(as a whole) then drilling down from there.  Also, proactively
monitoring system resource utilization on a regular basis if you are
supporting a number of databases operationally has proven useful to me.
That is what these overall monitoring processes are for - just to show
unusual activity.  That is why I was asking - where can I start finding
out what is usual or unusual average CPU load?

Cary, when you say:

"The amount of response time that process preemptions are costing your
performance is measured as the amount of response time in an extended
SQL trace file that is not accounted for by the sum of your file's c
values at recursive depth zero, plus the sum of your file's ela values."

That does not seem to answer my question.  Certainly, I shouldn't have to
start by running extended SQL traces on everything running on my system
when these warnings occur.  For example, that might require an extended
SQL trace of multiple OLTP systems with 100+ users.  Shouldn't I be able
to discern something from this information at a higher level?
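
The accounting in the quoted sentence above can be written as simple
arithmetic.  The sketch below assumes the three totals have already been
extracted from one extended SQL trace file and converted to seconds; the
function name and the sample figures are invented for illustration and do
not come from any real trace.

```python
def unaccounted_for_time(response_time, cpu_depth_zero, total_ela):
    """Response time not explained by CPU (c at depth zero) plus waits (ela).

    All three arguments are assumed to be in the same unit (seconds here).
    """
    return response_time - (cpu_depth_zero + total_ela)


if __name__ == "__main__":
    # Hypothetical totals, not taken from a real trace file.
    gap = unaccounted_for_time(response_time=10.0, cpu_depth_zero=6.0, total_ela=3.0)
    print(f"unaccounted-for time: {gap:.1f} s")
```

This only restates the quoted definition; it says nothing about how the
totals are collected, which is the part that requires tracing.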



-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of Niall Litchfield
Sent: Wednesday, December 01, 2004 8:28 AM
To: cary.millsap@xxxxxxxxxx
Cc: oracle-l@xxxxxxxxxxxxx
Subject: Re: cpu average load

On Tue, 30 Nov 2004 10:59:11 -0600, Cary Millsap
<cary.millsap@xxxxxxxxxx> wrote:
> I disagree that this advice is difficult to implement in practice,
> because I implement it in practice frequently.

I disagree with Mladen for a somewhat different reason (i.e., I don't care
about ease of use here). It seems to me this discussion springs from a
technical question that may or may not be worth answering.
Paula's question was along the lines of 'how can I tell if my server is
being utilized efficiently'. One possible answer to this is 'Who
cares?'. Now if the question is being asked because there is an ongoing
discussion about buying new hardware, or the transactional capacity of
the system is apparently not good enough for the business needs of the
state then there is a real business problem to investigate.

iff there is a real problem to be investigated, then it doesn't really
matter how easy or hard it is to get the correct answer (unless the cost
of obtaining the answer is higher than the cost of not answering), it is
the correct answer that you require.

So I'd be taking a step back and asking Paula to define *why* she is
investigating the amount of the CPU capacity of her machines that Oracle
is using. If you can express that in clear business terms then you can
go down the profiling route (or any other method you think appropriate).


BTW, in this particular case, my money would be on unaccounted-for time
being a better measurement of time spent being preempted than the kernel
mode time consumed by the whole system, but I'm willing to be proven
wrong.



--
Niall Litchfield
Oracle DBA
http://www.niall.litchfield.dial.pipex.com
--
//www.freelists.org/webpage/oracle-l


