RE: SQL*Net message waits

  • From: DENNIS WILLIAMS <DWILLIAMS@xxxxxxxxxxxxx>
  • To: "'oracle-l@xxxxxxxxxxxxx'" <oracle-l@xxxxxxxxxxxxx>
  • Date: Tue, 16 Mar 2004 16:39:23 -0600

Robyn
   I think Cary's book mentions that connection pooling can cause an
additional level of difficulty in relating user actions to Oracle traces. Do
you have a test system which you could get pretty much to yourself and a
user/developer? I would think that you might be able to learn how the
application/connection pool/database works together. Then you could go to
the production server and know what to look for. Just an idea that I've
considered.

Dennis Williams
DBA
Lifetouch, Inc.
dwilliams@xxxxxxxxxxxxx 

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx]On Behalf Of Robyn Anderson Sands
Sent: Tuesday, March 16, 2004 4:24 PM
To: oracle-l@xxxxxxxxxxxxx
Subject: SQL*Net message waits


Hi Mark,

I've asked the users to let me know which processes need to be evaluated
but they haven't been able to tell me thus far. Application speed is an 
issue with some users, but not others.  The only SLA I've been given is
99.98 uptime on a very buggy application (sigh).

I normally use 10046 traces on individual, longer running jobs and I
know this isn't the technically correct way to do this, but the only
question I'm being asked at the moment is 'Is there a communication
problem between the app server and the db?'

Summary of the trace file info looks like this:

call    count   cpu     elapsed disk    query   current rows
Parse   9388    2.29    2.75    0       10      0       0
Execute 21338   14.84   18.42   50      46799   12979   2733
Fetch   18272   56.16   61.66   62591   4929531 4158    217312  
Total   48998   73.29   82.83   62641   4976340 17137   220045

Event waited                    times   max     total
db file scattered read          6792    0.02    2.49
db file sequential read         2537    0.03    5.65
latch free                      47      0.02    0.12
log file sync                   5       0.03    0.06
SQL*Net message from client:    14414   28.86   804.08 *
SQL*Net message to client:      14415   0.01    0.05   *
SQL*Net more data from client:  50      0.01    0.01
SQL*Net more data to client:    4296    0.01    0.23

* These numbers are from pooled connection traces and there is lots of
user think time that would be included. That's my real question: can I use
a 10046 trace to evaluate pooled client connections?  Since the database
appears to be waiting on a client input, can I ignore the 'from client'
waits or could they be hiding other problems? 

No problems have been noted on the database or the os.  The servers are
all sized beyond the vendor specs; client hardware is another story. 

If there's a better way to monitor for connection problems between an app 
server and the database, please let me know ...

Robyn

-----Original Message-----
From: Bobak, Mark 
Sent: Tuesday, March 16, 2004 4:32 PM
To: oracle-l@xxxxxxxxxxxxx
Subject: RE: SQL*Net message waits

Robyn,

Ok, of a 30 minute window, 0.05 seconds total was attributed to SQL*Net
message waits.  How many total SQL*Net waits were in the trace?
I'm a little concerned because with the scoping error you admit is present
in your tracing, it's going to obfuscate the solution a bit.
I'm also a little concerned that "the users can't tell me exactly which
actions they believe are slow....".  That statement tells me
your system has no real measurable SLA.  Without that, how will you know
when you've succeeded?  Ok, ok, I know, it wasn't your idea
but you're stuck with it.  Which business process is the most business
critical and performing the worst.  Start there.  Try to get 
an accurately scoped trace of that process.  This will reveal much in
determining where the problem is.

Now, going back to the trace you already got:
What does the profile of the other 29 minutes 59.95 seconds look like?
How many database calls are in the trace file?
Database calls are PARSE, EXEC and FETCH.  Also, how much CPU time is
represented in that 30 minutes?  How much wait time?
What events were you waiting on?

Also, consider what's going on the in the O/S.  Are the CPUs hammered?
Are there lots of processes waiting on I/O?

Hope that helps,

-Mark


----------------------------------------------------------------
Please see the official ORACLE-L FAQ: http://www.orafaq.com
----------------------------------------------------------------
To unsubscribe send email to:  oracle-l-request@xxxxxxxxxxxxx
put 'unsubscribe' in the subject line.
--
Archives are at //www.freelists.org/archives/oracle-l/
FAQ is at //www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------
----------------------------------------------------------------
Please see the official ORACLE-L FAQ: http://www.orafaq.com
----------------------------------------------------------------
To unsubscribe send email to:  oracle-l-request@xxxxxxxxxxxxx
put 'unsubscribe' in the subject line.
--
Archives are at //www.freelists.org/archives/oracle-l/
FAQ is at //www.freelists.org/help/fom-serve/cache/1.html
-----------------------------------------------------------------

Other related posts: