RE: Database sloowww...trying to figure out why...

From: "Bobak, Mark" <Mark.Bobak@xxxxxxxxxxxxxxx>
To: <oracle-l@xxxxxxxxxxxxx>
Date: Tue, 30 Mar 2004 16:37:53 -0500
Chris,
 
What you need to do is to trace the users' sessions.  In particular, the 
sessions of the most critical business processes.  Without the ability to do 
that, I really think you're shooting in the dark.  Tracing your own session, 
which may or may not have any impact on the business critical processes, is 
futile.
 
If you absolutely cannot trace the user sessions, (why can't you, by the way?), 
then try and take some statspack snapshots.  Try to take small (10-15 minute) 
snapshots when the worst performance problem is occurring.  Also, if you have 
access to do so, looking at V$SESSION_WAIT for the users' sessions may help you 
glean something as to what's happening.
 
Hope that helps,
 
-Mark

-----Original Message-----
From: Chris Stephens [mailto:ChrisStephens@xxxxxxxxxx]
Sent: Tuesday, March 30, 2004 4:24 PM
To: oracle-l@xxxxxxxxxxxxx
Subject: Database sloowww...trying to figure out why...



Well, what can I say...more performance problems in our apps environment.  The 
outsourcing company is suggesting things like increasing buffer cache 
(seriously!).

 

This time I've successfully campaigned to alteast get perfstat installed on the 
prod db.  Here is what I've found so far.

 

After taking several snapshots it appears we have a cpu issue (sort of...it's 
weird because the slowaris box doesn't look stressed for cpa AT ALL!).  As much 
as 80% of the response time is due to cpu usage.  ...of that cpu usage the 
majority of it is 'other cpu'.   Absolutely everything I try to do in there is 
slow (mostly look ups on the data dictionary to try and figure out what's 
wrong).  Without the ability to trace user sessions, I thought I would trace my 
own.  Executing statspack.snap takes as much as 25 seconds!!!  So I enable 
10046 trace and formatted the trace file through tkprof.  The database is 9203. 
 The biggest waits in the trace file are due to the execute phase.  Here is an 
example:

 

INSERT into stats$sql_summary

            ( snap_id

            , dbid

            , instance_number

            , text_subset

            , sharable_mem

            , sorts

            , module

            , loaded_versions

            , fetches

            , executions

            , loads

            , invalidations

            , parse_calls

            , disk_reads

            , buffer_gets

            , rows_processed

            , command_type

            , address

            , hash_value

            , version_count

            , cpu_time

            , elapsed_time

            , outline_sid

            , outline_category

            , child_latch

            )

       select :b9

            , :b8

            , :b7

            , substrb(sql_text,1,31)

            , sharable_mem

            , sorts

            , module

            , loaded_versions

            , fetches

            , executions

            , loads

            , invalidations

            , parse_calls

            , disk_reads

            , buffer_gets

            , rows_processed

            , command_type

            , address

            , hash_value

            , version_count

            , cpu_time

            , elapsed_time

            , outline_sid

            , outline_category

            , child_latch

         from stats$v$sqlxs

        where is_obsolete = 'N'

          and (   buffer_gets   > :b6

               or disk_reads    > :b5

               or parse_calls   > :b4

               or executions    > :b3

               or sharable_mem  > :b2

               or version_count > :b1

              )

 

call     count       cpu    elapsed       disk      query    current        rows

------- ------  -------- ---------- ---------- ---------- ----------  ----------

Parse       1      0.07       0.07          0          0          0           0

Execute    1      5.27       5.27          0        398       2738        6041

Fetch        0      0.00       0.00          0          0          0           0

------- ------  -------- ---------- ---------- ---------- ----------  ----------

total        2      5.34       5.35          0        398       2738        6041

 

Misses in library cache during parse: 1

Optimizer goal: CHOOSE

Parsing user id: 266     (recursive depth: 1)

 

Rows     Row Source Operation

-------  ---------------------------------------------------

   6041  VIEW  (cr=0 r=0 w=0 time=4815086 us)

   6041   FILTER  (cr=0 r=0 w=0 time=4805641 us)

  30284    SORT GROUP BY (cr=0 r=0 w=0 time=4760583 us)

  33809     FIXED TABLE FULL X$KGLCURSOR (cr=0 r=0 w=0 time=3106069 us)

 

 

Elapsed times include waiting on following events:

  Event waited on                             Times   Max. Wait  Total Waited

  ----------------------------------------   Waited  ----------  ------------

  control file sequential read              24        0.00          0.00

  db file sequential read                      4        0.00          0.00

  async disk IO                                  2        0.00          0.01

  db file single write                            2        0.00          0.00

  control file parallel write                    4        0.00          0.00

  rdbms ipc reply                                2        0.00          0.00

 

 

I'm not so good at looking at raw trace files so I'm not sure what part of that 
is relevant and what's not.  

 

I guess what I'm asking is that now that I know that much of our database time 
is spent on cpu and it appears (at least in my case) that much of that is the 
execute phase where should I go from hear?  ...I would imagine there's a 
plethora of info in the raw trace file that I'm not privi to due to ignorance.  
...in my defense, I did try getting the miracle 10046 doohickey to work but 
have given up because I'm on hp not red hat and I can't figure out what the 
equivalent of  'ls -l --time=atime *' is in ksh.  ...and then it's not 
guaranteed to work because of possible differences in trace file format between 
red hat and hp.  

 

If anyone would like to see the full trace file, I have attached that.

 

Thanks for any help.

 

Anyone know of a 10046 parser that works in an hp environment?

 

Chris
RE: Database sloowww...trying to figure out why...

Other related posts: