RE: Oracle & pagefaults (solaris)

From: "Johnson, William L (TEIS)" <WLJohnson@xxxxxx>
To: "przemolicc@xxxxxxxxx" <przemolicc@xxxxxxxxx>, "oracle-l@xxxxxxxxxxxxx" <oracle-l@xxxxxxxxxxxxx>
Date: Mon, 13 Aug 2012 10:35:17 -0400
You may want to have a peak at the /tmp directory and make sure someone is not 
dumping large files into the /tmp directory and impacting memory.
Applications (and sometimes Oracle) will place files under /tmp that can have a 
very negative impact on overall memory performance on a Solaris server.

Just a thought...
Bill

-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On 
Behalf Of przemolicc@xxxxxxxxx
Sent: Monday, August 13, 2012 10:27 AM
To: oracle-l@xxxxxxxxxxxxx
Subject: Oracle & pagefaults (solaris)


Hi all,

we have been facing performance problems on our Solaris 10 Sparc based server 
with several Oracle databases running on it.
It started 2 months ago and gradually is increasing. There is always high level 
of syscalls when we face these problems:

09:20:03      25       8       0      68
09:30:02      22       7       0      70
09:40:03      23       8       0      70
09:50:03      23       7       0      69
10:00:03      22       9       0      69
10:10:03      25      34       0      41
10:20:03      28      28       0      44
10:30:03      27      24       0      49
10:40:02      24       8       0      68
10:50:03      27      19       0      54
11:00:02      31      27       0      41
11:10:21      29      48       0      23
11:20:03      36      56       0       9
11:30:05      32      51       0      18
11:40:03      37      43       0      20
11:50:06      28      59       0      14
12:00:06      32      61       0       7
12:10:03      38      26       0      36
12:20:02      34       9       0      57
12:30:02      35       8       0      58
12:40:03      35       9       0      56

Normal level of syscalls is about 7-10%. But when something is happening it 
increases to over 50%.
Using DTrace I can find that kernel is working in 'pagefault' function:

......
              unix`page_freelist_coalesce+0x894 [***]
              0x1fe0000
              unix`page_get_mnode_freelist+0x398
              unix`page_get_freelist+0x428
              unix`page_alloc_pages+0x110
              genunix`anon_map_getpages+0x348
              genunix`segvn_fault_anonpages+0x32c
              genunix`segvn_fault+0x530
              genunix`as_fault+0x4c8
              unix`pagefault+0x68
              unix`trap+0xd50
              unix`utl0+0x4c
            29062

              unix`cpu_halt+0x10c
              unix`cpu_halt+0x104
              unix`idle+0x128
              unix`thread_start+0x4
           719133

(this is just part of the image - there are much more pagefaults then listed 
above).
So natural is to check on behalf of which application kernel is working in 
pagefaults:

[ High syscalls ]

CPU     ID                    FUNCTION:NAME
 96  85656                       :tick-5sec
  nscd                                                              1
  top                                                               1
  dtrace                                                            5
  perl                                                            122
  emagent                                                         124
  tnslsnr                                                        1569
  oracle                                                         5538

[ Low (normal) syscalls ]

CPU     ID                    FUNCTION:NAME
 20  85656                       :tick-5sec
  nscd                                                              1
  dtrace                                                            5
  emagent                                                          41
  sh                                                               91
  perl                                                             93
  emdctl                                                          243
  tnslsnr                                                         320
  oracle                                                         2466


When I watch which Oracle's PID are generating these high syscalls using 'ps'
most of them don't exists (!). Are they short lived processes ?
Our DBA claims that this is not a problem of Oracle but DTrace shows that
system is working on behalf of Oracle. Can you shed some light where our DBAs 
should
look for any solution ? When I analyzed number of connections in listener log 
they don't
differ much between low and high system calls.

Can you help me please ? I don't feel comfortable because my feeling is that I 
am doing
DBAs homework but they claim that AWR report is the same as before so there is 
nothing
they could do.

Best regards
Przemek
--
//www.freelists.org/webpage/oracle-l




--
//www.freelists.org/webpage/oracle-l
References:
- Oracle & pagefaults (solaris)
  - From: przemolicc
RE: Oracle & pagefaults (solaris)

Other related posts: