RE: oracle database memory access becomes slow after restart

  • From: "Mark W. Farnham" <mwf@xxxxxxxx>
  • To: <gajav@xxxxxxxxx>, <oracle-l@xxxxxxxxxxxxx>
  • Date: Sat, 18 Aug 2012 08:02:59 -0400

What Gaja said, AND,

You've neatly boxed the problem to "what is different between restarting the
server and the database instance and just restarting the database instance."
Without examining the system I cannot be comprehensive, but a couple quick
questions come to mind:

Q1: Have you checked whether there are any oracle processes still attached
to the previous shared memory that are still being cleaned up?

Run a ps and specifically check. Now, having a bunch of zombies around is
symptomatic of something else gone wrong while the instance was active, and
I won't speculate on that. But if ps shows oracle processes for the instance
that is already shut down, I suggest you zap them before you restart, since
they will interfere with memory re-allocation and that nice contiguous space
that would nicely service the restart *might* have to be cobbled together
from hither and yon where a lot of other processes are running.

Further, what else is running on this box? Are there things that are *not*
being stopped that have chunks (and they can be little, spread out chunks
from one or more processes and still make allocation of big chunks
difficult) of memory allocated.

Q2: What is the timing on a second x$kslei query on the restart? If a repeat
is fast (and presuming no result caching could apply to this query) then we
are dealing with a start-up artifact, while if a repeat is also slow we are
dealing with something else. Turning on 10046 as Gaja suggested should be
very informative, especially differences between the two runs if they are
very different.


-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
On Behalf Of Gaja Krishna Vaidyanatha
Sent: Friday, August 17, 2012 8:53 PM
To: oracle-l@xxxxxxxxxxxxx
Subject: Re: oracle database memory access becomes slow after restart

Hi Eagle,
Not sure whether you got any responses on this. Some questions for you:

1) Have you traced the Oracle session that queries x$kslei using 10046 and
found out what the session is waiting for?
2) Do you see the same problem for all queries? How about something simple
like - "select sysdate from dual;"?
3) Have you DTraced the server process for the Oracle session and found
anything interesting in the output?

Let us know.


Gaja Krishna Vaidyanatha,
CEO & Founder, DBPerfMan LLC

Phone - +1-650-743-6060
LinkedIn -

Co-author: Oracle Insights:Tales of the Oak Table -
Primary Author: Oracle Performance Tuning 101 - Enabling Cloud Deployment & Management for
Oracle Databases

 From: Eagle Fan <eagle.f@xxxxxxxxx>
To: oracle-l@xxxxxxxxxxxxx
Sent: Monday, August 13, 2012 8:16 PM
Subject: oracle database memory access becomes slow after restart
We are seeing this problem several times on Sun T3-1 servers. We also see
the problem 1~2 times on Sun T4-1 Server

The server has 128G memory, database version is, or SGA is set as about 105G.

After database restart, the access to memory becomes slow. It doesn't always
happen, usually it happens on the servers which are up for a long time.

For example: version:

Before restart:

select count(*) from x$kslei;


*Elapsed: 00:00:06.57*

After restart:
select count(*) from x$kslei;


*Elapsed: 00:00:43.33*

And we also see mutex, latch problem on the databases. I think that's the
result of the slow memory access.

If we reboot the server and then restart the database, it's back to normal.

A possible reason is memory fragmentation. Here is the explanation from
oracle support:

*S**ince T3 has page size of 4 MB and Solaris kernel has single thread free
memory coalescing thread, it takes 15-20 minutes to coalesce the free memory
to create large contiguous free memory chunk after we shutdown the Oracle
database. Since we immediately start bringing up the database before the
free memory is coalesced, the next shared memory segment allocation is
fragmented and thus causes more memory latency compared to 1 single large
memory chunk. *

The granule size is set as 128M in our database. The current solution is we
do database failover instead of database restart. But it's more complicated
and we need to make sure the inactive node is just rebooted before failover.
Includes the inactivate nodes restart time, it takes much more time than
database restart.

Do you have the same problem on T3 server? How do you deal with it?

Is there any way to check the OS memory fragmentation status?


Eagle Fan




Other related posts: