RE: Solaris 10 shmmax

  • From: "Kevin Closson" <kevinc@xxxxxxxxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Tue, 27 Mar 2007 09:35:07 -0700

At the time, we considered it best practice to max out the Solaris
shmmax setting for multiple reasons:

- to avoid reboots when up-sizing SGAs
- to reduce frivolous SGA fragmentation
- to disabuse users of the notion that it was *ever* a really useful knob
to begin with

...I know we are discussing an old document, but you never said it was
wrong and the document clearly states on page 10:

"For optimal Oracle performance, the SGA must be allocated in a single
shared memory segment as ISM. If the SGA cannot be built this way,
Oracle will use multiple segments, and will settle for non-ISM
allocations." The list of three reasons you provide here either omits
the ISM angle, or the document pushed the ISM concern erroneously.
Regardless of how old the document is, it has to be one or the other.
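
For readers who have not seen it, the single-segment ISM request the document talks about can be sketched roughly as below. This is a minimal illustration, not Oracle's code: SHM_SHARE_MMU is the Solaris shmat() flag that asks for ISM, the function name attach_single_segment is made up for this example, and on platforms that do not define the flag the sketch falls back to a plain attach. If the requested size exceeds the shmmax limit, the shmget() call is where the single-segment attempt would be expected to fail.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* SHM_SHARE_MMU is the Solaris shmat() flag that requests ISM.
 * Other platforms do not define it, so fall back to a plain attach. */
#ifdef SHM_SHARE_MMU
#define ATTACH_FLAGS SHM_SHARE_MMU
#else
#define ATTACH_FLAGS 0
#endif

/* Hypothetical helper: try to build ONE segment of `bytes` bytes and
 * attach it. Returns 0 on success, -1 if the request fails (for example
 * because `bytes` is larger than the configured shmmax). */
int attach_single_segment(size_t bytes)
{
    int id = shmget(IPC_PRIVATE, bytes, IPC_CREAT | 0600);
    if (id == -1) {
        perror("shmget");
        return -1;
    }

    void *base = shmat(id, NULL, ATTACH_FLAGS);
    if (base == (void *)-1) {
        perror("shmat");
        shmctl(id, IPC_RMID, NULL);   /* do not leak the segment */
        return -1;
    }

    memset(base, 0, bytes);           /* touch the memory to prove the mapping works */

    shmdt(base);
    shmctl(id, IPC_RMID, NULL);       /* clean up: this is only a demonstration */
    return 0;
}
```

A caller that gets -1 back would be in the position the document describes: either raise the limit or fall back to several smaller segments.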

The notion that having a monolithic SGA made a big performance impact 
was never actually characterized, AFAIK. I suspect it was only ever 
measurably important when something else was wrong - like a 
pathologically high rate of client process creation and exiting.

...unless the document is wrong, it must have been characterized as
making a performance impact at some point in time.

In short, Oracle 10G uses Solaris MPO APIs to 
split its SGA purposefully across lgroups on NUMA systems. It also 
dedicates one DBWR per lgroup, with each exploiting local memory access 
for efficiency. 

...Please explain. DBWR does not actually peek into SGA buffers (unless
block checking is turned on) so how can pairing DBWR with buffers on the
same memory hierarchy exploit anything? And as far as I/O adaptors go
there should be fair access to all from any CPU in these architectures,
no? Making DBWR NUMA aware is not a very large step. Making foreground
processes NUMA aware is where the big gains come in. After all,
foregrounds actually use buffer contents so having them prefer local
memory for their buffers speeds up post-I/O accesses, block cloning
copies, sort spill and so on...

...On the other hand, the LRU structures need to be co-located in the
same memory hierarchy as the buffers for there to be significant
efficiency gained. The LRU structures are in the variable SGA component
so that would have to be chopped up and allocated one per memory
hierarchy as well for NUMA placement. That's been done before. The
thing that would make this all nice and clean would be a shmget/shmat
variant that places regions of an IPC shared memory segment onto memory
hierarchies as per a description mask. That way it *looks* like a single
segment to the admin but *feels* like a NUMA optimized segment to
Oracle...I described this back in 1998 in brief here:

but for others to use the concept:

...Don't get me wrong, I'm glad to see 10g on Solaris at least taking
steps to exploit NUMA, but there is nothing new about the concept since
it was done in Oracle8 on Sequent and (IIRC) 8i on SGI, DEC and DG to
varying degrees.
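
To make the "description mask" idea mentioned earlier a little more concrete, here is a purely hypothetical sketch of the bookkeeping such a shmget/shmat variant would need. Nothing here is a real Solaris or Oracle interface: the struct names, MAX_REGIONS, build_placement and node_for_offset are all invented for illustration. It only computes which page-aligned region of one segment would be backed by which memory hierarchy; the actual placement would have to be done by the operating system.

```c
#include <stddef.h>

#define MAX_REGIONS 8   /* arbitrary cap for this sketch */

/* Hypothetical descriptor: one entry per region of the segment. */
struct shm_region {
    size_t offset;   /* byte offset of the region within the segment */
    size_t length;   /* region length in bytes (whole pages) */
    int    node;     /* memory hierarchy (lgroup/node) backing the region */
};

struct shm_placement {
    int nregions;
    struct shm_region region[MAX_REGIONS];
};

/* Split a segment of `bytes` bytes into `nodes` page-aligned regions,
 * one per node. Returns 0 on success, -1 on bad arguments. */
int build_placement(struct shm_placement *p, size_t bytes, size_t page, int nodes)
{
    if (p == NULL || page == 0 || nodes < 1 || nodes > MAX_REGIONS)
        return -1;

    size_t pages = (bytes + page - 1) / page;   /* round up to whole pages */
    if (pages < (size_t)nodes)
        return -1;                              /* not enough pages to go around */

    size_t per_node = pages / (size_t)nodes;
    size_t extra = pages % (size_t)nodes;       /* first `extra` nodes get one more page */
    size_t off = 0;

    p->nregions = nodes;
    for (int n = 0; n < nodes; n++) {
        size_t len = (per_node + ((size_t)n < extra ? 1 : 0)) * page;
        p->region[n].offset = off;
        p->region[n].length = len;
        p->region[n].node = n;
        off += len;
    }
    return 0;
}

/* Which node backs the byte at `offset`? Returns -1 if out of range. */
int node_for_offset(const struct shm_placement *p, size_t offset)
{
    for (int i = 0; i < p->nregions; i++) {
        const struct shm_region *r = &p->region[i];
        if (offset >= r->offset && offset - r->offset < r->length)
            return r->node;
    }
    return -1;
}
```

The admin would still see one segment of the full size, while a lookup like node_for_offset is the sort of thing the database could use to keep buffers and their LRU structures on the same memory hierarchy.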
