RE: DBWR - How Many is Too Many?

High IOWait may be caused if kernel parameter are not set optimally.
 
Just quoting from an old reference
"The one I find helpful is the "ps -elf" which shows the status of the
processes.  There were a number of suggestions to perform a truss to see
what's going on with that process - as seen in the "ps -elf" command.  A
number of people suggested using the lsof command.  In my particular
situation this was not as helpful since there were a number of database
running in the busy partition - lsof showed me all of the big datafiles
opened by the five database instances.

I noted that the developers were creating databases on their own.  The
/etc/system parameters were for two databases so I increased the number of
semaphores based on the Oracle guidelines.  This appears to have solved the
problem and alleviated what appears to be locking."

 
------------------------------------------------------------
PB Singh
DW Architect and Sr Data Modeler
VMware
 
 

________________________________

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On
Behalf Of David Barbour
Sent: Wednesday, February 27, 2008 2:27 PM
To: Greg Rahn
Cc: oracle-l@xxxxxxxxxxxxx
Subject: Re: DBWR - How Many is Too Many?


Good point(s).  And 'free buffer waits' is NOT in my top 5.


On 2/27/08, Greg Rahn <greg@xxxxxxxxxxxxxxxxxx> wrote: 

        My first question to you (and others who increase the number of dbwr
processes): Are you seeing 'free buffer waits' in your 'Top 5' in the
Statspack report?  If not, it is unlikely that adding more dbwr process will
yield any benefit.
        
        If you are experiencing high wait IO (>40% is high), then what
benefit is there to potentially do more IO, by adding dbwrs?  I would think
that would make things worse.  You need to get to the root cause of why the
WIO is high.  My suggestion is start by investigating the iostat data (what
are the IO response times, lun queue depth, etc).  It sounds quite likely
that if you even use a simple file create test, the IO would be poor.  I
would use /usr/sbin/lmktemp and do some lower level testing.
        
        The other observation I would make is that if you changed the SAN,
and not the database, and it worked ok before, then in all likelihood it is
not a database problem.  No?
        
        
        
        On 2/27/08, David Barbour <david.barbour1@xxxxxxxxx> wrote: 

                sar is scary (just a small portion)
                
                00:00:00    %usr    %sys    %wio   %idle   physc
                02:15:01      19      19      42      19    4.00
                02:20:00      21      25      40      14    4.00
                02:25:00      19      18      43      20    4.00
                02:30:00      18      18      43      21    4.00
                02:35:00      20      24      40      16    4.00
                
                We're running JFS2 filesystems with CIO enabled, 128k element
size on the SAN and AIO Servers are set at minservers = 220 and maxservers =
440
                We've got 32GB of RAM on the server and 4 CPUs (which are
dual core for all intents and purposes - they show up as eight). 


                I'm thinking the data block waits is the result of  too many
modified blocks in the buffer cache.  Solution would be to increase the
number of db_writer_processes, but we've already got 4.  Metalink, manuals,
training guides, Google, etc.  seem to suggest two answers.  
                
                1.  One db writer for each database disk - in our case that
would be 8
                2.  CPUs/8 adjusted for multiples of CPU groups - in our case
that would be 4
                
                Any thoughts?
                
                

        
        
        
        -- 
        Regards,
        
        Greg Rahn
        http://structureddata.org 


Other related posts: