Is my Oracle Server issuing more IO than it can handle

  • From: Oracle Dba Wannabe <oracledbawannabe@xxxxxxxxx>
  • To: oracle-l@xxxxxxxxxxxxx
  • Date: Tue, 7 Dec 2010 08:35:46 -0800 (PST)

Hi All, this is a 10.2.0.4 single instance database (non asm). I see the 
following events from awr (1 hour snapshot - however hourly snapshots after 
this 
show the same trend with respect to wait events):


Event
Waits
Time(s)
Avg Wait(ms)
% Total Call Time
Wait Class
free buffer waits 17,926,869 193,146 11 67.7 Configuration 
log file switch (private strand flush incomplete) 41,550 30,538 735 10.7 
Configuration 

log file sync 211,675 25,156 119 8.8 Commit 
buffer busy waits 42,093 23,218 552 8.1 Concurrency 
db file parallel write 376 14,274 37,963 5.0 System I/O 
I know that db file parallel write only contributes to 5% of the total call 
time 
- but its avg wait time looks extremely poor - that and the fact that free 
buffer waits appear at top indicate that there's a db writer issue 
(db_writer_processes=4) - which leads me to believe perhaps its the IO 
subsystem. Now the storage team report there is nothing up with the storage. I 
was hoping someone could help with the following questions:
1. Is there someway from awr that I can determine that the Oracle server is 
issuing more IO than the storage system can handle for example:

Physical reads: 954.74 16.68 
Physical writes: 418.89 7.32 Phy Reads + Phy Writes = 1372 IOPS
Can I then say that if each disk can do 100 IOPS, that the storage system 
should 
at least have 13 Disks? (13x100 IOPS)? Or is that an over simplification?

2. Interestingly this DB server was moved onto a new box with a different 
storage and the issue is no longer observed there.
Transactions Per Second on old box = 57.22
Transactions Per Second on new box = 225
Phy Reads + Phy Writes for New Box, are slightly under half compared to the old 
box:

Physical reads: 243.02 1.08 
Physical writes: 564.62 2.51 
That said, the redo size per second on the new box is twice that of the old box 
(7mb/s : 3mb/s). The buffer cache and db writer processes are the same on both 
boxes.
Other than the storage aspect of things, I'm thinking (and will check) whether 
theres an o.s/kernel misconfiguration w.r.t to async io, etc that might be 
causing this on the old box.
Appreciate any thoughts on 1 or/and 2
Thanks



      

Other related posts: