RE: High wio on New Hitachi SAN Storage of DB Server

  • From: "Adrian" <ade.turner@xxxxxxxxx>
  • To: "'Mark Brinsmead'" <mark.brinsmead@xxxxxxx>
  • Date: Tue, 15 Nov 2005 19:55:23 -0000

Mark et al. (you know who you are),

Thanks very much for the feedback. To summarise...

Recommended notes were
metalink notes: 257338.1, 272520.1, 262851.1

We checked the patch level and remounted the jfs2 filesystems with the cio
option (datafile mount points only). Then I set the database parameter
filesystemio_options = SETALL and ran my tests.

It worked out pretty much as documented in the excellent IBM paper - JFS2 +
CIO is much better in terms of throughput and utilization for the database.
On our 2-CPU, 4GB-memory test LPAR, the same write-intensive testing had the
following effect on CPU (approx figures):

async         |  with cio
Kernel=75%  |  Kernel=2%
WIO=22%     |  WIO=46%
User=3%     |  User=3% 
Idle=0%     |  Idle=49%
RunQ=8      |  RunQ=1
SwpOcc=20   |  SwpOcc=0

No sign of the high kernel time or system freezes, and the tests completed
in better time.
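
For anyone who wants to compare the two runs side by side, the approximate
figures above can be tabulated as in the sketch below. The dictionary and
function names are my own, and the numbers are only the rough ones from the
table, so treat this as an illustration rather than a measurement tool.

```python
# Approximate figures from the two test runs, copied from the table above.
# Kernel/WIO/User/Idle are CPU percentages; RunQ and SwpOcc are as reported.
async_run = {"Kernel": 75, "WIO": 22, "User": 3, "Idle": 0, "RunQ": 8, "SwpOcc": 20}
cio_run = {"Kernel": 2, "WIO": 46, "User": 3, "Idle": 49, "RunQ": 1, "SwpOcc": 0}


def deltas(before, after):
    """Return after - before for every metric present in both runs."""
    return {name: after[name] - before[name] for name in before if name in after}


changes = deltas(async_run, cio_run)
for name, change in changes.items():
    print(f"{name:<7} {async_run[name]:>3} -> {cio_run[name]:>3}  ({change:+d})")

# CPU actually doing work (kernel + user) in each run.
busy_async = async_run["Kernel"] + async_run["User"]
busy_cio = cio_run["Kernel"] + cio_run["User"]
print(f"busy CPU: async={busy_async}%  cio={busy_cio}%")
```

As far as I understand AIX accounting, wio is idle time during which I/O is
outstanding, so the rise in WIO alongside the drop in kernel time is not in
itself a regression.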

Hope this helps someone else; thanks again for the input


-----Original Message-----
From: Mark Brinsmead [mailto:mark.brinsmead@xxxxxxx] 
Sent: 12 November 2005 00:55
To: ade.turner@xxxxxxxxx
Cc: VIVEK_SHARMA@xxxxxxxxxxx; 'Oracle-L'
Subject: Re: High wio on New Hitachi SAN Storage of DB Server

Vivek and Adrian,

   I'm afraid I can't offer any "answers", but I do have a "dumb" question
that might be helpful.  (Or not.)

   Are you using Async-I/O, or Concurrent I/O?  How many AIO servers have
you configured?

   If you are truly using Async-I/O against "cooked" JFS2 storage, there are
a couple fairly obvious options available to you.

   One would be to try Async I/O with raw partitions instead.  As I recall,
AIX 5.2 (and later) uses a kernel-based AIO driver, as opposed to AIO
servers.  If you see a pronounced performance difference by moving to
raw partitions (which aren't *too* awful with the AIX volume manager)
this would probably suggest that you have misconfigured the AIO servers,
either by not permitting enough of them or perhaps by not allowing them
to queue enough I/Os per (logical) disk...

   Another option, at least for Adrian, is to mount your filesystems with the
"cio" option and enable Concurrent I/O in the database.  You will then be
able to bypass the AIO servers completely.

   I'm afraid I'm out of the office right now, so I can't easily offer more
details.  There are several whitepapers on CIO for Oracle (or databases
in general) on IBM's website.  You'll also find a considerable amount of
discussion (now, maybe a year or two old) about AIX and CIO on

   At a (very) wild guess, it would sound like maybe you (or your sysadmins)
have substantially under-configured one or both of AIO "minservers" and
"maxservers" (I think those are the terms used by SMIT), and/or maybe
need to "lie" to AIX about the number of I/Os that can be queued for
each of your logical disks.  (I've never bothered to tweak the latter;
I've never had to.)

   For the record, my configuration is P610s going against an IBM ESS
disk array (yeah, I know, small potatoes) and at least indirectly P570s
going against IBM ESS disk arrays, and EMC CX-700 disk arrays
(medium-sized potatoes).  I've never seen indications of your kind
of problems with either.

   One other silly question:  Is there any chance that somebody has "goofed"
with the RAID configuration?  70% WIO is *extremely* high -- high enough
to maybe suggest the possibility that somebody has configured hardware
or software RAID-1 between two "logical" disks that are actually located
on the same "physical" disk.  I *know* this one is a long shot, but I *have*
crossed paths with people who have actually done this with software RAID
at least.  Does 'sar' suggest consistently poor performance for all devices?
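
   The thread only contains system-wide sar CPU samples (quoted from Vivek's
original post further down), not per-device output such as 'sar -d' would
give, so the per-device question cannot be answered from here.  As a minimal
sketch, though, the quoted samples can be averaged like this; the function
and variable names are illustrative only.

```python
# sar samples quoted from the original post: time, %usr, %sys, %wio, %idle.
SAR_TEXT = """\
13:43:00       6       5      67      22
13:48:00      10       6      74      10
13:53:00      10       5      66      19
13:58:00       7       5      61      27
14:03:00       5       5      67      22
14:08:00       7       5      74      15
14:13:00       9       6      69      15
"""


def parse_sar(text):
    """Turn each sample line into (time, usr, sys, wio, idle)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and any header line whose columns are not numeric.
        if len(fields) != 5 or not fields[1].isdigit():
            continue
        rows.append((fields[0], *map(int, fields[1:])))
    return rows


rows = parse_sar(SAR_TEXT)
avg_wio = sum(r[3] for r in rows) / len(rows)
avg_sys = sum(r[2] for r in rows) / len(rows)
print(f"samples={len(rows)}  avg %wio={avg_wio:.1f}  avg %sys={avg_sys:.1f}")
```

   The same parsing approach would presumably work for per-device output if
someone captures it, with the column layout adjusted to match.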

Adrian wrote:

>Hi Vivek,
>Your note interested me; we have exactly the same configuration and
>but we are still running async_io = TRUE. I concur with your findings on
>extreme performance issues - short term server freezes are occurring. You
>mention a bug; is this hearsay or do you have details?
>If you've turned off async_io (the default on aix), then you may want to use
>DBWR_IO_SLAVES. I've not tried this yet.
>Anyway, ours is a HDS9990 with McData Switches attached to a p690 running
>AIX 5.2 ML4. The filesystems are jfs2.
>I can reproduce the problem just by creating a 10GB datafile, multiple
>doing random i/o can also get the same issue, or indeed when a parallel
>backup is in progress. However concurrent cp's of multiple files do not
>reproduce the issue (hence I believe it is likely an async i/o problem).
>Under load Topas shows high wait for i/o and Sar -q shows that swpocc% =
>and swpocc > 20
>My unix admins are currently looking at the async i/o settings as per
>metalink note 271444.1, but are heavily loaded and this is not prod (yet)
>the urgency is low.
>If you or anyone else has any pointers with this configuration please let
>Kind Regards
>-----Original Message-----
>From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx]
>Sent: 11 November 2005 18:56
>To: Oracle-L
>Subject: High wio on New Hitachi SAN Storage of DB Server
>While doing Application Testing of Hybrid Trans (OLTP mostly though) by
>200 users (approx) on a NEWLY configured HITACHI SAN Storage, on DB
>Server (of AIX) High wait for IO i.e. wio = 70 % till 1400 Hours is
>NOTE - wio reduced to about 10 % gradually from 1400 Hours to 2000
>Average sar Output from morning to 1400 Hours:-
>13:38:00    %usr    %sys    %wio   %idle
>13:43:00       6       5      67      22
>13:48:00      10       6      74      10
>13:53:00      10       5      66      19
>13:58:00       7       5      61      27
>14:03:00       5       5      67      22
>14:08:00       7       5      74      15
>14:13:00       9       6      69      15
>DB Server = 7 CPUs
>OS AIX 5.2
>Oracle 8.1.7
>Hitachi Storage for DB Server = 40 underlying Disks
>Hardware Raid
>Init.ora disk_asynch_io = FALSE (When disk_asynch_io is set TRUE there
>is extreme performance degradation with AIX 5.2 [seems a bug])
>Comments by IBM
>The average number of process waiting for IO to complete is 12. This
>indicates that these processes are waiting for the IO to complete. This
>is the reason why we are seeing an average iowait of 70%.
>The seek rate is 95.72% on the hdsdb9960lv  LV's indicates a high degree
>of random IO, usually caused by the application or a high degree of disk
>STATSPACK report (will provide any other sections as needed)
>DB Name         DB Id    Instance     Inst Num Release     OPS Host
>------------ ----------- ------------ -------- ----------- ---
>ABNPROD         34298189 abnprod             1   NO  findb
>                Snap Id     Snap Time      Sessions
>                ------- ------------------ --------
> Begin Snap:          5 15-Oct-05 13:00:55      352
>   End Snap:          6 15-Oct-05 14:00:37      352
>    Elapsed:                  59.70 (mins)
>Cache Sizes
>           db_block_buffers:     215000          log_buffer:   18874368
>              db_block_size:       8192    shared_pool_size:  754288000
>Load Profile
>~~~~~~~~~~~~                            Per Second       Per Transaction
>                                   ---------------       ---------------
>                  Redo size:            298,013.21                837.31
>              Logical reads:             55,540.47                156.05
>              Block changes:              2,296.74                  6.45
>             Physical reads:              3,109.99                  8.74
>            Physical writes:                399.33                  1.12
>                 User calls:              2,657.16                  7.47
>                     Parses:                 64.98                  0.18
>                Hard parses:                  5.44                  0.02
>                      Sorts:                 75.56                  0.21
>                     Logons:                  0.75                  0.00
>                   Executes:              1,783.45                  5.01
>               Transactions:                355.92
>  % Blocks changed per Read:    4.14    Recursive Call %:   15.67
> Rollback per transaction %:   94.71       Rows per Sort:   10.87
>Top 5 Wait Events
>~~~~~~~~~~~~~~~~~                                             Wait     %
>Event                                               Waits  Time (cs)  Wt Time
>-------------------------------------------- ------------ ------------
>db file sequential read                         6,230,712    1,640,351
>log file sync                                   1,087,475    1,286,467
>db file scattered read                            351,411      416,508
>log file parallel write                           706,201      288,168
>buffer busy waits                                 334,943       69,830
>          -------------------------------------------------------------
>Qs How might this issue be approached?
>Qs Are there any special O.S. parameters that might be set?
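
One way to read the Top 5 Wait Events table is to divide each event's wait
time by its wait count to get an average latency per wait. The sketch below
does only that arithmetic on the figures quoted above; the names are
illustrative, and the only assumption is the usual conversion of
1 centisecond = 10 ms.

```python
# Top 5 wait events from the STATSPACK report:
# (event name, number of waits, total wait time in centiseconds).
EVENTS = [
    ("db file sequential read", 6_230_712, 1_640_351),
    ("log file sync", 1_087_475, 1_286_467),
    ("db file scattered read", 351_411, 416_508),
    ("log file parallel write", 706_201, 288_168),
    ("buffer busy waits", 334_943, 69_830),
]


def avg_wait_ms(waits, time_cs):
    """Average time per wait in milliseconds (1 centisecond = 10 ms)."""
    return time_cs * 10 / waits


averages = {event: avg_wait_ms(waits, time_cs) for event, waits, time_cs in EVENTS}
for event, ms in averages.items():
    print(f"{event:<26} {ms:6.2f} ms per wait")
```

On these numbers, single-block reads average under 3 ms each, while
log file sync and multi-block reads average close to 12 ms.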

