Re: DB RAID Setup

From: Tim Gorman <tim@xxxxxxxxx>
To: martinfbrown@xxxxxxxxxxx
Date: Tue, 09 Dec 2008 15:49:50 -0700

Martin,

The event "log file sync" is usually caused by two things:

LGWR is operating synchonously upon COMMIT (i.e. LGWR too slow, probably waiting excessively on event "log file parallel write")
LGWR is being continuously interrupted by too-frequent COMMITs

The first is the situation where you want faster disk or even SSD. The second is an application issue that you'll have to take up with the folks who designed, built, or use the application supported by the database.

With regards to the potential cause being slow I/O under the online redo log files...

Your "architect" has multiplexed the online redo log files to RAID1 volumes as well as RAID5 volumes. Clearly, the lowest common denominator is at work here -- that is, the performance of writes to the online redo logfiles is determined by the slowest write. So, in multiplexing between a faster RAID1 volume and a slower RAID5 volume, the overall write performance will be constrained by the slower RAID5 volume.

This is somewhat nonsensical -- why bother basing any online redo log files on RAID1 at all? The reason for RAID1 mirroring is redundancy without loss of read/write performance, and the reason for RAID5 is redundancy to maximize usable space but at a cost to write performance. Space is being wasted on the RAID1 volume because the better performance it provides is not being used. You might as well have two RAID5 volumes for multiplexing those online redo log files; it wouldn't hurt performance any further, and you'd recover some usable disk space, if you needed it.

My advice is: first determine if the event "log file parallel write" comprises a significant amount of total "DB time" (i.e. 10% or more)? Is it in the "Top 5 Timed Events" list in either STATSPACK or AWR? Because if I/O performance isn't causing this problem, then there is no sense discussing the I/O subsystem further. Instead, get to know the application code through examination of SQL traces and V$SESSION_EVENT, then approach the application developers/architects/vendors with specific findings about too-small transactions or excessive commit commands.

If "log file parallel write" is significant however, then I'd suggest to cease putting online redo log files to the RAID5 volume, as a test. You can issue ALTER DATABASE statements to drop those log file members on the RAID5 volume, or you can add additional log file members on the RAID1 volumes and then drop the log file members on RAID5. Either way, these actions can be performed while the database is up, fully available, and nobody will know (or need to know) that the change is occurring. Manipulating online redo log files is definitely an ONLINE activity...

That's just my opinion, your mileage may vary...

Tim Gorman
consultant - Evergreen Database Technologies, Inc.
P.O. Box 630791, Highlands Ranch CO  80163-0791
website   = http://www.EvDBT.com/
email     = Tim@xxxxxxxxx
mobile    = +1-303-885-4526
fax       = +1-303-484-3608
Yahoo IM  = tim_evdbt

Martin Brown wrote:

I have a similar question regarding raid 5 and what I perceive to be a "potential" problem. I am experiencing a high number of logfile sync waits that appear most afternoons as the daily load starts to increase. We're running 10.2.0.3 RAC with a 8 node cluster. We use a 3par storage system for the database. My architect has configured both raid 1 and raid 5 diskgroups and he duplexes the online redo logs and controlfiles to raid5.

Now, he insists that the logfile sync waits are not related in any way to the raid5 configuration and the 3par storage solution is state-of-the-art. But the only remedy I can find is to move to faster SSD. Is it possible that the high afternoon activity coupled with the raid5 configuration is the root of my problem?

I've looked at other stuff too. The log switches are 2 - 3 an hour. The log buffer is never full. Anyone have any other suggestions as to where to look?

Subject: DB RAID Setup
Date: Tue, 18 Nov 2008 17:06:20 +0000
From: Rob.Dempsey@xxxxxxxxxx
To: oracle-l@xxxxxxxxxxxxx

Hi

Oracle 10g2 (data warehouse)

I thought I would ask peoples’ thoughts on the following.

I have setup our database whereby the index tablespace and data tablespace are separate. This is not for performance reason only for ease of maintenance.

We are being advised by the SAN provider to use the following RAID layout

                Archive redo Logs            - RAID 10

Redo Logs                          - RAID 10

Temp tablespace             - RAID 10

Undo tablespace              - RAID 10

Index tablespaces           - RAID 10

System tablespace          - RAID 5

Data tablespaces              - RAID 5

Redo logs / Temp tablespace I agree with.

To use RAID 5 for data, I understand there is a write performance hit but this is a data warehouse so should be ok (Ideally I would like that RAID 10 as well). But to have the index tablespace on RAID 10 and data tablespace on RAID 5 I found that strange. When I asked the reason why I was give the response ‘that is what Oracle recommends’.

Has anyone heard this before?

Rob

You live life online. So we put Windows on the web. Learn more about Windows Live

-- //www.freelists.org/webpage/oracle-l

Follow-Ups:
- RE: DB RAID Setup
  - From: Baumgartel, Paul

References:
- RE: DB RAID Setup
  - From: Martin Brown

Re: DB RAID Setup

Other related posts: