RE: Oracle RAC backup hardware/software recommendation?

  • From: "Marquez, Chris" <cmarquez@xxxxxxxxxxxxxxxx>
  • To: "David" <thump@xxxxxxxxxxxxxxxx>, <mueller_m@xxxxxxxxxxxxx>, <oracle-l@xxxxxxxxxxxxx>
  • Date: Tue, 4 Oct 2005 14:34:13 -0400

>> -----Original Message-----
>> From: David [mailto:thump@xxxxxxxxxxxxxxxx] 
>> Sent: Tuesday, October 04, 2005 12:21 AM
>> Subject: RE: Oracle RAC backup hardware/software recommendation?
>> 
>> Just as a counter opinion we have many db's with archive on 
>> ocfs and others with all files on ocfs with no issue whatsoever 

Fair enough, but just to pile on to what I have been saying (complaining
about), I have included two of my previous posts regarding OCFS.
I include them not to make the point that one cannot successfully use
OCFS, I do every day...but to pass along the large amount of time I
have personally spent on OCFS and OCFS-related issues.
My time is valuable (to me), and uncovering things I have not been told
or could not have read is no fun...a waste of my time.

Also, I will again say "you get what you pay for"...and we paid nothing
for OCFS, so I'm not expecting any sympathy.

Finally, it seems in these old threads I again say that Oracle (at
one time) did not recommend using OCFS for arch logs!?
I can *NOT* validate these comments, but seeing that I wrote them in the
past leads me to believe I have seen it, in black and white, somewhere.
I will keep looking...

hth

Chris Marquez
Oracle DBA


--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of Marquez, Chris
Sent: Monday, August 08, 2005 11:01 AM
To: Oracle Discussion List
Subject: RE: OCFS2


>>some of you complain about OCFS
>>and make all kinds of dispariging comments
>>please get technical because I fail to find
>>any value in a posting that just bitches

OCFS is slow(er), that is a fact...and slower than RAW, yet it is
promoted as being equal to RAW!?

When going to a new server with more disks (7 spindles vs. 4), a more
powerful CPU, and more RAM, we find:
 - Very many of the server waits are for disk IO...even when db activity
is moderate.
 - Most, or often all, of the Oracle waits are for disk IO...when
activity is higher.

 - RMAN backup is 3x longer or more (than on a server with only 3
datafile EXT3 disks).
 - RMAN restore is 4x longer (than on a server with only 3 datafile EXT3
disks).

 - Cannot archive to OCFS (and Oracle does not recommend it)...massive
Oracle waits when we did.
(This has a real impact because it means that RMAN scripts can *NOT* see
all arch logs in one local/shared location and *must* get logs from their
respective local arch dests...a huge risk when a node fails and recovery
is needed; see the sketch below.)
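
For illustration only (the channel name and paths like /arch_a are made
up), this is the kind of per-node RMAN run block that local arch dests
force on you. Each node can back up only the logs it can see locally, so
a full recovery needs the backup pieces from *both* nodes. On node A:

    RMAN> run {
            allocate channel t1 device type sbt;
            backup archivelog like '/arch_a/%' delete input;
          }

Node B runs the same block against its own local path.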

I have seen all of the things I reference above not just once, but many,
many times, over and over again.

I have personally spent months trying to better our system (disk config)
for OCFS use...and we have made improvements, but the reality is that our
db ran much faster "WITH LESS HARDWARE" when we used EXT3.

We had RAID5, then went to RAID1, moved arch logs to a local EXT3 fs, and
re-laid out every datafile on each disk based on optimal application
data access.

The reality is that this db would immediately run faster if we got off
OCFS.

Again, I know the reality is "you get what you pay for"...and for OCFS
we have paid nothing in $$$, but plenty in time!

This is not a bash of OCFS, but the reality...my reality...I see it and
live with it every day.
OCFS works and has strong management benefits over RAW, but don't kid
yourself about its equality to other filesystems.

Chris Marquez
Oracle DBA


 

--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of Marquez, Chris
Sent: Monday, May 09, 2005 12:31 PM
To: zhai_jingmin@xxxxxxxxx; BRUDER ING. Daniela; oracle-l@xxxxxxxxxxxxx
Subject: RE: RMAN backup to OCFS failed

>> Anyone encounter these problems?
...
>> 'drop tablespace xxx including contents and datafiles;' 
>> ORA-01265: Unable to delete LOG /U01/oradata/test01.dbf
>> ORA-27056: could not delete file
>> Linux Error: 16: Device or resource busy

Yes!

My ENV:
Red Hat Enterprise Linux ES release 3 (Taroon)
Oracle 9205
Oracle Cluster Manager 9205
Oracle OCFS - Oracle Cluster FileSystem 1.0.13-PROD1:
  ocfs-2.4.21-EL-smp-1.0.13-1
  ocfs-support-1.0.10-1
  ocfs-tools-1.0.10-1

I don't want to bash OCFS, because it works very well for the price we
paid for it (zero $$).
But it has its administrative issues...I specifically say administrative
because we have not had bug issues so far.

Performance is not great...much slower than an OS-buffered filesystem
like EXT3...and IMHO slower than RAW.
 - Archive logging to OCFS is not recommended, as clearly stated by
Oracle.
 - RMAN backup and restore to OCFS are painfully slow.
 - Cutting a disk (array) into two partitions, one OCFS and another (e.g.
EXT3), is a really bad idea, as both will perform slower.

 - And finally, to your question: "Yes", this seems to be the normal
behavior of OCFS.

Oracle + OCFS is for some reason incapable of removing the files itself,
even though this is a valid option in Oracle9i and works with other
OS/filesystem combinations...for us it just hangs.  We tried to "rm"
manually the files that had clearly been eliminated from the database
(data dictionary), and the instance died/terminated when we tried to
remove an old datafile...some process was still hanging on to it.  (mv,
cp, rm commands are not directly supported against OCFS...see comment
below.)

It seems file size has something to do with it.  Our "workaround" is
(sadly) this.

Say file /o04/oradata/orcl/index_01.dbf was 1GB and part of the index
tablespace.
We drop the tablespace and the datafile from the database, but the file
remains on OCFS.

Next we create a new tablespace reusing the same file, but much smaller!
SQL> create tablespace to_drop datafile
       '/o04/oradata/orcl/index_01.dbf' size 1m reuse,
       '/o04/oradata/orcl/index_02.dbf' size 1m reuse
     extent management local uniform size 10k;

Then we drop the new *small* tablespace and datafiles, and OCFS seems to
be OK doing this.
SQL> drop tablespace to_drop including contents and datafiles;

>> Version 1.1.2 solved some of our ocfs-problems under SLES8.
I would like to know if Version 1.1.2 solved this!?

There are some very good OCFS do's and don'ts articles out on Oracle.com
that you *should* read.
If you simply can't find them, email me directly and I will send them;
OCFS - RAC - RHAS_best_practices.htm

OTN - OCFS - Talking Linux - Update on OCFS by Wim Coekaerts.htm

METALINK - OCFS - Update on OCFS for Linux.htm
METALINK - OCFS - Supported and Recommended File System on Unbreakable Linux.htm
METALINK - OCFS - Comparing Performance Between RAW IO vs OCFS vs EXT2-3.htm
METALINK - OCFS - Oracle Cluster File System-OCFS Red Hat AS - FAQ.htm
METALINK - OCFS - FAQ Oracle Cluster File System OCFS on RedHat Advanced Server.htm
METALINK - Linux OCFS - Best Practices - Red Hat Advanced Server.htm

ocfs_oracleworld.ppt

=====================================================
oss.oracle.com
=====================================================
http://oss.oracle.com/

Welcome to oss.oracle.com
This is the home of Oracle's Linux Projects development group. 
Our focus is to enhance and improve Linux in the interest of making
Oracle products perform better, faster, and more reliably. 



-----------------------------------------------------
OCFS Users Guide
-----------------------------------------------------
http://oss.oracle.com/projects/ocfs/dist/documentation/OCFS_Users_Guide.doc


hth

Chris Marquez
Oracle DBA








-----Original Message-----
From: Marquez, Chris 
Sent: Monday, October 03, 2005 11:13 AM
To: David Sharples
Cc: mueller_m@xxxxxxxxxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: RE: Oracle RAC backup hardware/software recommendation?

Dave,

I cannot find the document, so I have to take that comment
back...although I'm sure I have seen it.
To be honest, I never did this nor tried it.

        ---METALINK
        Doc ID:         Note:252331.1
        Subject:        Update on OCFS for Linux

        ---oss.oracle
        
http://oss.oracle.com/projects/ocfs/dist/documentation/RHAS_best_practices.html
        For optimal performance, each node should have its own, separate
OCFS archive log partition. 
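
A minimal sketch of what that recommendation implies; the instance names
and paths below are made up, and an spfile is assumed so the sid clause
works. Each instance archives to its own OCFS partition:

    SQL> alter system set log_archive_dest_1='LOCATION=/ocfs_arch1'
           scope=both sid='orcl1';
    SQL> alter system set log_archive_dest_1='LOCATION=/ocfs_arch2'
           scope=both sid='orcl2';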

However, below are old docs regarding OCFS (arch log) issues...and note
I had all my problems with (more recent) OCFS version 1.0.12.x or
1.0.13.x.

        ---METALINK
        3146671-OCFS ARCHIVE DESTINATION NOT RESPONDING TO UNIX COMMANDS

        ---METALINK Forum [This one gets ugly]
        Real Application Clusters/High Availablity Technical Forum
        Thread Status: Closed
        Subject: Linux OCFS causes kernel panic on Red Hat AS 

I (gladly) forgot about some of the ugly "all node server hangs" we had
when doing simple OS commands (ls -l) on OCFS under less-than-heavy
server/db load.  It was so bad, and we were so scared, that no one would
remove any arch logs from the filesystem except during maintenance
outages.  Also, we had *current* versions of all the OCFS tools and
utilities... fileutils (O_DIRECT).

Oh yeah...it is all coming back to me now...we had server hangs during
the RMAN archive log backups and "deletes" too!  That was more than
enough to lead us off OCFS for arch logs.

Again, I cannot find the actual Oracle-supplied reference, but from
personal experience it would take a lot to get me to use OCFS for arch
logs, and my other client would likely never go for it (again).

From overall OCFS experience (plus personal testing) I have no doubt
that with the benefits of OCFS comes a large performance penalty
(overall, not just arch log issues)...it doesn't perform as well as
RAW...as expressed.

hth

Chris Marquez
Oracle DBA






-----Original Message-----
From: Marquez, Chris 
Sent: Monday, October 03, 2005 10:08 AM
To: 'mueller_m@xxxxxxxxxxxxx'; oracle-l@xxxxxxxxxxxxx
Subject: RE: Oracle RAC backup hardware/software recommendation?

>> What should be the conclusion?
>> Put redo logs and archived redo logs on OCFS or not?? 

I was so excited at the opportunity of using OCFS for our archive logs
and thus making my RMAN scripting that much simpler.
This didn't last 30 days.

We had a totally isolated disk for arch logs (on ocfs).  Performance was
horrible...a huge bottleneck.
Trust me, it was not the hardware (design).  As soon as we went to an
EXT3 file system, the arch log bottleneck was gone.
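
One rough way to see this kind of bottleneck for yourself is a query like
the one below against v$system_event (which wait events matter will vary
by system, so treat it only as a starting point):

    SQL> select event, total_waits, time_waited
           from v$system_event
          where event like 'log file switch%'
             or event like '%archiv%'
          order by time_waited desc;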

BTW, I have seen Oracle docs that recommend *NOT* putting arch logs on
ocfs.

hth

Chris Marquez
Oracle DBA


-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of manuela mueller
Sent: Friday, September 30, 2005 3:49 AM
To: oracle-l@xxxxxxxxxxxxx
Subject: RE: Oracle RAC backup hardware/software recommendation?

Dear all,
thanks for your contributions to this thread so far, very interesting
discussion.

One note about the recovery scenario Chris mentioned.
I totally agree with the problems you are likely to face if you run
RMAN-MML clients on more than one node (which one probably does in a RAC
environment).
Your life may be a bit easier if you can put (archived) redo log files
on OCFS.

There's a document at metalink, 'OCFS Best Practices', which deals with
files you can put on OCFS:
URL:
http://metalink.oracle.com/metalink/plsql/ml2_documents.showDocument?p_database_id=NOT&p_id=237997.1

<quote>
3. File Types Supported by OCFS


      At this time (version 1.0.9), OCFS only supports Oracle data files
      - this includes redo log files, archive log files, controlfiles
      and datafiles. OCFS also supports the Oracle Cluster Manager (OCM)
      shared quorum disk file and shared Server Configuration file (for
      svrctl). Support for shared Oracle Home installation is not
      currently supported, but expected in the latter part of 2003 (OCFS
      v2.x?).

</quote>

Unfortunately this paragraph does not cover the fragmentation issues with
OCFS.
A bit later in the same document:

<quote>
7. Defragmentation
Given the extent-based allocation of volume metadata and user file data
clusters, it is possible for disk fragmentation to occur. The following
guidelines list measures to prevent volume fragmentation:

OCFS requires contiguous space on disk for initial datafile creation
e.g. if you create a 1Gb datafile (in one command), it requires 1Gb of
contiguous space on disk. If you then extend the datafile by 100Mb, it
requires another 100Mb chunk of contiguous disk space. However, the
100Mb chunk need not fall right behind the 1Gb chunk.
- Avoid heavy, concurrent new file creation, deletion or extension,
particularly from multiple nodes to the same partition.
- Attempt to correctly size Oracle datafiles before creation (including
adding new datafiles to tablespaces), ensuring to allow for more than
adequate growth.
- Use a consistent extent/next extent size for all tablespaces in order
to prevent partition fragmentation (where datafiles are autoextensible).
- Separate data and index datafiles across separate OCFS partitions.
- Separate archive logs and redo log files across separate OCFS
partitions.
- Where possible, avoid enabling datafile autoextensibility. Statically
sized datafiles are ideal to avoid defragmentation. Autoextensibility is
acceptable as long as large next extents are configured.
- Where possible, use Recovery Manager (RMAN), particularly for
restoration - RMAN writes in o_direct mode by default.

</quote>

Another document at metalink 'RAC FAQ':
<quote>

What files can I put on Linux OCFS?
- Datafiles
- Control Files
- Redo Logs
- Archive Logs
- Shared Configuration File (OCR)
- Quorum / Voting File
- SPFILE
/Modified: 14-AUG-03    Ref #: ID-4156/
</quote>

What should be the conclusion?
Put redo logs and archived redo logs on OCFS or not??
This question was asked repeatedly 2 years ago in the metalink RMAN
forum, but not directly answered.

Have a nice day
Manuela Mueller



>But yes this is exactly what I mean.
>It hits the DBA smack in the face when one tries to
>        RMAN>restore archive log all;
>in a RAC environment where the "local" arch logs are backed up
independently on each server (instance) in the RMAN backup
session/script.  The logs belong to one database, but two MML clients.
>The RMAN-MML restore session will blow up because it is restoring the
logs for one client only at a time, but the RMAN command was for ALL
logs (from any instance...incompatible).
>I would set up each TDPO client database server to be able to "spoof"
the other RAC server at any time.
>That meant duplicate config files on each client RAC server.  So at any
time I could restore (RMAN) the backup from node A to node A, and the
backup from node B to node A...thus getting all of my arch log files in
a failure.
>
>You don't want to *learn* this in the heat of battle, but it is not
intuitive during the setup.
>Again, #1: Test, Test, Test complete and total-loss database restores.
Then and only then do "missed" issues become obvious.
>
>Thus a strong argument can be made for a cluster filesystem for arch
logs (which ocfs does not support/recommend).
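
A rough sketch of the "spoofing" setup described above, assuming a
TDPO/TSM media manager. The option-file paths and channel names are
hypothetical; the idea is that the surviving node allocates one SBT
channel per node's TDPO configuration so it can restore every node's
logs in one session:

    RMAN> run {
            allocate channel nodeA type 'sbt_tape'
              parms 'ENV=(TDPO_OPTFILE=/opt/tivoli/tsm/client/oracle/bin/tdpo_nodeA.opt)';
            allocate channel nodeB type 'sbt_tape'
              parms 'ENV=(TDPO_OPTFILE=/opt/tivoli/tsm/client/oracle/bin/tdpo_nodeB.opt)';
            restore archivelog all;
          }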

--
//www.freelists.org/webpage/oracle-l
