Oracle10.2.0.2/Solaris/Veritas cluster problem--caused by Veritas defrag of filesystem

  • From: "daniel koehne" <koehned@xxxxxxxxx>
  • To: oracle-l@xxxxxxxxxxxxx
  • Date: Tue, 30 Jan 2007 09:48:43 -0500

Primary DB Configuration: a 2 node RAC cluster

Solaris: 5.9 Generic_118558-25 sun4u sparc SUNW,Netra-T12
Oracle: 10.2.0.2 RAC
Veritas Cluster: Symantec/Veritas Storage Foundation for Oracle RAC
4.1 MP1 (VERITAS-4.1_p3.1:2005-10-24)

Physical Standby: a single node non-RAC cluster

Solaris: 5.9 Generic_118558-25 sun4u sparc SUNW,Netra-T12
Oracle: 10.2.0.2 single node (cluster_database = false)
Veritas Cluster: Symantec/Veritas Storage Foundation for Oracle RAC
4.1 MP1 (VERITAS-4.1_p3.1:2005-10-24)

This system has been in production since October 1, 2006, and the Unix
admins tell me that they have been defragmenting the Veritas cluster
filesystem (using fsadm) on these servers since before we went live.
Defragmentation starts every Sunday at 4am.

In January 2007 we have been unable to apply archive logs to the
physical standby database (recover standby database) while the
defragmenter is running.

Here is an extract of some diagnostic data I collected from a database
perspective:

Truss of the Oracle session executing the "recover standby database"
command shows Oracle waiting on I/O:

21328/1: kaio(AIOWAIT, 0xFFFFFFFFFFFFFFFF) Err#22 EINVAL
21328/1: kaio(AIOWAIT, 0xFFFFFFFFFFFFFFFF) Err#22 EINVAL
21328/1: kaio(AIOWAIT, 0xFFFFFFFFFFFFFFFF) Err#22 EINVAL
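When the truss output is long, a tally of the failing calls is easier to compare between a run taken during the defrag and one taken outside it. The following is only a minimal sketch: the function name and the regular expression are my own, and it assumes every failing call is printed in the same "pid/lwp: syscall(args) Err#N NAME" shape as the three lines above.

```python
import re
from collections import Counter

# Sample lines copied verbatim from the truss extract in this post.
TRUSS_SAMPLE = """\
21328/1: kaio(AIOWAIT, 0xFFFFFFFFFFFFFFFF) Err#22 EINVAL
21328/1: kaio(AIOWAIT, 0xFFFFFFFFFFFFFFFF) Err#22 EINVAL
21328/1: kaio(AIOWAIT, 0xFFFFFFFFFFFFFFFF) Err#22 EINVAL
"""

# Assumed line shape: pid/lwp: syscall(args) Err#N NAME
FAILED_CALL = re.compile(
    r"^\s*(\d+)/(\d+):\s+(\w+)\((.*?)\)\s+Err#(\d+)\s+(\w+)"
)


def tally_failed_calls(text):
    """Count failing calls by (pid, syscall, first argument, error name)."""
    counts = Counter()
    for line in text.splitlines():
        match = FAILED_CALL.match(line)
        if match is None:
            continue  # successful calls and unrelated lines are skipped
        pid, _lwp, syscall, args, _errno, errname = match.groups()
        first_arg = args.split(",")[0].strip()
        counts[(int(pid), syscall, first_arg, errname)] += 1
    return counts


if __name__ == "__main__":
    for key, count in tally_failed_calls(TRUSS_SAMPLE).items():
        print(key, count)
```

The tally only shows how often each failure repeats. It does not say whether the failure is the cause of the hang.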

ARC traces
---------------
*** 2007-01-29 10:10:08.098
Unable to get enqueue on resource CF-00000000-00000000 (ges mode req=3 held=6)
Possible local blocker ospid=16864 sid=1087 sser=1789 time_held=1198 secs (ges mode req=6 held=4)
Killing blocker (pid=16864) on resource CF-00000000-00000000
DUMP LOCAL BLOCKER: initiate state dump for KILL BLOCKER
...
waiting for 'direct path write' blocking sess=0x0 seq=1162 wait_time=0 seconds since wait started=420
file number=1a, first dba=1, block cnt=1
Dumping Session Wait History
for 'direct path read' count=1 wait_time=3
file number=1a, first dba=1, block cnt=1
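To line the trace up against the defrag window, the elapsed counters can be turned back into wall-clock times. This is a rough sketch only: it assumes that time_held=1198 secs and "seconds since wait started=420" are both measured back from the trace timestamp 2007-01-29 10:10:08.098, which the trace does not state explicitly. The variable and function names are mine.

```python
from datetime import datetime, timedelta

# Values copied from the ARC trace above.
TRACE_TIMESTAMP = "2007-01-29 10:10:08.098"
TIME_HELD_SECS = 1198          # time_held of the blocker on the CF enqueue
WAIT_STARTED_SECS_AGO = 420    # "seconds since wait started"


def back_date(timestamp, seconds_ago):
    """Return the wall-clock time `seconds_ago` seconds before `timestamp`."""
    return timestamp - timedelta(seconds=seconds_ago)


dump_time = datetime.strptime(TRACE_TIMESTAMP, "%Y-%m-%d %H:%M:%S.%f")

# Assumption: both counters are relative to the trace timestamp.
enqueue_taken = back_date(dump_time, TIME_HELD_SECS)
write_wait_began = back_date(dump_time, WAIT_STARTED_SECS_AGO)

if __name__ == "__main__":
    print("trace written     :", dump_time, dump_time.strftime("%A"))
    print("CF enqueue taken  :", enqueue_taken)
    print("write wait started:", write_wait_began)
```

Under that assumption the blocker had held the CF enqueue for just under 20 minutes when it was killed, and the 'direct path write' had been outstanding for 7 minutes. These times can then be compared with the start and end times the Unix admins have for the fsadm run.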

Has anyone else experienced these types of problems?

I am also surprised that we need to defragment the Veritas filesystems,
but apparently the Veritas best practices manual lists defragmenting
filesystems as a good thing.

Thanks
  Daniel