SOLVED: ASM installation problems: re: Further adventures in the world of RAC

After several hours of reading blogs and Metalink notes, corresponding with
some of you, and trying to get the Oracle Support analyst to understand our
predicament, we took a leap of faith and tried out Metalink note *268481.1*.
It worked!

As I mentioned in the original post, the root issue was that ASM was
scanning all disks, and, because of multipathing, was finding some issues.
Fixing /etc/sysconfig/oracleasm solved the root problem, and fortunately the
cleanup was rather simple as well. Just a little exercise for those of us
who have never done anything with ASM or RAC. =)
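
For anyone who has not seen that file: a minimal sketch of the kind of change
involved, assuming Linux device-mapper multipath. ORACLEASM_SCANORDER and
ORACLEASM_SCANEXCLUDE are the ASMLib settings for this; the values shown are
illustrative assumptions, not the exact settings from this system.

```shell
# /etc/sysconfig/oracleasm (excerpt)
# Values below are assumptions for a device-mapper multipath setup.

# Scan device-mapper (multipath) devices first; entries are matched as
# prefixes of the device name.
ORACLEASM_SCANORDER="dm"

# Skip the underlying single-path SCSI devices (sda, sdb, ...).
ORACLEASM_SCANEXCLUDE="sd"
```

The file is plain shell variable assignments, so a change like this only takes
effect after the oracleasm service rereads it and rescans the disks.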

To resolve the problem with ASM, we used Metalink note *268481.1* as a
reference (had to adapt it for ASMLib a little):
- shut down ASM
- zero out the raw disks with dd
- reconfigure oracleasm (/etc/init.d/oracleasm configure ## as root)
- recreate ASM disks (/etc/init.d/oracleasm createdisk <label> <disk>)
- scan ASM disks on other nodes (/etc/init.d/oracleasm scandisks)
- start up one of the ASM instances, create a diskgroup (like the note says,
but use the path from oracleasm/asmlib)
- start up other ASM instance(s)
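
The steps above can be sketched as a dry-run script that only prints the
commands for one disk rather than running them, since the dd step is
destructive. The function name, the label VOL1, the device path, and the dd
block size and count are all hypothetical placeholders, not values from the
note or from this system.

```shell
#!/bin/sh
# Dry-run sketch: prints the command sequence for one disk, runs nothing.
# Arguments: $1 = ASMLib label, $2 = block device (both caller-supplied).

print_asm_rebuild_plan() {
    label=$1
    device=$2
    cat <<EOF
# 1. On every node: shut down the ASM instance.
# 2. On one node, as root: zero the old disk header. DESTROYS DATA on ${device}.
#    (bs/count are assumed values; size them for your disks.)
dd if=/dev/zero of=${device} bs=1024k count=100
# 3. On every node, as root: reconfigure ASMLib.
/etc/init.d/oracleasm configure
# 4. On one node, as root: recreate the ASM disk.
/etc/init.d/oracleasm createdisk ${label} ${device}
# 5. On the other nodes, as root: rescan and confirm the label is visible.
/etc/init.d/oracleasm scandisks
/etc/init.d/oracleasm listdisks
# 6. Start one ASM instance, create the diskgroup on 'ORCL:${label}',
#    then start the remaining ASM instance(s).
EOF
}

# Placeholder defaults; pass a real label and device to override.
print_asm_rebuild_plan "${1:-VOL1}" "${2:-/dev/mapper/mpath0p1}"
```

Printing the plan first makes it easy to review the device names before
anything is zeroed.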

We also added a parameter to the ASM pfiles (asm_diskstring='ORCL:*').

We were very happy when that worked, and now we can move on.
Thanks to everyone who responded.

On 6/27/07, Charles Schultz <sacrophyte@xxxxxxxxx> wrote:

I found Bill Wagman's thread
<http://www.freelists.org/archives/oracle-l/04-2007/msg00205.html> to be
quite interesting, especially since we are hitting a similar problem
now. I filed the SR (6395660.992) first in the hopes that someone at
Oracle Support will eventually get around to it. =)

Hardware/software:
RHEL4 (Nahant) running on 2 Dell 2950's, gigabit NIC
Stock Oracle 10.2.0.1 for clusterware and asm oracle home

CRS install went just dandy, including all of K Gopal's suggested cluvfy
steps. The ASM install was going nicely until I hit the DBCA, at which point
the ASM instance on node 2 could not start because it could not mount the
disks. After digging around and finding Vargas's blog
<http://blogs.oracle.com/AlejandroVargas/newsItems/viewFullItem$126>,
I did some homework with my sysadmin and configured /etc/sysconfig/oracleasm
to exclude the SCSI drivers from our Linux multipath (not EMC PowerPath)
scans. So, how do I get my ASM instance to mount the disks after solving
that little riddle? I am looking through K Gopal's book (and now we have
Julian Dyke's book as well); I am sure the answer is in there somewhere, but
if someone wants to give me a little hint as to what page, that would be
cool. *grin*

For those interested in nitty-gritty details, here is the current error:
SQL> ALTER DISKGROUP ALL MOUNT;
ALTER DISKGROUP ALL MOUNT
*
ERROR at line 1:
ORA-15032: not all alterations performed
ORA-15063: ASM discovered an insufficient number of disks for diskgroup "DATA"

From alert.log
NOTE: cache registered group DATA number=1 incarn=0x0e9a9f02
Wed Jun 27 17:08:20 2007
ORA-15186: ASMLIB error function = [asm_open],  error = [1],  mesg = [Operation not permitted]
Wed Jun 27 17:08:20 2007
ORA-15186: ASMLIB error function = [asm_open],  error = [1],  mesg = [Operation not permitted]
Wed Jun 27 17:08:20 2007
ERROR: no PST quorum in group 1: required 2, found 0
Wed Jun 27 17:08:20 2007
NOTE: cache dismounting group 1/0x0E9A9F02 (DATA)
NOTE: dbwr not being msg'd to dismount
ERROR: diskgroup DATA was not mounted

I have also looked at Metalink notes:
* 391136.1
* 309815.1
* 353423.1 - this one might be helpful, but since node1 appears to be ok,
I am not so sure about dropping the diskgroup
* 398622.1 - I saw a reference for this, but cannot find the note itself.

TIA. Hopefully, I am missing something obvious.


--
Charles Schultz
