For those of you running ASM on RHEL4, you might want to make sure
the scan order is set up properly in /etc/sysconfig/oracleasm.
I got called last week about a node being down on a 2-node RAC SE
cluster with ASM.
Node 1 had crashed and node 2 was still running.
The error in the logs said it had a corrupt control file. After
looking into it a little more, I saw that emergency SAN
maintenance had been done around the same time.
Suspicious, I kept looking around. The maintenance shouldn't have been
a problem because we have redundancy built in (dual HBAs, multiple
paths to the EMC array...).
ASM was up and running, but trying to start the instance generated a
bunch of errors.
Oracle said the controlfile was gone, but I didn't believe it because
node2 was working.
The SAN admin and I were talking about it, and I mentioned that we
point to the pseudo device and not the real paths.
That made me want to go and double-check. I looked at node 2 and it was
using "emcpowera1" for ASM.
I went over to node1 (the crashed node) and it was using "sdb1" (which
was a path to SPB that was replaced during maintenance).
I looked at ORACLEASM_SCANORDER in /etc/sysconfig/oracleasm and it was
not set on node1.
Node2 had it set to "emcpower sd".
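
For reference, the relevant line on node2 looked roughly like the
fragment below. This is only a sketch of that one setting; the comment
describes my understanding that entries are matched against device
name prefixes in the order given, so check it against your own ASMLib
version.

```shell
# /etc/sysconfig/oracleasm (fragment)
# Scan the PowerPath pseudo devices (emcpower*) before the plain
# SCSI paths (sd*), so ASMLib picks the multipathed device.
ORACLEASM_SCANORDER="emcpower sd"
```
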
I made the same change on node1 and rebooted it. ASM used the emcpower
device, but the cluster wouldn't come up. Oracle said it was in a
split-brain condition. I rebooted both nodes, ASM chose the emcpower
devices on both, and both instances came up - no corrupt controlfile.
Save yourself some trouble and verify it:
/etc/init.d/oracleasm listdisks
/etc/init.d/oracleasm querydisk <name from previous cmd>
ls -l /dev | grep '<major>, *<minor>'   (device numbers reported by querydisk)
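
If you have more than a couple of disks, the three commands can be
strung together. The script below is a rough sketch, not something
from the original incident: parse_majmin and match_dev are helper
names I made up, and it assumes querydisk prints the device as
"[major, minor]" (for example: Disk "VOL1" is a valid ASM disk on
device [8, 17]). Some oracleasm versions may only print that with
"querydisk -d", so adjust it to what your version outputs.

```shell
#!/bin/bash
# Sketch: show which /dev node backs each ASMLib disk.
# ASSUMPTION: querydisk output contains "[major, minor]".

# Read querydisk output on stdin, print "major minor" if found.
parse_majmin() {
    sed -n 's/.*\[\([0-9][0-9]*\), *\([0-9][0-9]*\)].*/\1 \2/p'
}

# Read "ls -l /dev" output on stdin, print the names of block
# devices whose major ($1) and minor ($2) numbers match.
match_dev() {
    awk -v maj="$1," -v min="$2" \
        '$1 ~ /^b/ && $5 == maj && $6 == min { print $NF }'
}

# Only run the real check where the oracleasm init script exists.
if [ -x /etc/init.d/oracleasm ]; then
    for disk in $(/etc/init.d/oracleasm listdisks); do
        set -- $(/etc/init.d/oracleasm querydisk "$disk" | parse_majmin)
        if [ $# -ne 2 ]; then
            echo "$disk: no device numbers in querydisk output"
            continue
        fi
        echo "$disk: $(ls -l /dev | match_dev "$1" "$2" | tr '\n' ' ')"
    done
fi
```

On a PowerPath box you want every disk to come back as an emcpower
device. If one shows up as a plain sd device, check the scan order on
that node.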