Re: ASM Instance Not Up Due to Private IP mismatch between Nodes

  • From: m.mudhalvan@xxxxxxxxxxxxxxxx
  • To: Andrew Kerber <andrew.kerber@xxxxxxxxx>
  • Date: Tue, 22 May 2012 09:44:01 +0900

Hi Andrew and Experts,
        Thanks for your response. 

        After a deeper dive, I believe this is a bug:

        Oracle Note ID 1374360.1 and Bug# 12425730

        We found that the error messages match. We are now checking with 
Oracle Support to confirm it.

        My logs contain the following entries, as described in the bug 
note.

$GRID_HOME/log/<hostname>/gipcd/gipcd.log
gipcd.l03:2012-05-19 13:38:37.212: [GIPCDMON][1101535552] 
gipcdMonitorSaveInfMetrics: inf[ 0]  eth1                 - rank   -1, 
avgms 30000000000.000000 [ 0 / 0 / 0 ]

$GRID_HOME/log/<hostname>/agent/ohasd/orarootagent_root/orarootagent_root.log

orarootagent_root.l02:2012-05-19 11:42:47.383: [ora.diskmon][1122298176] 
{0:0:2} [check] DiskmonAgent::check {
orarootagent_root.l02:2012-05-19 11:42:47.383: [ora.diskmon][1122298176] 
{0:0:2} [check] DiskmonAgent::check } - 0
orarootagent_root.l02:2012-05-19 11:42:48.586: [CLSFRAME][1146118256] TM 
[MultiThread] is changing desired thread # to 5. Current # is 4
orarootagent_root.l02:2012-05-19 11:42:48.587: [    AGFW][1111791936] 
{0:0:2} Created alert : (:CRSAGF00113:) :  Aborting the command: start for 
resource: ora.cluster_interconnect.haip 1 1
orarootagent_root.l02:2012-05-19 11:42:48.587: 
[ora.cluster_interconnect.haip][1111791936] {0:0:2} [start] 
clsn_agent::abort {
orarootagent_root.l02:2012-05-19 11:42:48.587: 
[ora.cluster_interconnect.haip][1111791936] {0:0:2} [start] abort {
orarootagent_root.l02:2012-05-19 11:42:48.587: 
[ora.cluster_interconnect.haip][1111791936] {0:0:2} [start] abort command: 
start
orarootagent_root.l02:2012-05-19 11:42:48.587: 
[ora.cluster_interconnect.haip][1111791936] {0:0:2} [start] tryActionLock 
{
orarootagent_root.l02:2012-05-19 11:42:48.587: [ USRTHRD][1109690688] 
{0:0:2} Thread:[NetHAMain]stop {
orarootagent_root.l02:2012-05-19 11:42:48.702: [ USRTHRD][1099069760] 
{0:0:2} [NetHAMain] thread stopping
orarootagent_root.l02:2012-05-19 11:42:48.702: [ USRTHRD][1099069760] 
{0:0:2} Thread:[NetHAMain]isRunning is reset to false here
orarootagent_root.l02:2012-05-19 11:42:48.703: [ USRTHRD][1109690688] 
{0:0:2} Thread:[NetHAMain]stop }
orarootagent_root.l02:2012-05-19 11:42:48.703: [ USRTHRD][1109690688] 
{0:0:2} thread cleaning up
orarootagent_root.l02:2012-05-19 11:42:48.914: 
[ora.cluster_interconnect.haip][1109690688] {0:0:2} [start] Start of HAIP 
aborted
orarootagent_root.l02:2012-05-19 11:42:48.915: [   AGENT][1109690688] 
{0:0:2} UserErrorException: Locale is 
orarootagent_root.l02:2012-05-19 11:42:48.915: 
[ora.cluster_interconnect.haip][1109690688] {0:0:2} [start] 
clsnUtils::error Exception type=2 string=
orarootagent_root.l02:2012-05-19 11:42:48.915: [    AGFW][1109690688] 
{0:0:2} sending status msg [CRS-5017: The resource action 
"ora.cluster_interconnect.haip start" encountered the following error: 
orarootagent_root.l02:2012-05-19 11:42:48.915: 
[ora.cluster_interconnect.haip][1109690688] {0:0:2} [start] 
clsn_agent::start }
orarootagent_root.l02:2012-05-19 11:42:48.915: [    AGFW][1113893184] 
{0:0:2} Agent sending reply for: 
RESOURCE_START[ora.cluster_interconnect.haip 1 1] ID 4098:333
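For reference, a quick way to cross-check what the cluster actually has registered for the interconnect (a sketch only; run as the Grid owner with $GRID_HOME set, and substitute your actual interface name for eth1):

```shell
# Show the interfaces registered with clusterware and their roles
# (the private interface should appear with the cluster_interconnect role).
$GRID_HOME/bin/oifcfg getif

# Compare the live netmask on each node; a /24-vs-/16 mismatch on the
# private interface is what forces HAIP onto 169.254.x.x addresses.
/sbin/ip -4 addr show eth1

# Dump what GPnP believes the profile says (the source of the
# "configured from GPnP" lines in the ASM alert log):
$GRID_HOME/bin/gpnptool get
```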


Thanks & Regards
Mudhalvan M.M




From:   Andrew Kerber <andrew.kerber@xxxxxxxxx>
To:     anelson77388@xxxxxxxxx
Cc:     1326914 MUDHALVAN.MUNISWAMY/AOZORABANK@AOZORABANK, 
Oracle-L@xxxxxxxxxxxxx
Date:   05/21/2012 10:10 PM
Subject:        Re: ASM Instance Not Up Due to Private IP mismatch between 
Nodes



Seems like I ran into this once before.  Try explicitly setting the 
cluster interconnect in the spfile or pfile.
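For what it's worth, the parameter being referred to here is cluster_interconnects; a minimal sketch of pinning it per ASM instance follows (the 192.168.10.x addresses and +ASM1/+ASM2 SIDs are placeholders for your actual private IPs and instance names):

```shell
# Pin the interconnect address for each ASM instance in the spfile.
# Note: a static cluster_interconnects setting bypasses HAIP, so the
# underlying netmask mismatch should still be fixed as well.
sqlplus / as sysasm <<'EOF'
alter system set cluster_interconnects='192.168.10.1' scope=spfile sid='+ASM1';
alter system set cluster_interconnects='192.168.10.2' scope=spfile sid='+ASM2';
EOF
```

A bounce of the ASM instances is needed afterward for the spfile change to take effect.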

On Mon, May 21, 2012 at 7:56 AM, Allan Nelson <anelson77388@xxxxxxxxx> 
wrote:
The 169.254 addresses are what are termed zeroconf addresses; you can
google that topic for more information if you are interested. They are
provided by a new clusterware feature in 11.2.0.2 (HAIP, judging from the
ora.cluster_interconnect.haip resource in your logs). They are being used
because your 192.168 addresses have different subnet masks: eth1 has a
subnet mask of 255.255.255.0 and eth2 has a subnet mask of 255.255.0.0.
When clusterware came up after the boot, it detected this
misconfiguration. IPs with different subnet masks can't talk to each
other, so clusterware provided addresses that could.
The RAC will run on these addresses without problems, but my
recommendation would be to fix the misconfiguration of the 192.168
addresses and restart your RAC. It is messy to leave them misconfigured,
and you seem to want your interconnect on 192.168 anyway.
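The point about mismatched masks can be seen with a little arithmetic: a host treats a peer as on-link only when (peer_ip AND my_mask) equals (my_ip AND my_mask), so two nodes using /24 and /16 masks compute different networks even for addresses in the same 192.168.1.x range (the addresses below are made up for illustration):

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  OLDIFS=$IFS; IFS=.
  set -- $1
  IFS=$OLDIFS
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Node 1: 192.168.1.10 with mask 255.255.255.0 -> network 192.168.1.0
net1=$(( $(ip_to_int 192.168.1.10) & $(ip_to_int 255.255.255.0) ))
# Node 2: 192.168.1.20 with mask 255.255.0.0   -> network 192.168.0.0
net2=$(( $(ip_to_int 192.168.1.20) & $(ip_to_int 255.255.0.0) ))

if [ "$net1" -eq "$net2" ]; then
  echo "same subnet: interconnect can come up directly"
else
  echo "subnet mismatch: clusterware falls back to 169.254.x.x"
fi
```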

Allan

On Mon, May 21, 2012 at 3:57 AM, <m.mudhalvan@xxxxxxxxxxxxxxxx> wrote:

> Gurus,
>        Good morning. We have a two-node RAC on 11g Release 2 (11.2.0.2).
>
>        Last Saturday we had maintenance that involved restarting the
> instances, including the ASM instances. There were no changes on the DB
> or ASM side.
>
>        When we tried to stop the ASM instance it failed, and it was later
> aborted internally by the cluster service stop command.
>
>        When we brought up Node 1 again, the ASM instance would not start
> and kept getting terminated. Closely checking the alert logs of both ASM
> nodes, we found that the private IP addresses did not match; they were
> fine until the restart. Since the IPs did not match, the disk groups
> would not mount, so we restarted the servers. After that, both ASM alert
> logs showed the private interconnect IP as 169.254.x.x and everything is
> working fine.
>
>        My questions to the gurus/experts are:
>
>                1. What might have caused the private interconnect IP
> segment to change, even though we specified the private interconnect
> segment as 192.168.x.x?
>
>                2. Is there any problem with both nodes running on the
> private interconnect IP segment 169.254.x.x?
>
>
> Node 1: ASM Alert Log
> Private Interface 'eth1' configured from GPnP for use as a private
> interconnect.
>  [name='eth1', type=1, ip=192.168.X.X,  net=192.168.X.0/24,
> mask=255.255.255.0, use=cluster_interconnect/6]
>
> Node 2: ASM Alert Log
> Private Interface 'eth1:1' configured from GPnP for use as a private
> interconnect.
>  [name='eth1:1', type=1, ip=169.254.X.X, net=169.254.X.X/16,
> mask=255.255.0.0, use=haip:cluster_interconnect/62]
>
> Alert Log in Databases - Occurred multiple times
>
> CRS-2302:Cannot get GPnP profile. Error CLSGPNP_NO_DAEMON (GPNPD daemon
> is not running)
>
> CRS-5019:All OCR locations are on ASM disk groups [DG_DATA01], and none
> of these disk groups are mounted.
>
>
>
> Thanks & Regards
> Mudhalvan M.M
>
> Infrastructure Management Division
> Tel. 042(319)4516 Ext. 34516
> Mobile 81-80-4890-1973
> Email m.mudhalvan@xxxxxxxxxxxxxxxx
>
>
>
> --
> //www.freelists.org/webpage/oracle-l
>
>
>







-- 
Andrew W. Kerber

'If at first you don't succeed, don't take up skydiving.'

