Re: Really Strange Problem

From: Andrew Kerber <andrew.kerber@xxxxxxxxx>
To: harish.kumar.kalra@xxxxxxxxx
Date: Thu, 11 Nov 2010 22:50:30 -0600

Absolutely no indication of a node eviction.  Nothing in any of the
clusterware logs (crsd.log, ocssd.log, etc.) indicates a node eviction on
either node.  They are all normal until they suddenly start back up after
an unexpected shutdown.

On Thu, Nov 11, 2010 at 9:36 PM, Harish Kumar
<harish.kumar.kalra@xxxxxxxxx> wrote:

> John,
>
> Have you checked ocssd.log and the system logfiles? Download and install
> CHM, also known as Cluster Health Monitor, and leave it running until a node
> is evicted again.
>
> Once a node has been evicted, check and analyze the logfiles collected by
> CHM. Oracle may evict a node for different reasons, such as CPU saturation,
> long I/O latencies, a misconfigured network, etc.
>
> I think once you have the logfiles in place it will be much clearer what
> the actual problem is.
>
> Regards
> Harish Kumar
> Independent Database Consultant
>
> www.oraxperts.com
>
>
>
> On Fri, Nov 12, 2010 at 1:20 PM, John Smith <john40855@xxxxxxxxx> wrote:
>
>> Oh yes, if I didn't make it clear:
>>
>> OS - OEL 5.5 x86_64
>> Clusterware:  11.1.0.7 x86_64
>> ASM - 11.1.0.7 x86_64 (running over RAW)
>> Database: 10.1.0.5 x86_64 (running)
>> Database: 10.2.0.4 x86_64 (installed, but not running at this point)
>>
>>
>> ---------- Forwarded message ----------
>> From: John Smith <john40855@xxxxxxxxx>
>> Date: Thu, Nov 11, 2010 at 8:14 PM
>> Subject: Really Strange Problem
>> To: oracle-l@xxxxxxxxxxxxx
>>
>>
>> OK, I don't know if this one is related to oracle database, OEL, or
>> something else entirely.  But here it is:
>>
>> We have Oracle Clusterware 11.1 installed and running with ASM 11.1.  We
>> also have Oracle 10.2 installed, as well as 10.1.  I have created a 10.1
>> database.  ASM is on RAW against EMC storage.  This has to be on raw because
>> the intent is to take a 10.1 32-bit database to 10.2 64-bit.  This requires a
>> stop at 10.1 64-bit along the way, and 10.1 requires ASM on raw.
>>
>> Anyway, the problem is that the servers are rebooting every 2-3 days at
>> 2:15 am, and we have not been able to figure out why.  There is nothing in
>> the ASM, clusterware, or database logs; they show everything running fine,
>> then a restart.  Nothing in /var/log/messages either; it just shows a
>> restart.  Any ideas?
>>
>>


-- 
Andrew W. Kerber

'If at first you dont succeed, dont take up skydiving.'
