RE: EM12c incidents when restarting OMS

  • From: Garry Chen <gc92@xxxxxxxxxxx>
  • To: ORACLE-L <oracle-l@xxxxxxxxxxxxx>
  • Date: Fri, 20 Jun 2014 18:14:06 +0000

I do see the same behavior that Mike posted.  When my OMS (12.1.0.3) restarts 
we get the "Agent is unable to communicate with the OMS" critical event, which 
should be the correct reaction.  I think the solution is to start a blackout 
before the shutdown and end it after the start.  You should be able to write a 
script to do that.
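
A minimal sketch of such a script, assuming the blackout is created with EM 
CLI and that emcli and emctl are both on the PATH of the OMS owner with an 
emcli login session already established.  The blackout name, the target list 
and the 30-minute duration are hypothetical, and the create_blackout and 
stop_blackout arguments should be checked against the verb reference for your 
release.  Whether a blackout suppresses the agent-unreachable events in your 
rule set has not been verified here.  The script only prints the commands 
unless dry_run is turned off.

```python
import subprocess


def blackout_bounce_plan(blackout_name, targets, minutes=30):
    """Build the command sequence: blackout, stop OMS, start OMS, end blackout.

    targets is a list of "name:type" strings (hypothetical examples below).
    """
    return [
        ["emcli", "create_blackout",
         f"-name={blackout_name}",
         f"-add_targets={';'.join(targets)}",
         "-reason=Planned OMS restart",
         # Assumed schedule syntax: duration is [HH][:mm], so "::30" is 30 min.
         f"-schedule=duration::{minutes}"],
        ["emctl", "stop", "oms"],
        ["emctl", "start", "oms"],
        # The blackout can only be ended once the OMS is back up.
        ["emcli", "stop_blackout", f"-name={blackout_name}"],
    ]


def run(plan, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in plan:
        print(" ".join(cmd))
        if not dry_run:
            # check=True aborts the sequence on the first failing command.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical managed hosts; replace with your own targets or a group.
    hosts = ["dbhost01.example.com:host", "dbhost02.example.com:host"]
    run(blackout_bounce_plan("oms_bounce", hosts))
```

The blackout is created before the OMS goes down because EM CLI talks to the 
OMS, and it is ended last for the same reason.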

Garry

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On 
Behalf Of Brian Pardy
Sent: Friday, June 20, 2014 1:56 PM
To: Michael Schmitt; ORACLE-L
Subject: RE: EM12c incidents when restarting OMS

Very interesting.  I'm running 12.1.0.4, but I don't think the version 
difference explains why I see different behavior.

I've just tried several tests and I can't replicate this behavior.  I only 
receive alerts for targets on the OMS host.

By subscribing to every metric alert event and every incident create/change and 
every out-of-the-box rule, I received about 150 email alerts from one OMS 
bounce, but every single alert was for a target running on the OMS host or 
monitored by the central agent.  Even with all that enabled, I did not receive 
even one alert for a target on a separate managed host.  This bothers me 
because I remember having the same problem you described and now I don't know 
what fixed it.

If there's a path to solving this, it probably involves taking a look at one of 
your notifications to identify the exact incident rule triggering the alert, 
then studying the exact criteria that you have on that rule.  I wonder if 
there's a bug on your side causing too many notifications or a bug on mine 
keeping them from going out.


From: Michael Schmitt [mailto:mschmitt@xxxxxxxxxxxx]
Sent: Friday, June 20, 2014 12:13 PM
To: Brian Pardy; ORACLE-L
Subject: RE: EM12c incidents when restarting OMS

Hi Brian,

We are on 12.1.0.3 which I think is the latest version, but I could be wrong 
about that.  I had already completed steps 1 and 2.  It is step 2 that is 
firing off all the incidents for us.

The alert messages all seem to be tied to the agents (not the agent on the 
OMS).  They fire "Agent is unable to communicate with the OMS. (REASON = Agent 
is Unreachable)".  That fires for all the servers we have agents on.  Then we 
get a flood of Target incidents (agent, host, listener, instance) which we have 
set to trigger on an Agent unreachable Availability check.  Once we start the 
OMS back up, everything then clears out.

I will check that MOS note you referenced.

Thanks,
Mike


From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On 
Behalf Of Brian Pardy
Sent: Friday, June 20, 2014 10:22 AM
To: ORACLE-L
Subject: RE: EM12c incidents when restarting OMS

Hi Mike,

I used to have a similar issue.  Could you please confirm which version/release 
of EM12c you use?

My fix for this was as follows:

1) Unsubscribe from the out-of-the-box incident rules in EM12c.  The 
out-of-the-box rules (at least on EM12c R1, I have not checked recently) 
include notifications for EM12c components (WebLogic domains and servers and so 
on) that I do not wish to receive.

2) Create custom incident rules for all the targets, target types and 
incident categories for which you DO wish to be notified (e.g. Database System, 
Listener, MySQL Instance, Agent) and subscribe to these instead.

3) Configure out-of-band (OOB) monitoring for the oracle_emrep OMS and 
Repository target, per MOS note 1472854.1.

I haven't seen an actual crash in my EM12c environment for which I would have 
needed notification since R1+BP1.  Using this setup, and a script that shuts 
down my central agent along with the OMS when I need to bounce EM12c, I do not 
receive any notifications at all from a planned bounce.  I still monitor the 
repository database target just like my other non-OEM databases, but I do not 
monitor any of the internal EM12c components.  This may be more complicated if 
your environment includes other WebLogic targets (mine doesn't, so I ignore 
them all), but you can create target groups to include/exclude specific targets 
from notifications as needed.
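
A rough sketch of the kind of bounce script described above, not the actual 
script.  It assumes the central agent is stopped before the OMS and started 
again after it (the order is not stated in this message), and both emctl paths 
are hypothetical.  By default it only prints the commands.

```python
import subprocess

# Hypothetical install locations; adjust to your OMS and central agent homes.
OMS_EMCTL = "/u01/app/oracle/middleware/oms/bin/emctl"
AGENT_EMCTL = "/u01/app/oracle/agent/agent_inst/bin/emctl"


def bounce_plan(oms_emctl=OMS_EMCTL, agent_emctl=AGENT_EMCTL):
    """Return the ordered commands for a planned EM12c bounce."""
    return [
        [agent_emctl, "stop", "agent"],   # central agent goes down first
        [oms_emctl, "stop", "oms"],
        [oms_emctl, "start", "oms"],
        [agent_emctl, "start", "agent"],  # agent comes back once the OMS is up
    ]


def bounce(dry_run=True):
    """Print the plan, running each step only when dry_run is False."""
    for cmd in bounce_plan():
        print(" ".join(cmd))
        if not dry_run:
            # Stop at the first failure so the agent is not started
            # against an OMS that did not come up.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    bounce()
```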




Original message:

From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On 
Behalf Of Michael Schmitt

Hello,

Does anyone else who uses 12c Cloud Control have the issue where you get a ton 
of incidents firing off if you shut down the OMS?

We have incident notifications set up related to the availability of targets, 
and whenever we try to shut down the OMS we get a ton of pages and emails as a 
result of these.  I was thinking there has to be a way to prevent them from 
kicking off, but I haven't found a good answer yet.

Does anyone know a command or workaround for this?

Thanks in advance for the help.
Mike
