RE: Memory pressure on Cloud at customer solution with EXADATA

From: <dimensional.dba@xxxxxxxxxxx>
To: <andrew.kerber@xxxxxxxxx>, <jack@xxxxxxxxxxxx>
Date: Mon, 20 May 2019 19:33:54 -0700

If you are getting OOM messages, then the OOM killer could have kicked in, and
the VM will crash. This even happens on physical servers, crashing the physical
server OS.

From: oracle-l-bounce@xxxxxxxxxxxxx <oracle-l-bounce@xxxxxxxxxxxxx> On Behalf
Of Andrew Kerber
Sent: Monday, May 20, 2019 7:28 PM
To: jack@xxxxxxxxxxxx
Cc: Chris Taylor <christopherdtaylor1994@xxxxxxxxx>; ORACLE-L
<oracle-l@xxxxxxxxxxxxx>
Subject: Re: Memory pressure on Cloud at customer solution with EXADATA

Normally the Oracle alert logs will say something about swapping if it’s
happening.

On May 20, 2019, at 19:55, Jack van Zanen <jack@xxxxxxxxxxxx> wrote:

The whole VM crashed, and that is exactly my point as well...a bit strange to
crash the whole VM with 200G for OS and processes.

Jack van Zanen

-------------------------
This e-mail and any attachments may contain confidential material for the sole
use of the intended recipient. If you are not the intended recipient, please be
aware that any disclosure, copying, distribution or use of this e-mail or any
attachment is prohibited. If you have received this e-mail in error, please
contact the sender and delete all copies.
Thank you for your cooperation

On Tue, May 21, 2019 at 10:28 AM Chris Taylor
<christopherdtaylor1994@xxxxxxxxx> wrote:

You should be seeing OOM messages and killed processes before a node crash, I
would think. Did the whole VM crash, or did the DB crash only? If I
understood you correctly, the whole VM crashed.

That seems like an odd scenario for an OOM condition - especially with 200GB
left for OS and processes.
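
One way to check for the OOM messages and killed processes mentioned above is to scan the kernel log for the OOM killer's usual markers. The following is a minimal sketch, not a definitive tool: the function name `find_oom_events` is made up for illustration, and the log path in the usage comment (`/var/log/messages`) is an assumption that may differ on your system.

```python
import re

# Strings the Linux kernel typically logs when the OOM killer runs.
# The exact wording varies somewhat between kernel versions.
OOM_PATTERN = re.compile(
    r"invoked oom-killer|Out of memory:|Killed process \d+"
)


def find_oom_events(lines):
    """Return (line_number, text) for every line that looks OOM-related.

    `lines` is any iterable of log lines, e.g. an open file object.
    """
    events = []
    for number, line in enumerate(lines, start=1):
        if OOM_PATTERN.search(line):
            events.append((number, line.rstrip("\n")))
    return events


# Hypothetical usage (path is an assumption, adjust for your host):
#
#   with open("/var/log/messages", errors="replace") as log:
#       for number, text in find_oom_events(log):
#           print(number, text)
```

If this finds nothing in the period before the crash, that would suggest the node went down some other way than a plain OOM kill.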

Chris

On Mon, May 20, 2019, 6:54 PM Jack van Zanen <jack@xxxxxxxxxxxx> wrote:

Hi All,

Oracle 12.2.0.1

PSU Jan 2019

Kernel version: 4.1.12-124.24.3.el6uek.x86_64 #2 SMP Mon Jan 14 15:08:09 PST
2019 x86_64

We have a 2-node RAC/Exadata solution and have configured about 500GB of the
720GB of memory for Huge Pages.

Now this is over-configured at the moment, as we have decommissioned some
databases. This is our combined TEST/DR environment, and we could spin up more
containers at any time.

Last week node 1 crashed, and Oracle Support came back saying it was due to
memory pressure and that we need to configure fewer Huge Pages.

While in theory that makes sense, I am struggling to understand why a system
with 200G+ of memory not allocated to huge pages is crashing due to memory
issues. I have checked our alert logs, and the databases start using huge pages.
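
To put numbers on the split described above, the huge page reservation and the memory left outside it can be read from `/proc/meminfo`. This is a rough sketch under stated assumptions: the helper names `parse_meminfo` and `hugepage_summary` are invented for illustration, and it relies only on the standard `MemTotal`, `MemAvailable`, `HugePages_Total`, `HugePages_Free` and `Hugepagesize` fields. Note that pages in the huge page pool cannot be used for ordinary allocations or page cache even while they sit unused, so unused huge pages do not relieve pressure on the rest of the system.

```python
import re


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Sizes are in kB; the HugePages_* fields are page counts.
    """
    values = {}
    for line in text.splitlines():
        match = re.match(r"^(\S+):\s+(\d+)", line)
        if match:
            values[match.group(1)] = int(match.group(2))
    return values


def hugepage_summary(text):
    """Summarise the huge page pool versus the rest of memory, in GB."""
    info = parse_meminfo(text)
    kb_per_gb = 1024 ** 2
    pool_kb = info["HugePages_Total"] * info["Hugepagesize"]
    unused_kb = info["HugePages_Free"] * info["Hugepagesize"]
    return {
        "total_gb": info["MemTotal"] / kb_per_gb,
        "hugepages_gb": pool_kb / kb_per_gb,
        # Pool pages not currently allocated; still unusable by normal memory.
        "hugepages_unused_gb": unused_kb / kb_per_gb,
        "non_hugepage_gb": (info["MemTotal"] - pool_kb) / kb_per_gb,
        # MemAvailable is the kernel's own estimate; 0 if the field is absent.
        "mem_available_gb": info.get("MemAvailable", 0) / kb_per_gb,
    }


# Hypothetical usage on the database node:
#
#   with open("/proc/meminfo") as f:
#       for key, value in hugepage_summary(f.read()).items():
#           print(f"{key}: {value:.1f}")
```

Comparing `non_hugepage_gb` with `mem_available_gb` over time shows how much of the 200G+ is really free for the OS and processes.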

Can anyone here explain to me how this can happen? Oracle Support keeps
repeating that we should lower the configured huge pages.

Jack van Zanen
