Re: RMAN innocent bystanders killed on linux

  • From: "Rajeev Prabhakar" <rprabha01@xxxxxxxxx>
  • To: oracle-l@xxxxxxxxxxxxx
  • Date: Thu, 28 Feb 2008 15:03:19 -0500

Hello,

Given our recent experience, I am not 100% sure this issue is confined
to 2.4 kernels.

A few weeks back we were facing instance crashes on a RAC cluster
(10.2.0.3, Linux 2.6.9-55.0.6.0.1.ELsmp). The crashes occurred only
during the RMAN backup window, and subsequent troubleshooting and
research led us to reduce the parallelism / filesperset in the RMAN
configuration. So far that has avoided the zero memory/swap scenario we
saw in some Oracle trace files, and we have had no instance crashes
during the RMAN backup window since then. Throughout those problematic
backup windows, however, OS utilities continued to show a relatively
"normal" system from a memory/swap standpoint. So, given what we have
seen, I would agree with Christo that the issue is associated with
large/heavy I/O operations and the filesystem cache.
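
For readers who want to try the same mitigation, here is a minimal
sketch of what reducing parallelism and filesperset could look like in
RMAN. The values 2 and 4 and the disk device type are illustrative
assumptions; the settings actually used on the cluster above are not
stated in this thread.

```
# Illustrative values only; tune for your own system.
# Fewer channels means fewer datafiles are read concurrently.
CONFIGURE DEVICE TYPE DISK PARALLELISM 2;

# Cap the number of datafiles multiplexed into each backup set.
BACKUP DATABASE FILESPERSET 4;
```

Lower values for both settings reduce how many files RMAN reads at
once, which should in turn reduce pressure on the filesystem cache
during the backup window.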

-Rajeev

On Thu, Feb 28, 2008 at 1:38 PM, Christo Kutrovsky
<kutrovsky.oracle@xxxxxxxxx> wrote:
> Hello,
>
> This is a known issue with 2.4 kernels. It has less to do with low
> memory than with incorrect memory accounting in the OOM module.
> It is related to large file I/O operations, which use a lot of file
> system cache.
>
> Enable DIRECTIO (filesystemio_options=directio). In the 2.4 kernel you
> get either DIRECTIO or ASYNC on ext3 (I am assuming you are using
> ext3), not both; if you use "setall", async will take precedence.
>
> Note that this will only help you with your duplicate. If you start a
> "cp", someone will still get killed. I believe there's a bugfix for
> the 2.4 kernel. Make sure you are using the latest 2.4 kernel.
>
> If you really need more info, I can try to look up the kernel that had
> this issue, and the kernel that did not.
>
>
> --
> Christo Kutrovsky
> DBA Team Lead
> The Pythian Group - www.pythian.com
> I blog at http://www.pythian.com/blogs/

