Re: Process died -- no info in trace files

  • From: Howard Latham <howard.latham@xxxxxxxxx>
  • To: tony_vanlingen@xxxxxxxxxxxxxxxxxxxxx
  • Date: Sun, 19 Apr 2009 11:23:58 +0100

If it is the OOM killer, I have found a number of tweaks you can try. Also try
the hugemem kernel or even a 64-bit OS.
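
To confirm whether the OOM killer fired at all, the check of /var/log/messages suggested further down the thread can be scripted. This is only a rough sketch: the function names and the search pattern are my own assumptions, and the exact wording of the kernel's OOM messages differs between kernel versions, so adjust the pattern to whatever your log actually contains.

```python
import os
import re

# Assumed pattern: kernels typically log lines containing "oom-killer" or
# "Out of memory" when the OOM killer runs, but the wording varies by version.
OOM_PATTERN = re.compile(r"oom-killer|out of memory", re.IGNORECASE)


def find_oom_lines(lines):
    """Return the lines (newline stripped) that look like OOM-killer activity."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def scan_log(path="/var/log/messages"):
    """Scan a syslog-style file; reading it usually requires root."""
    with open(path, errors="replace") as fh:
        return find_oom_lines(fh)


if __name__ == "__main__":
    log_path = "/var/log/messages"
    if os.path.exists(log_path):
        for hit in scan_log(log_path):
            print(hit)
    else:
        print("log file not found: " + log_path)
```

If any of the hits fall around the times the alert log shows the processes dying, memory pressure at the OS level is the likely culprit.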

2009/4/17 Tony van Lingen <tony_vanlingen@xxxxxxxxxxxxxxxxxxxxx>

>  Hi Saad,
>
> I would also expect memory problems at the OS level. Did you check the
> Linux messages log (/var/log/messages)? You mention that the users were
> running pipelines - if the same box that runs the database also runs heavy
> user processes, it might have run out of memory which will activate the
> OOM-killer in the kernel. This will try and identify the least important
> process, which usually boils down to an oracle background process that does
> not seem to do a lot, and kill it in order to resolve the Out Of Memory
> situation. There will be a trace in the messages file of this.
>
> The requirements in the installation documentation should be used as a very
> minimal guideline only; you must tune them to your situation.
>
> Cheers,
> Tony
>
>
> Saad Khan wrote:
>
> I ran sysctl -A myself as root (thanks to the sysadmin's short memory in not
> revoking my access), and then compared the results with the kernel prerequisites
> for the installation.  All the values are above the minimum settings
> required. So I think we can rule this out as a possible cause.
>
>
> I also ran the HCVE script as per Metalink Note
> 250262.1 <https://metalink.oracle.com/metalink/plsql/showdoc?db=NOT&id=250262.1&blackframe=1>
>
> No solution yet! :(
>
>
> On Thu, Apr 16, 2009 at 9:58 AM, Joey D'Antoni <jdanton1@xxxxxxxxx> wrote:
>
>>  Could you have your sysadmin do a sysctl -A? I suspect some of the
>> needed kernel settings related to Oracle may not be set properly.
>>
>>  ------------------------------
>> *From:* Saad Khan <saad4u@xxxxxxxxx>
>> *To:* oracle-l@xxxxxxxxxxxxx
>> *Sent:* Thursday, April 16, 2009 9:49:58 AM
>> *Subject:* Re: Process died -- no info in trace files
>>
>> Now, we have seen this error in the production box as well. Earlier it was
>> at QA
>>
>>
>> *Process m000 died, see its trace file
>>
>> ksvcreate: Process(m000) creation failed*
>>
>>
>> Just thinking out loud: if it's something related to the OS, how can it hit
>> two different boxes at almost the same time? Can this be a bug? I'm
>> stumped. Can someone please help me?
>>
>> Now in trach=e
>> On Wed, Apr 15, 2009 at 9:30 PM, Jack van Zanen <jack@xxxxxxxxxxxx> wrote:
>>
>>> Metalink doc *790397.1* has similar errors, but for different processes.
>>> Could the underlying cause be the same?
>>>
>>> Cause: This is caused by lack of OS configuration, where more memory is
>>> required as OS reached the limits set.
>>>
>>>
>>> Jack
>>>
>>>  2009/4/16 Saad Khan <saad4u@xxxxxxxxx>
>>>
>>>>
>>>> Sorry, I was looking at the trace files in the bdump directory.
>>>>
>>>> When I checked the traces in udump, I found the following in some of them:
>>>> *
>>>> Process P003 is dead (pid=25576, state=3):
>>>> kxfpg1srv
>>>>         could not start local P003
>>>> *** 2009-04-15 14:03:56.381
>>>> Process P003 is dead (pid=25580, state=3):
>>>> kxfpg1srv
>>>>         could not start local P003
>>>> *** 2009-04-15 14:03:57.384
>>>> Process P003 is dead (pid=25582, state=3):
>>>> kxfpg1srv
>>>>         could not start local P003
>>>> *** 2009-04-15 14:03:58.387
>>>> Process P003 is dead (pid=25584, state=3):
>>>> kxfpg1srv
>>>>         could not start local P003
>>>> *** 2009-04-15 14:03:59.417
>>>> Process P003 is dead (pid=25586, state=3):
>>>> kxfpg1srv
>>>>         could not start local P003*
>>>>
>>>>
>>>>
>>>> Does this ring a bell?
>>>>
>>>>
>>>> On Wed, Apr 15, 2009 at 3:20 PM, Stephane Faroult <
>>>> sfaroult@xxxxxxxxxxxx> wrote:
>>>>
>>>>> The wording of your post ("I really dont see anything in the trace
>>>>> file") makes me think that you are looking in the alert file or
>>>>> something similar. You should look for .trc files under the directory
>>>>> defined as "user_dump_dest" in your parameter file (cd ../udump from
>>>>> the directory where your alert file is located should take you to the
>>>>> right place).
>>>>>
>>>>> HTH
>>>>>
>>>>> S Faroult
>>>>>
>>>>> Saad Khan wrote:
>>>>> > Hi fellows,
>>>>> >
>>>>> > I have Oracle 10g (10.2.0.4) running on Linux with the partitioning
>>>>> > option. The users were running pipelines when I was informed that they
>>>>> > had crashed. When I checked the alert log file, I could see the
>>>>> > following error messages:
>>>>> >
>>>>> >
>>>>> > /Wed Apr 15 14:03:54 2009
>>>>> > Process P003 died, see its trace file
>>>>> > Wed Apr 15 14:03:55 2009
>>>>> > Process P004 died, see its trace file
>>>>> > Wed Apr 15 14:03:56 2009
>>>>> > Process P003 died, see its trace file
>>>>> > Process P003 died, see its trace file
>>>>> > Process P003 died, see its trace file
>>>>> > Process P003 died, see its trace file
>>>>> > Wed Apr 15 14:04:02 2009
>>>>> > Process P005 died, see its trace file
>>>>> > Process P005 died, see its trace file
>>>>> >
>>>>> >
>>>>> >
>>>>> > /Now, the weird thing is, I really don't see anything in the trace file
>>>>> > that could point to anything that could have caused this.
>>>>> >
>>>>> > I checked my parameters and found that the PROCESSES parameter was set
>>>>> > to a very low value (i.e. 150). Now I've increased it to 400, but this
>>>>> > is just a shot in the dark. I'm totally unsure if this could be the
>>>>> > reason.
>>>>> >
>>>>> > Can anyone please help me? It's quite urgent.
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> >
>>>>> > Khan.
>>>>> >
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Jack van Zanen
>>>
>>>
>>
>>
>>
>


-- 
Howard A. Latham
