If it is the OOM killer, I have found a number of tweaks you can try. Also try
the hugemem kernel or even a 64-bit OS.

2009/4/17 Tony van Lingen <tony_vanlingen@xxxxxxxxxxxxxxxxxxxxx>

> Hi Saad,
>
> I would also expect memory problems at the OS level. Did you check the
> Linux messages log (/var/log/messages)? You mention that the users were
> running pipelines - if the same box that runs the database also runs heavy
> user processes, it might have run out of memory, which will activate the
> OOM-killer in the kernel. This will try to identify the least important
> process, which usually boils down to an oracle background process that does
> not seem to do a lot, and kill it in order to resolve the Out Of Memory
> situation. There will be a trace of this in the messages file.
>
> The requirements in the installation doco should be used as a very minimal
> guideline only; you must tune them to your situation.
>
> Cheers,
> Tony
>
>
> Saad Khan wrote:
>
> I ran sysctl -A myself as root (thanks to sysadmin's short memory for not
> revoking my access), and then compared the results with the kernel
> prerequisites for the installation. All the values are above the minimum
> settings required. So I think we can rule this out as a possible cause.
>
>
> I also ran the HCVE script as per Metalink Note
> 250262.1<https://metalink.oracle.com/metalink/plsql/showdoc?db=NOT&id=250262.1&blackframe=1>
>
> No solution yet! :(
>
>
> On Thu, Apr 16, 2009 at 9:58 AM, Joey D'Antoni <jdanton1@xxxxxxxxx> wrote:
>
>> Could you have your sysadmin do a sysctl -A? I suspect some of the
>> needed kernel settings related to Oracle may not be set properly.
>>
>> ------------------------------
>> *From:* Saad Khan <saad4u@xxxxxxxxx>
>> *To:* oracle-l@xxxxxxxxxxxxx
>> *Sent:* Thursday, April 16, 2009 9:49:58 AM
>> *Subject:* Re: Process died -- no info in trace files
>>
>> Now, we have seen this error in the production box as well.
>> Earlier it was at QA.
>>
>>
>> *Process m000 died, see its trace file
>>
>> ksvcreate: Process(m000) creation failed*
>>
>>
>> Just thinking out loud, if it's something related to the OS, how can it hit
>> two different boxes at almost the same time? Can this be a bug? I'm just
>> getting stumped. Can someone plz help me?
>>
>> Now in trach=e
>> On Wed, Apr 15, 2009 at 9:30 PM, Jack van Zanen <jack@xxxxxxxxxxxx> wrote:
>>
>>> metalink doc *790397.1*
>>>
>>>
>>> has similar errors but for different processes. Could the underlying
>>> cause be the same?
>>>
>>> Cause: This is caused by lack of OS configuration, where more memory is
>>> required as OS reached the limits set.
>>>
>>>
>>> Jack
>>>
>>> 2009/4/16 Saad Khan <saad4u@xxxxxxxxx>
>>>
>>>>
>>>> Sorry, I was looking at the trace files in the bdump directory.
>>>>
>>>> When I checked the traces at udump, I found the following in some of them:
>>>> *
>>>> Process P003 is dead (pid=25576, state=3):
>>>> kxfpg1srv
>>>> could not start local P003
>>>> *** 2009-04-15 14:03:56.381
>>>> Process P003 is dead (pid=25580, state=3):
>>>> kxfpg1srv
>>>> could not start local P003
>>>> *** 2009-04-15 14:03:57.384
>>>> Process P003 is dead (pid=25582, state=3):
>>>> kxfpg1srv
>>>> could not start local P003
>>>> *** 2009-04-15 14:03:58.387
>>>> Process P003 is dead (pid=25584, state=3):
>>>> kxfpg1srv
>>>> could not start local P003
>>>> *** 2009-04-15 14:03:59.417
>>>> Process P003 is dead (pid=25586, state=3):
>>>> kxfpg1srv
>>>> could not start local P003*
>>>>
>>>>
>>>>
>>>> Does this ring a bell?
>>>>
>>>>
>>>> On Wed, Apr 15, 2009 at 3:20 PM, Stephane Faroult <
>>>> sfaroult@xxxxxxxxxxxx> wrote:
>>>>
>>>>> The wording of your post ("I really don't see anything in the trace
>>>>> file") makes me think that you are looking in the alert file or
>>>>> something similar.
>>>>> You should look for .trc files under the directory
>>>>> defined as "user_dump_dest" in your parameter files (cd ../udump from
>>>>> the directory where your alert file is located should take you to the
>>>>> right place).
>>>>>
>>>>> HTH
>>>>>
>>>>> S Faroult
>>>>>
>>>>> Saad Khan wrote:
>>>>> > Hi fellows,
>>>>> >
>>>>> > I have Oracle 10g (10.2.0.4) running on Linux with the partitioning
>>>>> > option. The users were running pipelines when I was informed that
>>>>> > they had crashed. When I checked the alert log file, I could see the
>>>>> > following error messages:
>>>>> >
>>>>> >
>>>>> > Wed Apr 15 14:03:54 2009
>>>>> > Process P003 died, see its trace file
>>>>> > Wed Apr 15 14:03:55 2009
>>>>> > Process P004 died, see its trace file
>>>>> > Wed Apr 15 14:03:56 2009
>>>>> > Process P003 died, see its trace file
>>>>> > Process P003 died, see its trace file
>>>>> > Process P003 died, see its trace file
>>>>> > Process P003 died, see its trace file
>>>>> > Wed Apr 15 14:04:02 2009
>>>>> > Process P005 died, see its trace file
>>>>> > Process P005 died, see its trace file
>>>>> >
>>>>> >
>>>>> >
>>>>> > Now, the weird thing is, I really don't see anything in the trace
>>>>> > file that could point to what caused this.
>>>>> >
>>>>> > I checked my parameters and found that the PROCESSES parameter was
>>>>> > set to a very low value (i.e. 150). Now I've increased it to 400, but
>>>>> > this is just a shot in the dark. I'm totally unsure if this could be
>>>>> > the reason.
>>>>> >
>>>>> > Can anyone please help me? It's quite urgent.
>>>>> >
>>>>> > Thanks,
>>>>> >
>>>>> >
>>>>> > Khan.
>>>>> >
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> Jack van Zanen
>>>
>>> -------------------------
>>> This e-mail and any attachments may contain confidential material for the
>>> sole use of the intended recipient.
>>> If you are not the intended recipient,
>>> please be aware that any disclosure, copying, distribution or use of this
>>> e-mail or any attachment is prohibited. If you have received this e-mail in
>>> error, please contact the sender and delete all copies.
>>> Thank you for your cooperation
>>>
>>
>>
>>
>
--
Howard A. Latham
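
For the /var/log/messages check Tony describes above, here is a minimal sketch in Python. The function name `find_oom_events` and the two search patterns are assumptions for illustration only: the exact wording the kernel logs when the OOM killer fires differs between kernel versions, so the pattern may need adjusting for a given box.

```python
import re

# Assumed patterns: kernels typically log something containing "oom-killer"
# and/or "Out of memory" when a process is killed, but the exact text varies.
OOM_PATTERN = re.compile(r"oom-killer|out of memory", re.IGNORECASE)


def find_oom_events(lines):
    """Return the log lines that look like OOM-killer activity.

    `lines` is any iterable of strings, e.g. an open file object
    for /var/log/messages.
    """
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]
```

To use it, open the messages file (usually as root) and pass the file object in, for example `with open("/var/log/messages", errors="replace") as log: hits = find_oom_events(log)`. Any hits with timestamps close to the "Process P003 died" entries in the alert log would support the OOM-killer theory; no hits would point elsewhere.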