I think I've figured out what happened here.
Once again, the message in tmon.trc was:
TMON (ospid: 14954): terminating the instance due to error 472
Error 472 means that PMON died. This was the key piece of information; I
should have looked up the error message earlier:
oerr ora 472
00472, 00000, "PMON process terminated with error"
// *Cause: The process cleanup process died
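
For anyone who wants to automate that lookup, here is a minimal sketch that pulls the process name, OS PID and error number out of a line shaped like the one from tmon.trc above. The names Termination, parse_termination and oerr_command are made up for this illustration, and the pattern is only assumed to match lines of exactly this form; it has not been checked against other trace files.

```python
import re
from typing import NamedTuple, Optional


class Termination(NamedTuple):
    """Fields of one 'terminating the instance' trace message."""
    process: str  # background process that requested the termination
    ospid: int    # OS process ID of that background process
    error: int    # ORA error number, without the ORA- prefix


# Assumed line shape, taken from the single message quoted in this thread.
_PATTERN = re.compile(
    r"(?P<process>\w+) \(ospid: (?P<ospid>\d+)\): "
    r"terminating the instance due to error (?P<error>\d+)"
)


def parse_termination(line: str) -> Optional[Termination]:
    """Return the parsed fields, or None if the line has a different shape."""
    match = _PATTERN.search(line)
    if match is None:
        return None
    return Termination(
        match["process"], int(match["ospid"]), int(match["error"])
    )


def oerr_command(termination: Termination) -> str:
    """Build the oerr command line that looks up the error text."""
    return f"oerr ora {termination.error}"


if __name__ == "__main__":
    msg = "TMON (ospid: 14954): terminating the instance due to error 472"
    parsed = parse_termination(msg)
    print(parsed)
    if parsed is not None:
        print(oerr_command(parsed))
```

The script only builds the oerr command string; running it against an Oracle home is left to the reader.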
TMON just happened to be the first process to notice that PMON had disappeared,
so it requested the abnormal instance termination. The problem can be
reproduced by killing PMON with kill -9. Even the cleanup stack looks identical.
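
The "first one to notice" effect can be illustrated outside of Oracle. The sketch below is only an analogy: it starts a harmless stand-in child process, kills it with SIGKILL (what kill -9 sends) and then checks whether the PID still exists. How Oracle background processes actually detect a missing PMON is not covered in this thread, and the helper names is_alive and demo are invented for the example. It assumes a POSIX system.

```python
import os
import signal
import subprocess
import sys


def is_alive(pid: int) -> bool:
    """Return True if a process with this PID currently exists.

    Signal 0 delivers nothing; the kernel only performs the existence
    and permission checks.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def demo() -> tuple:
    """Kill a stand-in child with SIGKILL and report what a watcher sees."""
    # Stand-in for the monitored process; this is NOT an Oracle process.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    assert is_alive(child.pid)

    os.kill(child.pid, signal.SIGKILL)  # same signal as kill -9
    returncode = child.wait()           # reap the child so the PID is released

    return is_alive(child.pid), returncode


if __name__ == "__main__":
    alive, returncode = demo()
    print("alive after kill:", alive)
    print("return code:", returncode)
```

A process killed this way gets no chance to run any cleanup of its own, which fits the observation that some other process had to report the failure.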
In the end, it turned out that PMON was killed by a dodgy application (a
problem similar to http://nenadnoveljic.com/blog/avaloq-database-crash/).
Thanks to all who provided useful pieces of information, challenged my
reasoning and by doing so nudged me to revisit the problem over and over!
Nenad
Twitter: @NenadNoveljic
Home page: http://nenadnoveljic.com
-----Original Message-----
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On
Behalf Of Noveljic Nenad
Sent: Tuesday, 19 September 2017 20:51
To: 'Yong Huang'; oracle-l@xxxxxxxxxxxxx
Subject: RE: tmon
Hey Yong,
Thank you for your much appreciated insight into the functions on the
cleanup stack.
I've already opened an SR and will keep you posted.
Besides that, I keep running kill.d in the background to record any
interference by the application (such as in
http://nenadnoveljic.com/blog/avaloq-database-crash/) in case of a new
occurrence.
Nenad
Twitter: @NenadNoveljic
Home page: http://nenadnoveljic.com
-----Original Message-----
From: Yong Huang [mailto:yong321@xxxxxxxxx]
Sent: Tuesday, 19 September 2017 17:09
To: oracle-l@xxxxxxxxxxxxx
Cc: Noveljic Nenad
Subject: Re: tmon
Hi Nenad,
I searched for your call stack on MOS but couldn't locate a good match. There
are a few for Oracle 11gR2, but they have an additional function between ksuitm
and ksumcl, namely ksuinstalive, e.g. Bugs 16426985 and 10179554. One is for
Oracle 12.1 (Bug 18077020); it has no such intermediate function, but it has
ktsj_smco_main further down the stack. Anyway, I suggest you open an SR and
have Support take a look.
Your stack:
ksedsts: error handling, always ignore
kjzdicrshnfy: crash notification
ksuitm: some kind of timeout
ksumcl: not sure, process memory cleanup?
ksbcti: "call timeout/interrupts" according to various bug reports
ksbabs: "Background process: Action based server"
ksbrdp: "run a detached (background) process"
ksumcl may be interesting. But if it's already doing cleanup, then this stack
is not helpful.
If Oracle Support finds anything interesting, let us know. Thanks.
Yong Huang
____________________________________________________
Please consider the environment before printing this e-mail.
Important Notice
This message is intended only for the individual named. It may contain
confidential or privileged information. If you are not the named addressee you
should in particular not disseminate, distribute, modify or copy this e-mail.
Please notify the sender immediately by e-mail, if you have received this
message by mistake and delete it from your system.
E-mail transmission may not be secure or error-free as information could be
intercepted, corrupted, lost, destroyed, arrive late or incomplete. Also
processing of incoming e-mails cannot be guaranteed. All liability of the
Vontobel Group and its affiliates for any damages resulting from e-mail use is
excluded. You are advised that urgent and time sensitive messages should not be
sent by e-mail and if verification is required please request a printed version.