CRS stuff (part 2)

  • From: "Henry Poras" <henry@xxxxxxxxxxxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Wed, 23 Nov 2005 09:54:19 -0500

Here's some more...

So a flowchart of this mess would show:

ON BOOT
If crsstart = disable, then cssrun becomes 'norun'.
All of the respawn commands in inittab still run, but get stuck in the 
startcheck loop (which will never succeed).

DISABLE RESPAWN, KILL A PROCESS (not cssd)
cssrun = 'norun'.
On respawn, startcheck will yield an exit status of 3. Nothing respawns.
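The two cases above can be sketched as a small model. This is only an illustration of the description in this post, not the real init.crs or init.cssd code. The function names are made up, the success status of 0 is an assumption, and the status 1 for a stale boottime is a placeholder I invented (only the exit status 3 for 'norun' comes from the notes above).

```python
NORUN = "norun"

def init_crs_start(crsstart, boottime):
    """Value written to cssrun at boot (hypothetical helper)."""
    # crsstart = enable  -> boottime goes into cssrun
    # crsstart = disable -> cssrun becomes 'norun'
    return boottime if crsstart == "enable" else NORUN

def startcheck(cssrun, boottime):
    """Exit status of the check that gates each respawned script."""
    if cssrun == NORUN:
        return 3          # status reported in the post: nothing respawns
    if cssrun == boottime:
        return 0          # assumed success status
    return 1              # placeholder for a stale boottime (assumption)

# Boot with autostart disabled: every respawn attempt sees 'norun'.
cssrun = init_crs_start("disable", "boot-A")
print(cssrun, startcheck(cssrun, "boot-A"))
```

With crsstart = enable the same two calls yield the boottime and a successful check, which is the normal start path.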

ON BOOT, NORMAL START
The three shell scripts called from inittab all need init.crs start to run.
This will put the boottime into cssrun. This is necessary for startcheck to
succeed. 
Now,
init.evmd run will run $CRS_HOME/bin/evmd run
init.crsd run will run $CRS_HOME/bin/crsd -1 & (for the first run after
reboot)
and also $CRS_HOME/bin/crsd run
init.cssd fatal calls init.cssd daemon &, which calls $CRS_HOME/bin/ocssd
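As a rough sketch, the parent/child relationships listed above can be written down as a lookup. The function name and the first_run_after_reboot flag are hypothetical, and binaries are shown by short name only; this mirrors my reading of the scripts, not their actual code.

```python
def startup_children(script, first_run_after_reboot=True):
    """Processes started by each script, per the notes in this post."""
    if script == "init.evmd run":
        return ["evmd run"]
    if script == "init.crsd run":
        children = ["crsd run"]
        # crsd -1 & is only launched on the first run after a reboot
        if first_run_after_reboot:
            children.insert(0, "crsd -1 &")
        return children
    if script == "init.cssd fatal":
        return ["init.cssd daemon &"]
    if script == "init.cssd daemon &":
        return ["ocssd"]
    raise ValueError("unknown script: " + script)

for name in ("init.evmd run", "init.crsd run", "init.cssd fatal"):
    print(name, "->", startup_children(name))
```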

WHAT HAPPENS WHEN PROCESSES DIE?
kill init.evmd run     evmd run remains
     init.evmd run respawns generating a new evmd run (duplicate?)
kill evmd run          init.evmd run is killed
     init.evmd run respawns generating a new evmd run

kill init.crsd run     crsd -1 & remains (& used as identifier)
                       crsd run remains
     init.crsd run respawns generating a new crsd run
     (this isn't the first run after a reboot so crsd -1 isn't duplicated)
kill crsd run          init.crsd run is killed
                       crsd -1 & remains
     init.crsd run respawns generating a new crsd run
kill crsd -1 &         init.crsd run remains
                       crsd run remains
     crsd -1 & is gone for good

kill init.cssd fatal   init.cssd daemon & remains
                       ocssd remains
     init.cssd fatal respawns which generates a new init.cssd daemon & 
                              which generates a new ocssd. These can't
                              duplicate, so the new ocssd fails, which
                              goes to reboot.
kill init.cssd daemon  ocssd remains
                       init.cssd fatal remains
     init.cssd fatal (infinite loop) detects init.cssd daemon is gone.
                     This generates a new init.cssd daemon which generates
                     a new ocssd, which fails and the server reboots.
kill ocssd             The init.cssd daemon script proceeds to reboot the server.
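To make the scenarios above easier to compare (and to check off as they get tested), here is the same information as a lookup table. The Outcome type and its field names are my own invention; the entries only restate what is claimed above and are just as unverified.

```python
from typing import NamedTuple

class Outcome(NamedTuple):
    survivors: tuple    # processes that remain after the kill
    also_killed: tuple  # processes that die as a side effect
    respawned: tuple    # processes that get started again
    reboots: bool       # does the server end up rebooting?

KILL_OUTCOMES = {
    "init.evmd run": Outcome(("evmd run",), (),
                             ("init.evmd run", "evmd run"), False),
    "evmd run": Outcome((), ("init.evmd run",),
                        ("init.evmd run", "evmd run"), False),
    "init.crsd run": Outcome(("crsd -1 &", "crsd run"), (),
                             ("init.crsd run", "crsd run"), False),
    "crsd run": Outcome(("crsd -1 &",), ("init.crsd run",),
                        ("init.crsd run", "crsd run"), False),
    "crsd -1 &": Outcome(("init.crsd run", "crsd run"), (), (), False),
    # The css cases: the new ocssd cannot coexist with the old one and fails.
    "init.cssd fatal": Outcome(("init.cssd daemon &", "ocssd"), (),
                               ("init.cssd fatal", "init.cssd daemon &", "ocssd"),
                               True),
    "init.cssd daemon &": Outcome(("ocssd", "init.cssd fatal"), (),
                                  ("init.cssd daemon &", "ocssd"), True),
    "ocssd": Outcome((), (), (), True),
}

def reboot_triggers():
    """Names of the processes whose death leads to a server reboot."""
    return sorted(name for name, o in KILL_OUTCOMES.items() if o.reboots)

print(reboot_triggers())
```

In this table only the three css entries lead to a reboot, and crsd -1 & is the only process that is never respawned.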


OPEN QUESTIONS
A lot of these scenarios still need to be tested experimentally. Are they
accurate? 
How can you exit the loop if startcheck doesn't succeed? Having startcheck
stuck in a loop is how init.crs works when run manually. Changing cssrun
allows startcheck to succeed.
Is crsd -1 & really the one process which can be terminated? What does it
do?
After a reboot initiated by termination of a css process, is the cssrun file
always 'norun', so that startcheck always fails and manual intervention is
necessary? No. After such a reboot the server should start. Even though
cssrun is 'norun' at shutdown, init.crs start only looks at crsstart on
boot. If this is configured for autostart (enable), then boottime is put
into cssrun and everything should work.
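A minimal sketch of that last point, assuming init.crs start really does ignore the old contents of cssrun. The function name and the boottime string are hypothetical, and this has not been checked against the actual script.

```python
def cssrun_after_boot(crsstart, cssrun_at_shutdown, new_boottime):
    """cssrun contents once init.crs start has run at boot."""
    # cssrun_at_shutdown is deliberately unused: per the reasoning above,
    # only crsstart is consulted when the server comes back up.
    if crsstart == "enable":
        return new_boottime
    return "norun"

# A css process died, cssrun was left at 'norun', the server rebooted.
after = cssrun_after_boot("enable", "norun", "boot-B")
print(after)  # the new boottime, so startcheck can succeed
```

If crsstart is set to disable, the same call returns 'norun' and manual intervention would still be needed.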

Henry

