Re: data guard fast start failover

  • From: Alex Gorbachev <ag@xxxxxxxxxxxx>
  • To: Laimutis.Nedzinskas@xxxxxx
  • Date: Tue, 20 Jan 2009 11:18:59 +1100

True. The Observer must be highly available, and so must the standby. If any maintenance or glitch makes one of them unavailable, FSFO must be disabled to avoid a false positive in case the second failure happens while the investigation is ongoing. Many people say that's a double failure and we don't count on that, but it does happen. :)
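
The policy above can be sketched as a small check, here in Python. This is only an illustration: `fsfo_action` and its health flags are hypothetical names, and how you detect that the Observer or standby is healthy is left to your own monitoring. The only Oracle-specific parts are the DGMGRL commands `DISABLE FAST_START FAILOVER` and `ENABLE FAST_START FAILOVER`; depending on which component is unreachable, the `FORCE` variant of the disable command may be needed. Re-enabling once both components are back is an assumption of the sketch.

```python
from typing import Optional

# DGMGRL commands the operator (or a wrapper script) would issue
DISABLE_CMD = "DISABLE FAST_START FAILOVER;"
ENABLE_CMD = "ENABLE FAST_START FAILOVER;"


def fsfo_action(observer_ok: bool, standby_ok: bool, fsfo_enabled: bool) -> Optional[str]:
    """Return the DGMGRL command to run, or None if nothing needs to change.

    Hypothetical helper: the health flags are assumed to come from
    your own monitoring.
    """
    both_ok = observer_ok and standby_ok
    if fsfo_enabled and not both_ok:
        # One component is already gone; a second failure would leave the
        # primary isolated and it would shut itself down.
        return DISABLE_CMD
    if not fsfo_enabled and both_ok:
        # Assumption: turn FSFO back on once both components are healthy.
        return ENABLE_CMD
    return None


if __name__ == "__main__":
    # Observer down for maintenance while FSFO is still enabled
    print(fsfo_action(observer_ok=False, standby_ok=True, fsfo_enabled=True))
```

The point of keeping this as a pure decision function is that it can be tested without touching a database; actually running the command through `dgmgrl` is a separate step.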


Cheers,
Alex

On 19/01/2009, at 11:43 PM, Laimutis.Nedzinskas@xxxxxx wrote:

from Alex Gorbachev <ag@xxxxxxxxxxxx>
Observer and standby couldn't distinguish a network
connectivity issue on the primary from a primary host failure, power
outage or anything else.


Exactly! But what I did was simple: I killed both the observer and the standby.
Then the primary killed itself, which is part of split-brain prevention. But in this case it means that one must make sure that at least one standby or observer (or a second observer) is always alive, running and accessible
to the primary.
Otherwise one can get an unexpected shutdown of the primary.

Brgds, Laimis

From: Alex Gorbachev <ag@xxxxxxxxxxxx>
Date: 2009.01.19 13:33
To: Laimutis.Nedzinskas@xxxxxx
cc: ORACLE-L Freelists <oracle-l@xxxxxxxxxxxxx>
Subject: Re: data guard fast start failover

Laimutis,

As Ian explained, if your current primary loses connectivity to both
the Observer and the standby, it *must* shut down to avoid a situation with two
primary databases active and diverging. Indeed, if the Observer and
standby lose connectivity to the primary (which is otherwise OK and
operational), then the Observer and standby will decide to
promote the standby database to the primary role: as far as they can
tell, the primary database has failed. The Observer and standby couldn't
distinguish a network connectivity issue on the primary from a primary
host failure, power outage or anything else.

With data integrity above all, the primary must be stopped to avoid
split-brain, i.e. both databases in the primary role. What you call
dangerous should be a relatively routine failover, assuming DR is designed
and implemented properly and the standby can actually be promoted to
primary safely and handle the traffic.
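
A toy model of the rule described above, in Python, may make the symmetry clearer. It is a simplified sketch, not Oracle's implementation: the `Links` class and both function names are hypothetical, and timeouts and thresholds are ignored. It only encodes the two statements above: the primary stops when it has lost both the Observer and the standby, and the standby is promoted when Observer and standby agree that the primary is gone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Links:
    """Which network links are currently up (hypothetical model)."""
    primary_observer: bool
    primary_standby: bool
    observer_standby: bool


def primary_must_stop(links: Links) -> bool:
    # Once the primary has lost both the Observer and the standby, it
    # cannot know whether a failover is under way, so it must stop.
    return not links.primary_observer and not links.primary_standby


def standby_may_be_promoted(links: Links) -> bool:
    # Observer and standby both see the primary as gone and can still
    # talk to each other, so they agree to fail over.
    return (
        not links.primary_observer
        and not links.primary_standby
        and links.observer_standby
    )


if __name__ == "__main__":
    # Network partition around the primary only
    isolated_primary = Links(False, False, True)
    print(primary_must_stop(isolated_primary), standby_may_be_promoted(isolated_primary))

    # Observer and standby both dead: nobody is left to fail over
    both_killed = Links(False, False, False)
    print(primary_must_stop(both_killed), standby_may_be_promoted(both_killed))
```

In this model the standby is never promoted while the primary is still allowed to run, which is the split-brain guarantee. The second case corresponds to killing both the Observer and the standby: no failover happens, yet the primary still stops because it cannot tell the two situations apart.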

Cheers,
Alex

On 19/01/2009, at 6:40 PM, Laimutis.Nedzinskas@xxxxxx wrote:

In my case the primary killed itself because it lost communication with
BOTH the observer and the standby. The primary then assumes that, since
FSF is enabled, the observer and/or standby will attempt a failover.
It makes sense, but it's kind of dangerous.



