RE: Some Dataguard is good, lots more must be better? (specifically, when do most actual failovers really occur?)

  • From: Carel-Jan Engel <cjpengel.dbalert@xxxxxxxxx>
  • To: mwf@xxxxxxxx
  • Date: Wed, 20 Sep 2006 22:14:25 +0200

On Wed, 2006-09-20 at 05:46 -0400, Mark W. Farnham wrote:

> <snip>
> Hmm. Certainly you do fail over if primary is dead. But my experience is
> that many more failovers are scheduled for preventive maintenance on the
> normally primary box.

Yes, completely. That's why the majority of my CTs set up a local and a
remote standby. The local standby is useful for the 'maintenance'
switchover without too many side effects such as reduced bandwidth, DNS
changes for the outside world, and so on. The remote is there to secure
the data.

> I think it is healthy to fail over and back
> on a regular (but relatively infrequent) schedule when a blip of known
> duration can be tolerated. Further, this engages your entire staff in making
> routine transparent network re-routing and all the other issues to use the
> standby as production (and get back).

My best site ever was an airport, where they used to do a switchover on a
regular basis. Every last Sunday of the month they switched between the
Tower DC and the Terminal DC. 3-4 times a year they found out that some
interface had a flaw. Imagine what would have happened if the first
failover had been performed after 2 or 3 years and had turned up 6-12
flawed interfaces. BTW, Kevin, storage replication wouldn't have
prevented this in all cases. It was due to missing links for new
interfaces (the cabling was simply not there) and the like, as well as
flawed change procedures (a file was altered at just one site instead of
two). 'Business' got very confident about the setup, and a request for a
switchover to upgrade hardware or the like was granted even during the
daytime, as long as we found a 15-minute window where no plane was
departing / landing. (This was dispatching / flight information /
check-in / apron control only, no flight control.)
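
As a rough illustration of the arithmetic above (a sketch only: the
function name is made up, and it assumes flaws pile up linearly and
nobody notices them until the first switchover, which is a
simplification and not measured data):

```python
def accumulated_flaws(flaws_per_year, years_without_switchover):
    """Return (low, high) flaws piled up if none are caught before the first switchover.

    Both arguments are (low, high) tuples. Linear accumulation is an
    assumption made for illustration only.
    """
    low_rate, high_rate = flaws_per_year
    low_years, high_years = years_without_switchover
    return low_rate * low_years, high_rate * high_years


# Figures from the airport site: 3-4 flaws found per year,
# first failover only after 2 or 3 years.
low, high = accumulated_flaws((3, 4), (2, 3))
print(f"{low}-{high} flawed interfaces at the first failover")
```

With monthly switchovers the same flaws show up one at a time, when
each of them is still easy to trace back to a recent change.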

> I'm curious what others see in the field: Is fail over routine or emergency
> only? Do you have trouble getting back?

Getting back seems to be a problem only at sites that do not perform
regular switchovers (say, at least once a quarter).
For sites that practice on a regular basis, doing a switchover or even a
failover tends to become a no-brainer.

Best regards,

Carel-Jan Engel

If you think education is expensive, try ignorance. (Derek Bok)
