RE: Exchange Server Redundancy

  • From: "Mulnick, Al" <Al.Mulnick@xxxxxxxxxx>
  • To: "'[ExchangeList]'" <exchangelist@xxxxxxxxxxxxx>
  • Date: Tue, 9 Mar 2004 12:02:24 -0500

Paul, there are two types of clustering: shared and shared-nothing.  While
this is really a standby server solution (a.k.a. shared-nothing in a liberal
sense), so is MCS in many instances (active/passive).
The speed of failover is usually represented by something such as (event of
failure + time to realize it failed + time to bring up the application on
the other node/send the poison packet).  Most of the time is spent realizing
there's a failure (heartbeat interval and then retry interval), and on
bringing the application up on the other node.  This is the part that
baffles me with any solution so far mentioned.  In Exchange, we have grown
accustomed to the two-phase commit database functions.  When you bring a
database down, you need a way to ensure that the data is consistent.  If you
bring it down clean, you have a reasonable expectation that it got checked
on the way down and it's consistent (although there are some checks on the
way, right?)  If you can't bring it down clean, as in a hardware failure,
you need to ensure that the data is consistent (the dreaded -1018 is what
you want to avoid :).  That's done with checkpoint files, log files, and
various routines that ensure data integrity.  
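
As a rough illustration of the failover arithmetic described above, here is a
minimal Python sketch. The parameter names and all of the numbers are
assumptions made up for the example, not values from MCS or from any
replication product. The log_replay term is likewise an assumed extra cost
for recovering a database after a dirty shutdown.

```python
from dataclasses import dataclass


@dataclass
class FailoverParams:
    # All values are in seconds and are illustrative, not measured.
    heartbeat_interval: float  # how often the standby checks the active node
    retry_interval: float      # wait between retries after a missed heartbeat
    retries: int               # missed retries before a failure is declared
    app_startup: float         # time to bring the application up on the other node
    log_replay: float = 0.0    # assumed extra recovery time after a dirty shutdown


def detection_time(p: FailoverParams) -> float:
    """Worst-case time to realize the active node has failed."""
    return p.heartbeat_interval + p.retries * p.retry_interval


def failover_time(p: FailoverParams) -> float:
    """Detection time plus the time to resume service on the other node."""
    return detection_time(p) + p.app_startup + p.log_replay


if __name__ == "__main__":
    clean = FailoverParams(heartbeat_interval=1.0, retry_interval=1.0,
                           retries=3, app_startup=20.0)
    dirty = FailoverParams(heartbeat_interval=1.0, retry_interval=1.0,
                           retries=3, app_startup=20.0, log_replay=45.0)
    print(f"detection:          {detection_time(clean):.1f} s")
    print(f"failover (clean):   {failover_time(clean):.1f} s")
    print(f"failover (dirty):   {failover_time(dirty):.1f} s")
```

With these made-up inputs, detection alone takes 4 seconds, so a claimed
six-second failover would leave very little time for starting the
application and checking the database.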
I can't see how a solution with inherent replication latency can work around
those laws of database integrity (I call them laws, but that's me) to speed
up the resumption of service.  If it can't do that, then the only thing it
can offer me is a cheaper solution for geo-clustering since I won't need the
hardware typically involved in today's solutions.  I can't see how it's
cheaper than MCS in the same datacenter.  I can see it being more risky to
my infrastructure, but again, I'm trying to learn if there is something I'm


From: paul_lemonidis@xxxxxxxxxxx [mailto:paul_lemonidis@xxxxxxxxxxx] 
Sent: Tuesday, March 09, 2004 11:50 AM
To: [ExchangeList]
Subject: [exchangelist] RE: Exchange Server Redundancy

Hi Al
There seems to be something missing here unless I am mistaken. A cluster
only has one set of shared drives. Hence why you use multiple RAID drives or
arrays with multiple controllers. Thus there are no replication issues etc.
This is also why failover can be so fast as the secondary machine/node (one
is the primary node, one the secondary as you probably already know) is
already on-line and using the same disk drives.
The software solution is not clustering as it does not use shared drives
between the two nodes. It is a co-standby server solution and this is not
the same thing at all. Instead the two nodes have independent drives that
then must be replicated in real time. This is a big overhead and hence a
poor man's solution at best. Also, of course, there may well be
replication issues as you correctly point out.
Paul Lemonidis.

----- Original Message ----- 
From: Mulnick, Al <mailto:Al.Mulnick@xxxxxxxxxx>
To: [ExchangeList] <mailto:exchangelist@xxxxxxxxxxxxx>
Sent: Tuesday, March 09, 2004 4:36 PM
Subject: [exchangelist] RE: Exchange Server Redundancy

In a roundabout way, that's what I'm trying to get to.  I realize there are
hardware solutions that do the same; they replicate writes (really they
bifurcate the write to disk) so you can have geoclustering solutions.  But
I'm trying to figure out how these bright programmers figured out a way to
protect the application data and provide a six second failover.  I'm
concerned that such a solution would be a "poor man's" cluster at best, and
a data integrity nightmare at worst.  I don't see how the fast failover
claim can work with the application nor how it is better than the MCS
solution offered by the vendor of the application (concern for the
third-party support comes into play here), but I have an open mind and if
progress has been made, I'd like to educate myself on it.
So far I don't see how the solution could be better, but I'm certainly
interested to hear.


From: paul_lemonidis@xxxxxxxxxxx [mailto:paul_lemonidis@xxxxxxxxxxx] 
Sent: Tuesday, March 09, 2004 11:21 AM
To: [ExchangeList]
Subject: [exchangelist] RE: Exchange Server Redundancy

Hi all
Sorry if I am missing something here, but since when is a pure software
solution that replicates an entire drive going to offer performance anywhere
near that of a cluster using shared drives? This seems nothing more than a
co-standby server solution like, say, Vinca. Rather than a single shared drive,
it runs huge amounts of replication between duplicate drives on duplicate
servers. I can actually see you paying more for an inferior solution from
what I have seen so far.
Hardware clustering is far more resilient if done correctly but it does come
at a price, of course. At the end of the day you get what you pay for.
Paul Lemonidis.

----- Original Message ----- 
From: Mulnick, Al <mailto:Al.Mulnick@xxxxxxxxxx>
To: [ExchangeList] <mailto:exchangelist@xxxxxxxxxxxxx>
Sent: Tuesday, March 09, 2004 3:23 PM
Subject: [exchangelist] RE: Exchange Server Redundancy

I never considered MCS to be more difficult than adding a third-party app.
Is that all it does?  How does it make the recovery so fast?  How does it
check for db consistency?  


From: Tiago de Aviz [mailto:Tiago@xxxxxxxxxxxxxxx] 
Sent: Tuesday, March 09, 2004 9:17 AM
To: [ExchangeList]
Subject: [exchangelist] RE: Exchange Server Redundancy

It is much simpler because it can be implemented in a single day, it
replicates data at the bit level, it's cheap software, and if you want,
you can use a slower machine or any other machine for redundancy.


No, while Brightstor is replicating, it doesn't know if the file is a
database or a Star Wars movie. It's all the same to it.
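
To make the consistency concern raised in this thread concrete, here is a
small Python sketch of content-agnostic block replication. It is a toy model
under stated assumptions, not a description of how Brightstor or any other
product orders or batches its writes. The two-block "database", the TOTAL
invariant, and the function names are all hypothetical.

```python
TOTAL = 100  # hypothetical invariant: the two stored balances must sum to this


def encode(value: int) -> bytes:
    """Store an integer as a 4-byte block."""
    return value.to_bytes(4, "big")


def is_consistent(blocks: list[bytes]) -> bool:
    """Application-level check that the replicator itself never performs."""
    return sum(int.from_bytes(b, "big") for b in blocks) == TOTAL


def replicate(source: list[bytes], replica: list[bytes],
              changed: list[int], limit: int) -> None:
    """Copy at most `limit` changed blocks, in order, ignoring their meaning."""
    for index in changed[:limit]:
        replica[index] = source[index]


def simulate(blocks_sent: int) -> bool:
    """Return whether the standby copy is consistent after a partial transfer."""
    primary = [encode(60), encode(40)]
    standby = list(primary)
    # One logical transaction touches two blocks: move 10 from block 0 to block 1.
    primary[0] = encode(50)
    primary[1] = encode(50)
    # The primary fails after only `blocks_sent` blocks have reached the standby.
    replicate(primary, standby, changed=[0, 1], limit=blocks_sent)
    return is_consistent(standby)


if __name__ == "__main__":
    for sent in (0, 1, 2):
        print(f"blocks sent before failure: {sent}, standby consistent: {simulate(sent)}")
```

In this toy model the standby is consistent when it received none or all of
the changed blocks, and inconsistent when the failure lands between the two.
That in-between case is the one a database engine's logs and checkpoints
exist to repair.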


Tiago de Aviz


(41) 340-2363


This message, including its attachments, is confidential and its content is
restricted to the recipient of the message. If you have received this
message in error, please return it to the sender and delete it from your
files. Any unauthorized use, replication, or dissemination of this message
or any part of it is expressly prohibited. SoftSell is not responsible for
the content or the accuracy of this information.

