RE: DataGuard

  • From: "Borrill, Christopher" <Chris.Borrill@xxxxxx>
  • To: <oracle-l@xxxxxxxxxxxxx>
  • Date: Tue, 16 Jan 2007 14:20:55 +1300

I have been following this discussion with interest, as I know our
customer is considering a solution similar to this for a new database
and application we are about to write, but I have one question.  
 
How do you optimise an application for minimum redo generation?
 
Thank you,
Chris Borrill

________________________________

From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of Carel-Jan Engel
Sent: Tuesday, 16 January 2007 1:23 p.m.
To: Brandon.Allen@xxxxxxxxxxx
Cc: rgoulet@xxxxxxxxxx; oracle-l@xxxxxxxxxxxxx
Subject: RE: DataGuard


Brandon,

If the system is important for the business, they (the business) should
provide enough (budget for enough) bandwidth. If they can afford the
money for an extra server, with extra Oracle EE licenses, having a
proper line with enough bandwidth shouldn't be a very difficult
business case.


If you follow the suggestion to use the local standby as a buffer for
redo forwarding, be aware that an unknown amount of redo is not sent to
the DR site at any given point in time. If at that point the disaster
strikes, you will lose transactions. If the business is aware of that
risk and has made the trade-off, fine! Let them confirm that to you in
writing. Too often they are unaware, and the technicians get blamed in
the event of a failover to the DR site that loses important
transactions, just because the techies were responsible for the HA
solution. Wrong.
Management doesn't make housekeeping responsible for insuring the
building either.
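
To put a rough number on that exposure, here is a back-of-the-envelope
sketch in Python. It assumes a deliberately simple model that is not
taken from this thread: redo is generated at a known rate per interval,
the WAN link drains a fixed amount per interval, and whatever has not
been drained is the redo that would be lost if the primary site
disappeared at that moment. The function name and all figures are
hypothetical.

```python
def unshipped_redo_mb(redo_mb_per_interval, link_mb_per_interval):
    """Return the redo backlog (MB) at the end of each interval.

    Simplified model (an assumption, not Oracle behaviour): the backlog
    grows by the redo generated in the interval and shrinks by at most
    what the link can carry in that interval.
    """
    backlog = 0.0
    history = []
    for generated in redo_mb_per_interval:
        backlog = max(0.0, backlog + generated - link_mb_per_interval)
        history.append(backlog)
    return history


# Hypothetical figures: MB of redo generated per minute during a batch
# peak, with a link that can carry 60 MB per minute.
backlog = unshipped_redo_mb([30, 90, 120, 120, 40, 20], link_mb_per_interval=60)
print(backlog)       # [0.0, 30.0, 90.0, 150.0, 130.0, 90.0]
print(max(backlog))  # 150.0 MB of redo at risk at the worst moment
```

The worst-case figure is the one to show the business when asking them
to confirm the trade-off in writing.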

Consider using hardware line cards to compress the redo traffic. Use QoS
on the routers to prioritize the data sent on the port number you choose
for redo transport. People might come up with the suggestion of using
ssh tunneling with compression. IMHO: Too cumbersome for HA. You need an
extra process, it will consume CPU, it can fail, needs monitoring, etc.
James Morle used Cisco line cards at a DG site we set up and they
typically got 4:1 compression. They were approx. $1000 each. No setup,
no worries, no monitoring.
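
As a rough illustration of what such compression buys, the sketch below
estimates the link bandwidth needed to ship redo as fast as it is
generated. The 4:1 ratio is the figure quoted above; the peak redo rate,
the 25% headroom factor, and the function name are assumptions made up
for this example, so substitute your own measurements.

```python
def required_link_mbit_s(peak_redo_mb_s, compression_ratio=4.0, headroom=1.25):
    """Estimate the WAN bandwidth (Mbit/s) needed to keep up with redo.

    peak_redo_mb_s    -- peak redo generation in megabytes per second
    compression_ratio -- e.g. 4.0 for 4:1 compression, 1.0 for none
    headroom          -- safety factor for other traffic (assumed 25%)
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    compressed_mb_s = peak_redo_mb_s / compression_ratio
    return compressed_mb_s * 8 * headroom  # 8 bits per byte


# Hypothetical peak of 2 MB/s of redo:
print(required_link_mbit_s(2.0))       # 5.0 Mbit/s with 4:1 compression
print(required_link_mbit_s(2.0, 1.0))  # 20.0 Mbit/s without compression
```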

Finally: is the application optimized for minimal redo generation?


Best regards,

Carel-Jan Engel

===
If you think education is expensive, try ignorance. (Derek Bok)
===     
On Mon, 2007-01-15 at 16:48 -0700, Allen, Brandon wrote: 

        I'm not questioning Carel-Jan's recommendation at all - I think
he has more DG knowledge in his pinky than I'll ever have - but I'm just
passing on a case where a cascading setup might be appropriate/necessary: 

        

        We have been struggling to get a standard (single standby) DG
setup working for the last few months because our network connection
isn't sufficient to keep up with the rate of our redo generation and
when the transfer of archived logs falls behind far enough, it
eventually freezes the production database.  We're using ARCH to
transfer the logs and already tried upgrading to 9.2.0.8 and setting the
hidden parameter _log_archive_callout='LOCAL_FIRST=true', but we still
see this behavior.  Oracle Support's recommendation is to implement a
cascading standby where we ship the logs to a local standby first and
then go from the local to the remote so that the local standby operates
as a buffer to keep the slow network from halting our primary.  We are
considering their recommendation, but are going to try everything else we
can think of to avoid it first, which will probably include upgrading to
10.2 because supposedly this problem no longer occurs in 10g, but that's
the same thing we were told about 9.2.0.7 with the local_first=true
setting (Metalink 260040.1) and we're not very confident based on our
experience with that config. 

        

        Of course another option that we're considering is increasing
the network bandwidth to the remote destination, but we would really
like to have Data Guard configured such that it will absolutely never
impair production performance because even with the increased bandwidth,
there is always the possibility of WAN problems, someone accidentally
clogging the pipe with other large files, etc. 

        

        Carel-Jan, if you have any recommendations, we'd love to hear
them! 

        

        Thanks, 

        Brandon 

        


________________________________


        From: oracle-l-bounce@xxxxxxxxxxxxx
[mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of Carel-Jan Engel
        Sent: Monday, January 15, 2007 4:12 PM
        
        


        Stay away from cascaded redo transport as far as you can. You
simply don't want that in HA environments. As Miracleas.dk states:
'..Complexity is the enemy of availability...' (and I'd like to add: '..
but the friend of consultancy...' :-)). Imagine a switchover with cascaded
transport. The whole redo transport stack has to be reinvented. A
star configuration (primary points to both standbys) is much easier to
set up. 
        
        

