Re: data guard fast start failover

From: Alex Gorbachev <ag@xxxxxxxxxxxx>
To: fairlie_r@xxxxxxxxx
Date: Tue, 20 Jan 2009 11:26:31 +1100

There are two issues. One is WebLogic-specific, as they have their own connection management with multi-pools for Data Guard (don't ask - they are working on integration with FAN and RLB but that's not available yet).

The second issue is generic, and with the introduction of Oracle Clusterware, Oracle solved it with VIPs. The problem is that when an IP is not available, the connection times out only after a while. This is why VIPs are taken over by the surviving nodes in RAC, but I don't need to explain that to you. However, a Data Guard standby does not take over VIPs when it's promoted to primary. This means that application connections to the VIPs of the old primary (now unavailable if the site is down or the hosts are down) will take a while to time out. If client-side load balancing is ON between the standby and primary address_lists (in rare cases when there is no real DR and people switch between sites regularly), then about 50% of connection requests will time out after a minute or two, or whatever your tcp_timeout setting is in the apps tier. If you configure your descriptor without the load balance option between the primary and standby address lists but only with failover, then 100% of re-connects will be delayed.
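
A minimal sketch of the 50% versus 100% reasoning above, not a model of the real Oracle Net client. It assumes two address lists, that the old primary's list is written first in the descriptor, and that every attempt against the old primary waits out the full timeout; the function name and labels are made up for illustration.

```python
import random


def fraction_delayed(load_balance: bool, trials: int = 100_000, seed: int = 0) -> float:
    """Return the fraction of new connections whose first attempt
    goes to the dead (old primary) address list."""
    rng = random.Random(seed)
    # Order as written in the hypothetical descriptor: old primary first.
    address_lists = ["old_primary", "new_primary"]
    delayed = 0
    for _ in range(trials):
        if load_balance:
            # Client-side load balancing: pick a list at random.
            first = rng.choice(address_lists)
        else:
            # Failover only: always try the lists in the order written.
            first = address_lists[0]
        if first == "old_primary":
            delayed += 1  # this attempt has to wait for the TCP timeout
    return delayed / trials


with_lb = fraction_delayed(load_balance=True)
failover_only = fraction_delayed(load_balance=False)
```

With load balancing, roughly half of the attempts land on the dead list first; with failover only, all of them do, which is why every reconnect pays the timeout.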


Fairlie, please correct what I've got wrong here.

Cheers,
Alex

On 20/01/2009, at 9:43 AM, fairlie rego wrote:

You have a connection to each node in RAC, but how do you handle connections to the standby?
Alex,
In the environment I am currently working on (two 8-node clusters in a DG config), we have both the primary and standby cluster node virtual IPs in the tnsnames.ora (16 nodes).
The application connects to RAC services which run only on the primary cluster. Upon switchover/failure the db_role_change trigger fires, which starts the services on the standby nodes. Of course it is a pain that dbms_service does not update the OCR, but let me not digress....
I am just curious as to why this may not work for you.
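
For illustration only, a sketch of how a 16-address entry of this kind might be generated. The host names, service name, port, and the choice of FAILOVER/LOAD_BALANCE settings are all assumptions; the message does not say how the actual tnsnames.ora entry is laid out.

```python
def address_list(hosts, port=1521):
    """Build one ADDRESS_LIST that load-balances across a cluster's VIPs."""
    addresses = "".join(
        f"(ADDRESS=(PROTOCOL=TCP)(HOST={host})(PORT={port}))" for host in hosts
    )
    return f"(ADDRESS_LIST=(LOAD_BALANCE=on){addresses})"


def descriptor(primary_vips, standby_vips, service):
    """Build a connect descriptor that tries the primary list, then the standby list."""
    return (
        "(DESCRIPTION=(FAILOVER=on)(LOAD_BALANCE=off)"
        + address_list(primary_vips)
        + address_list(standby_vips)
        + f"(CONNECT_DATA=(SERVICE_NAME={service})))"
    )


# Hypothetical VIP host names for two 8-node clusters.
primary = [f"prim-vip{i}" for i in range(1, 9)]
standby = [f"stby-vip{i}" for i in range(1, 9)]
entry = descriptor(primary, standby, "app_svc")
```

Because the service runs only on whichever cluster currently holds the primary role, a listener on the other cluster would refuse the connection and the client would move on to the next address; a host that is down altogether still costs the TCP timeout.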

Thanks

Fairlie Rego
Senior Oracle Consultant
http://el-caro.blogspot.com/
M: +61 402 792 405



--- On Mon, 19/1/09, Alex Gorbachev <ag@xxxxxxxxxxxx> wrote:

From: Alex Gorbachev <ag@xxxxxxxxxxxx>
Subject: Re: data guard fast start failover
To: "Mark Strickland" <strickland.mark@xxxxxxxxx>
Cc: Laimutis.Nedzinskas@xxxxxx, oracle-l@xxxxxxxxxxxxx
Received: Monday, 19 January, 2009, 9:58 AM

Thanks Mark,
What about Data Guard now? You have a connection to each node in RAC, but how do you handle connections to the standby?
On one project I'm working on now, with RAC on primary and RAC on standby, we plan to set up a multi-pool controlling underlying pools for each instance on primary *AND* standby. Theoretically, a WebLogic multi-pool with load balancing will not send transactions to the "broken" pools, but in the past we didn't have a good experience with that.
Another issue is the failover time - VIPs are not taken over by the standby on role switch and, of course, the connection timeout takes a long time, so if it's 60 seconds for you, is your OS setting for tcp_timeout 60 seconds?
Has anybody attempted to automate VIP management, integrating it with Observer and FSFO?
Cheers,
Alex

On 19/01/2009, at 9:17 AM, Mark Strickland wrote:
I'll find out more from our WebLogic SME, but we're using WebLogic multi-pools (multi-datasources?), i.e. each server running WebLogic has three connection pools -- one for each of the RAC instances. The connections do re-connect automatically after failover. We're finding that it takes 60-90 seconds for failover and reconnect. I believe that we are using WebLogic XA transactions but I'll verify.
-Mark
On Sun, Jan 18, 2009 at 1:49 PM, Alex Gorbachev <ag@xxxxxxxxxxxx> wrote:
Hi Mark,

Could you elaborate on the WebLogic config you are using for RAC?
- Is it configured using WebLogic multi-datasources?
- Do you use WebLogic XA transactions? Does the WebLogic datasource retry the transaction on reconnect?
- What are the patches you mentioned (perhaps you have a reference to the WebLogic support docs)?
Cheers,
Alex

On 17/01/2009, at 8:52 AM, Mark Strickland wrote:
We've been testing FSF with 10.2.0.2 and my co-DBA discovered a bug that can cause a split-brain to occur. I don't remember the exact circumstances, but the fix is in 10.2.0.4, which is driving us to apply that patchset. Our FSF testing with 10.2.0.4 has been going very well. If you use WebLogic, it will handle a failover but it requires a patch depending on what version you use. I've been doing new 10.2.0.4 builds with RAC and Data Guard with FSF for a new customer. No issues so far.
Mark
Seattle, WA
On Thu, Jan 15, 2009 at 11:27 PM, <Laimutis.Nedzinskas@xxxxxx> wrote:
Hi all

Is anyone using Data Guard fast-start failover?
What are your experiences?
What about split brain?
Does it interfere heavily with normal database activities?
Any other comments?

Thank you in advance,

Laimis N
