Re: Active Dataguard -- Cascade Standby or not

  • From: Gaja Krishna Vaidyanatha <gajav@xxxxxxxxx>
  • To: Stalin <stalinsk@xxxxxxxxx>
  • Date: Tue, 6 Mar 2012 15:21:38 -0800 (PST)

Hi Stalin,
Late last year you posed this question on the list, and I responded that 
I would share our experience on this subject as soon as we completed a customer 
PoC. Here are some of the high-level observations:

1) The most important driving factor at this customer's environment was 
AUTOMATION. We needed to ensure that at any given time, when there was a 
failure in the PRIMARY database, the HA database kicked into action instantly 
(Here I am referring to HA across data centers in a geographically dispersed 
Cloud environment). So yes, we are talking about 2 different non-RAC databases 
in 2 different data centers when we refer to PRIMARY and HA.

2) For #1 we needed to use the ADG Broker/Observer on a server independent of 
the database servers as it provided us the required independence from the 
database servers along with the necessary automation. And given that automation 
was paramount to everything, we also needed the "Fast Start Failover" feature 
of ADG. Just looking at this aspect, we could NOT set up ADG in a Cascaded 
Configuration, as the Broker currently does not support this configuration with 
the required automation "bells and whistles" that we needed. A cascaded ADG 
configuration needs to be manually managed (some level of automation can be 
achieved with scripting and job scheduling but it is tricky) and due to time 
and other constraints, we had to opt out of it.

3) Log transport was configured in SYNC mode between the PRIMARY and HA 
databases and in ASYNC mode between the PRIMARY and DR databases.

4) The next aspect, which follows from the decisions made in #1 - #3, is the 
additional overhead on the PRIMARY database server from shipping redo to 
multiple destinations. 
We took the capacity planning angle to this issue, measured the resource 
consumption during redo transport and propagation and ensured that the PRIMARY 
database was configured with enough hardware resources to guarantee the 
performance and SLA of this database even with all of the redo propagation that 
was going on to various HA and DR databases. Where relevant, we implemented 
Database Resource Management (resource profiles, consumer groups etc) to ensure 
"bread and butter" transactions/jobs were processed without any 
performance/elapsed-time blips. This ensured that dynamic workloads such as 
ad-hoc reports did not eat away resources to glory and create an artificial 
resource starvation problem.

5) The network pipe between the PRIMARY and HA data centers was "dark fiber" 
and the inter-data center latency, throughput and physical routing were as good 
as they could be. Even though light travels at a speed of 299,792,458 m/sec, that 
speed is measured in a vacuum, not when it touches physical devices such as 
routers and switches. When it comes to real-life network configurations, "dark 
fiber networks" (like other networks) have to traverse through many physical 
devices, to get from A to B. This impedes the fantastic theoretical speed of 
light. But it is still pretty darn good. The data centers in question here were 
35 km apart, implying light could travel "in a vacuum" in 0.00011674743332 secs 
(~0.12 ms) between the 2 data centers. But in reality the network latency between 
the data centers was between 1-5 ms. In this configuration, the SYNC mode 
provided "near instantaneous" log transport for the HA database, implying that 
the HA database never had to "catch up". The same is not true for the DR 
databases, but rightfully so, as their log transport is ASYNC.
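
To make the arithmetic in #5 easy to re-run for other distances, here is a small Python sketch. It is only an illustration, not something from the PoC. The refractive index of 1.468 is an assumed typical value for silica fiber, the function name is made up for this example, and the 35 km is treated as the actual cable length, which a real route would exceed.

```python
# Propagation-delay sketch for the 35 km PRIMARY <-> HA link described in #5.
# Assumptions (not from the PoC): fiber refractive index of 1.468 and a cable
# run exactly as long as the stated 35 km distance.

SPEED_OF_LIGHT_M_PER_S = 299_792_458  # speed of light in a vacuum
FIBER_REFRACTIVE_INDEX = 1.468        # assumed typical value for silica fiber


def one_way_delay_ms(distance_km, refractive_index=1.0):
    """Return the one-way propagation delay in milliseconds."""
    distance_m = distance_km * 1000
    seconds = distance_m * refractive_index / SPEED_OF_LIGHT_M_PER_S
    return seconds * 1000


distance_km = 35
vacuum_ms = one_way_delay_ms(distance_km)
fiber_ms = one_way_delay_ms(distance_km, FIBER_REFRACTIVE_INDEX)

print(f"one-way, vacuum: {vacuum_ms:.4f} ms")
print(f"one-way, fiber : {fiber_ms:.4f} ms")
# SYNC transport waits for an acknowledgement, so the round trip is the
# physical floor for each synchronous redo write.
print(f"round trip, fiber: {2 * fiber_ms:.4f} ms")
```

Under these assumptions the physical floor stays well below 1 ms, so most of the observed 1-5 ms would come from devices, routing and protocol overhead rather than from distance.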
 

Hope this provides some additional insight into this. Please let me know if you 
have any further questions.

Cheers,

Gaja

Gaja Krishna Vaidyanatha,
CEO & Founder, DBPerfMan LLC
http://www.dbperfman.com
http://www.dbcloudman.com

Phone - +1-650-743-6060
http://www.linkedin.com/in/gajakrishnavaidyanatha
Co-author: Oracle Insights: Tales of the Oak Table 
- http://www.apress.com/book/bookDisplay.html?bID14
Co-author: Oracle Performance Tuning 101 
- http://www.amazon.com/gp/reader/0072131454/ref=sib_dp_pt/102-6130796-4625766
Enabling Cloud Deployment & Management for Oracle Databases


________________________________
 From: Stalin <stalinsk@xxxxxxxxx>
To: oracle-l <oracle-l@xxxxxxxxxxxxx> 
Sent: Wednesday, November 30, 2011 5:24 PM
Subject: Active Dataguard -- Cascade Standby or not
 
We have a requirement from one of our customers to have up to 15 ReadOnly DB
sites all replicating data from the so-called primary site. Active Dataguard
seems to be a perfect fit, but I was wondering about the impact on the primary
site of replicating data to all 15 read-only Active physical standbys. Only
one standby site will be the failover target, configured in SYNC
Availability mode, with the rest in ASYNC Performance mode. Also, I was wondering
whether cascaded standbys, instead of having the primary site replicate to
all standbys, are a viable option to reduce load on the primary, with the
trade-off of additional lag on the cascaded standbys.
If anyone could share experiences or things to watch for in a similar
setup, it would be greatly appreciated.

-- 
Thanks,

Stalin
PS. 11.2.0.2 RAC (Primary), 11.2.0.2 (Standby, Single Instance), Linux


--
//www.freelists.org/webpage/oracle-l
