Re: Moving DR site from 30 miles to 1600 miles

On Wednesday, 9 April 2008 17:39, Ravi Gaur wrote:
> Hello all,
>
> We are planning to move our DR site, which is currently about 30 miles from
> the production site, to ~1600 miles away. We currently have a 4-node RAC
> setup on our production site that houses 3 production instances (all
> 10.2.0.3 on Solaris 10). The SAN is StorageTek and we use ASM for volume
> management. In our testing, we are hitting issues in network transfer rates
> to the 1600-mile site -- a simple "scp" of a 1GB file takes about 21
> minutes. We generate archives at a rate of approx 1GB/8minutes. The network
> folks tell me that the TCP setting is a constraint here (currently set to a
> 64k window size, which the sysadmins here say is the max setting). We have
> an OC3 link that can transfer @ 150Mbps (that is what the networking team
> tells me).
>
> I've an SR open w/ Oracle and have also gone through a few Metalink notes
> that talk about optimizing the network from a Data Guard perspective. One of
> the notes I came across also talks about a cascaded standby Data Guard setup
> (a local standby pushes logs to the remote site).
>
> I'm trying to collect ideas on how others are doing it under similar
> scenarios, and whether there is something we can do to utilize the entire
> network bandwidth that we have available to us.
>

You must tune your TCP/IP network stack for the WAN (increase the TCP window 
size and so on). What's your latency to that DR site? (This is very important 
for TCP.) You could try playing with these parameters:

# bursty WAN traffic
ndd -set /dev/tcp tcp_deferred_acks_max 8
ndd -set /dev/tcp tcp_deferred_ack_interval 500

# max buffer which app can request using setsockopt(), use with care!
ndd -set /dev/tcp tcp_max_buf 83886080
ndd -set /dev/tcp tcp_cwnd_max 83886080
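
It's worth checking the current values before changing anything; a quick 
sketch (run as root):

# current values of the relevant tunables
ndd -get /dev/tcp tcp_max_buf
ndd -get /dev/tcp tcp_cwnd_max
ndd -get /dev/tcp tcp_xmit_hiwat
ndd -get /dev/tcp tcp_recv_hiwat
# window scaling (RFC 1323) - needed to get past 64k
ndd -get /dev/tcp tcp_wscale_always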

AFAIK 64k is not the limit (!!) with Solaris 10 -- that is only the ceiling of 
the 16-bit TCP window field without window scaling (RFC 1323). With window 
scaling enabled you can go much higher; the hard limit (tcp_max_buf) is 1GB.

# window size
ndd -set /dev/tcp tcp_xmit_hiwat <ws>
ndd -set /dev/tcp tcp_recv_hiwat <ws>

You can estimate the needed window size from the bandwidth-delay product:
bandwidth (bytes/s!) * round-trip latency (in secs) = <winsize>

So for 155 Mbps (OC3) and 100ms round-trip latency (you have to measure that 
yourself or ask the network/sysadmin guys!) it's:
(155*1024*1024/8) bytes/sec * 0.1 second = ~2MB (round up; for a computed 
5 MB you would set it to 8 MB, and so on)
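
Put together, a rough sketch (dr-host is a placeholder; Solaris ping syntax, 
the avg round-trip time is in the summary line):

# measure round-trip latency: 10 probes of 56 bytes each
ping -s dr-host 56 10
# for ~100ms avg RTT on OC3: (155*1024*1024/8) * 0.1 = ~2MB -> 2097152 bytes
ndd -set /dev/tcp tcp_xmit_hiwat 2097152
ndd -set /dev/tcp tcp_recv_hiwat 2097152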

Also, you can set the TCP window size per IP route using route(1M) (IMHO a 
very good way to do this). I'm not sure which systems in a RAC+DG combo do 
the sending to the other side, but you should tune both sides (in a 1->1 DG 
case that would be trivial; with RAC I think you would have to tune each 
node - you need to double-check that).
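
For the per-route idea, a hedged sketch (the DR subnet, netmask and gateway 
are placeholders; -sendpipe/-recvpipe are route(1M) modifiers):

# apply a 2MB window only for traffic towards the DR network
route add -net <dr-subnet> -netmask <netmask> <gateway> \
    -sendpipe 2097152 -recvpipe 2097152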

p.s. Be sure to check what Oracle is trying to use for the setsockopt() 
SO_SNDBUF and SO_RCVBUF. Possible ways are to use DTrace or pfiles (easier) on 
the Oracle process sending or receiving the data, depending on the side (it 
will show you the fd/socket with its params).
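
A rough sketch of the pfiles route (the process names and grep patterns are 
just illustrative; find the right pid first, e.g. ARCn/LNS on the primary):

# locate the Oracle background process shipping redo (names are examples)
ps -ef | grep ora_ | egrep 'arc|lns'
# its TCP socket entries show the SO_SNDBUF/SO_RCVBUF values
pfiles <pid> | egrep 'SNDBUF|RCVBUF|peername'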

p.s.#2 Test and benchmark before making any changes on production!

p.s.#3 If in doubt, use snoop or, better, tcpdump ;)
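
For example, a quick capture sketch (the interface and host are placeholders; 
port 1521 assumes the default listener port):

# capture the DG traffic to a file, then inspect windows/retransmits offline
snoop -d bge0 -o /tmp/dg.cap host <dr-host> and port 1521
snoop -i /tmp/dg.cap -V | more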

-- 
Jakub Wartak
http://vnull.pcnet.com.pl