I find that the "Local Address" column of netstat is not reliable, and I can reproduce its odd behavior, although I still can't reproduce the Oracle problem.

[root@dcprperschdb1b oracle]# export PS1='NodeB# '    <- make prompt shorter
NodeB# netstat -anp | grep 1234    <- port is not taken
NodeB# ifconfig | grep 10.111.108.167    <- this IP is on this box
          inet addr:10.111.108.167  Bcast:10.111.108.255  Mask:255.255.255.128
NodeB# cat myserver.pl    <- my dummy server/listener will use that IP
#!/usr/bin/perl
use Socket;
$server_port = 1234;
socket(Server, PF_INET, SOCK_STREAM, getprotobyname('tcp'));
setsockopt(Server, SOL_SOCKET, SO_REUSEADDR, 1);
# bind to the SCAN VIP only, not to every address on the box
#$my_addr = sockaddr_in($server_port, INADDR_ANY);
$my_addr = sockaddr_in($server_port, inet_aton("10.111.108.167"));
bind(Server, $my_addr) or die "Can't bind to port $server_port: $!\n";
listen(Server, SOMAXCONN) or die "Can't listen on port $server_port: $!\n";
# accept connections and do nothing with them
while (accept(Client, Server)) { }
close(Server);
NodeB# ./myserver.pl &    <- run dummy listener on that IP on port 1234
[1] 21092
NodeB# su - oracle
dcprperschdb1b ~ $ srvctl config scan -i 3    <- verify 10.111.108.167 is SCAN VIP
SCAN name: scan_erschp.mdanderson.edu, Network: 1/10.111.108.128/255.255.255.128/bond0
SCAN VIP name: scan3, IP: /scan_erschp.mdanderson.edu/10.111.108.167
dcprperschdb1b ~ $ srvctl status scan -i 3    <- it runs on node b
SCAN VIP scan3 is enabled
SCAN VIP scan3 is running on node dcprperschdb1b
dcprperschdb1b ~ $ srvctl status scan_listener -i 3    <- also on node b
SCAN Listener LISTENER_SCAN3 is enabled
SCAN listener LISTENER_SCAN3 is running on node dcprperschdb1b
dcprperschdb1b ~ $ srvctl relocate scan -i 3
dcprperschdb1b ~ $ srvctl status scan -i 3    <- changes to node a
SCAN VIP scan3 is enabled
SCAN VIP scan3 is running on node dcprperschdb1a
dcprperschdb1b ~ $ srvctl status scan_listener -i 3    <- also changes to node a
SCAN Listener LISTENER_SCAN3 is enabled
SCAN listener LISTENER_SCAN3 is running on node dcprperschdb1a

The above shows that both the SCAN VIP and the
SCAN listener are relocated together even if the IP is being used by another process (whether that process is owned by root or oracle makes no difference). The following shows netstat reporting a "Local Address" that no longer exists on the box:

dcprperschdb1b ~ $ exit
logout
NodeB# ifconfig | grep 10.111.108.167    <- IP no longer on this box
NodeB# netstat -anp | grep 10.111.108.167    <- but netstat Local Address still has it!
tcp        0      0 10.111.108.167:1234     0.0.0.0:*               LISTEN      21092/perl
tcp        0      0 10.111.108.160:20909    10.111.108.167:1521     ESTABLISHED 17743/ora_pmon_ersc
NodeB#

On node a, everything is normal:

dcprperschdb1a ~ $ sudo netstat -anp | grep 10.111.108.167    <- node a runs LISTENER_SCAN3 now
[sudo] password for oracle:
tcp        0      0 10.111.108.167:1521     0.0.0.0:*               LISTEN      29943/tnslsnr
tcp        0      0 10.111.108.167:53       0.0.0.0:*               LISTEN      8328/named
tcp        0      0 10.111.108.159:60641    10.111.108.167:1521     ESTABLISHED 26129/ora_pmon_ersc
tcp        0      0 10.111.108.167:1521     10.111.108.159:60641    ESTABLISHED 29943/tnslsnr
tcp        0      0 10.111.108.167:1521     10.111.108.160:20909    ESTABLISHED 29943/tnslsnr
udp        0      0 10.111.108.167:53       0.0.0.0:*                           8328/named
dcprperschdb1a ~ $ ps -fp 29943
UID        PID  PPID  C STIME TTY          TIME CMD
oracle   29943     1  0 09:29 ?        00:00:00 /u01/app/11.2.0/grid/bin/tnslsnr LISTENER_SCAN3 -inherit

As expected, from a 3rd box, I can no longer connect to my dummy server (I was able to before the relocate):

$ telnet 10.111.108.167 1234
Trying 10.111.108.167...
telnet: connect to address 10.111.108.167: Connection refused

Yong Huang

------ Original message ------

2-node Oracle 11.2.0.1 RAC, RHEL 5.7 x86_64, kernel 2.6.18-274.7.1.el5

One of the 3 SCAN listeners listens on an IP which exists on the other node of this 2-node RAC.

C:\>nslookup scancs4.<domainname>
...
Name: scancs4.<domainname>
Addresses: 10.111.76.85 10.111.76.84 10.111.76.86

dcsrpcora4a ~ $ ifconfig | egrep '10.111.76.84|10.111.76.85|10.111.76.86'
          inet addr:10.111.76.85  Bcast:10.111.76.127  Mask:255.255.255.128
          inet addr:10.111.76.84  Bcast:10.111.76.127  Mask:255.255.255.128

dcsrpcora4b ~ $ ifconfig | egrep '10.111.76.84|10.111.76.85|10.111.76.86'
          inet addr:10.111.76.86  Bcast:10.111.76.127  Mask:255.255.255.128

The problem is that the 3 IPs, supposedly each backed by one Oracle SCAN listener, do not all have SCAN listeners listening on them. Specifically, 10.111.76.84 on node a has no listener, and on node b there *is* a SCAN listener that claims to be listening on that IP. (Note that the 4th field of `netstat -an' is "Local Address".)

dcsrpcora4b ~ $ netstat -anp 2>/dev/null | grep 10.111.76.84    <-- this IP exists on node a
tcp        0      0 10.111.76.84:1521       0.0.0.0:*               LISTEN      15130/tnslsnr
tcp        0      0 10.111.76.70:55578      10.111.76.84:1521       ESTABLISHED 12061/ora_pmon_orac
dcsrpcora4b ~ $ ps -fp 15130
UID        PID  PPID  C STIME TTY          TIME CMD
oracle   15130     1  0 Jan27 ?        00:04:40 /u01/app/11.2.0/grid/bin/tnslsnr LISTENER_SCAN3 -inherit

How can a listener process running on its own server (node b) claim to be listening on an IP which is physically located on a different server (node a)?

On node a, everything looks normal from the OS perspective, and there actually is a process, named, listening on 10.111.76.84 using port 53. (Not sure why named uses a virtual interface created by Oracle.)

[root@dcsrpcora4a ~]# netstat -anp | grep 10.111.76.84
tcp        0      0 10.111.76.84:53         0.0.0.0:*               LISTEN      5001/named
udp        0      0 10.111.76.84:53         0.0.0.0:*                           5001/named

We know a SCAN VIP can "float" or relocate between the 2 nodes. But at any given point in time, when netstat says a specific IP is local to a specific host, that IP must be configured on that host (as shown by ifconfig), not on a different host, regardless of what magic Oracle's SCAN listener software does.
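
The cross-check described here (every IP that netstat reports as a listening "Local Address" should also show up in ifconfig on the same host) can be scripted. What follows is only a minimal sketch, assuming the old net-tools output formats shown in this thread ("inet addr:..." from ifconfig, "LISTEN" in the 6th field of `netstat -an'). The function names local_ips_from_ifconfig and find_orphan_listeners are made up for illustration and are not part of any Oracle or OS tool.

```shell
#!/bin/sh
# Hypothetical helpers, not from the original post.

# Read ifconfig output on stdin and print one IPv4 address per line.
# Assumes the old net-tools "inet addr:A.B.C.D" format seen in this thread.
local_ips_from_ifconfig() {
    sed -n 's/.*inet addr:\([0-9.]*\).*/\1/p'
}

# Print the Local Address of every TCP LISTEN socket whose IP is not in
# the list of locally configured addresses.
#   $1 = file with one local IP per line
#   $2 = file holding `netstat -an` (or `netstat -anp`) output
find_orphan_listeners() {
    awk '
        # First file: remember each locally configured address.
        FILENAME == ARGV[1] { if ($1 != "") local_ip[$1] = 1; next }
        # Second file: look only at TCP sockets in LISTEN state.
        $1 ~ /^tcp/ && $6 == "LISTEN" {
            ip = $4
            sub(/:[0-9]+$/, "", ip)                   # drop the trailing :port
            if (ip == "0.0.0.0" || ip == "::") next   # wildcard binds are fine
            if (!(ip in local_ip)) print $4
        }
    ' "$1" "$2"
}

# Example use on a live box (left commented out on purpose):
#   ifconfig | local_ips_from_ifconfig > /tmp/local_ips
#   netstat -an > /tmp/netstat_out
#   find_orphan_listeners /tmp/local_ips /tmp/netstat_out
```

Run against node b's ifconfig and netstat output as quoted in this message, a check like this should flag 10.111.76.84:1521, since that address appears as a listening Local Address but is not configured on node b. Working from saved files rather than live commands makes it easy to compare snapshots taken before and after a relocate.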

Checking with srvctl:

dcsrpcora4a ~ $ srvctl status scan -i 3
SCAN VIP scan3 is enabled
SCAN VIP scan3 is running on node dcsrpcora4a
dcsrpcora4a ~ $ srvctl status scan_listener -i 3
SCAN Listener LISTENER_SCAN3 is enabled
SCAN listener LISTENER_SCAN3 is running on node dcsrpcora4b    <-- not dcsrpcora4a!

On another RAC cluster, I tried 'srvctl relocate scan' and 'srvctl relocate scan_listener'. In both cases, the SCAN VIP and the SCAN listener were relocated *together* to a different node. I could not reproduce relocating one but not the other.

I believe that to correct the problem we have now, we may just run srvctl relocate on either the VIP (...84) or the SCAN listener (LISTENER_SCAN3). But I'd like to find out what caused this situation.

Yong Huang
--
//www.freelists.org/webpage/oracle-l