Re: [SPAM] RAC Failover testing

  • From: "Anurag Verma" <anuragdba@xxxxxxxxx>
  • To: "Bobak, Mark" <Mark.Bobak@xxxxxxxxxxxxxxx>, oracle-l@xxxxxxxxxxxxx
  • Date: Fri, 25 Aug 2006 15:17:13 -0500

shutdown abort is fine.

Shutdown immediate sometimes succeeds as well. I mean the SELECT automatically
fails over to the second node and continues to run.

The confusing part is why it fails only some of the time.



On 8/25/06, Bobak, Mark <Mark.Bobak@xxxxxxxxxxxxxxx> wrote:

What happens if you do a shutdown abort instead...?


--
Mark J. Bobak
Senior Oracle Architect
ProQuest Information & Learning

Ours is the age that is proud of machines that can think and suspicious of
men who try to.  --H. Mumford Jones, 1892-1980


------------------------------
From: oracle-l-bounce@xxxxxxxxxxxxx [mailto:oracle-l-bounce@xxxxxxxxxxxxx] On Behalf Of Anurag Verma
Sent: Friday, August 25, 2006 1:17 PM
To: oracle-l@xxxxxxxxxxxxx
Subject: [SPAM] RAC Failover testing
Importance: Low



Hi,


I am currently testing Oracle 9i RAC databases. Ours is a 2-node RAC on IBM HACMP with GPFS.

The TNS entry I use on the client side is given below:

MYDB =
  (DESCRIPTION =
   (ENABLE=BROKEN)
   (FAILOVER=ON)
   (LOAD_BALANCE=ON)
   (ADDRESS_LIST =
    (ADDRESS = (PROTOCOL = TCP)(HOST=RAC1.NODE1.com)(PORT=1521))
    (ADDRESS = (PROTOCOL = TCP)(HOST=RAC2.NODE2.com)(PORT=1521)))
     (CONNECT_DATA =
      (SERVICE_NAME=FNWFI1)
      (FAILOVER_MODE=(TYPE=SELECT)(METHOD=BASIC)(RETRIES=250)(DELAY=5))
    )
  )


For failover testing, I connected to the first node and ran a long SELECT query. While the SELECT was running, I shut down the instance on the first node with the "shutdown immediate" command.

I get the following message.

======================================================================
ERROR:
ORA-01089: immediate shutdown in progress - no operations are permitted

960 rows selected.

SQL>
SQL> ex
ERROR:
ORA-25408: can not safely replay call
======================================================================
In the case above, if I do not exit and instead check V$INSTANCE, it shows that
the session was switched over to the surviving instance, even though the SELECT
was stopped.
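
For anyone reproducing this, the check described above can be sketched as follows. It is only a sketch, assuming the account can query the V$ views and that V$SESSION exposes the FAILOVER_TYPE, FAILOVER_METHOD and FAILED_OVER columns. The audsid filter is an assumption used here to pick out the current session; it was not part of the original test and does not single out one session for SYS connections.

```sql
-- Which instance is this session connected to right now?
SELECT instance_name, host_name
  FROM v$instance;

-- TAF settings the session picked up, and whether it has failed over.
-- Filtering on audsid is an assumption used to find the current session.
SELECT sid, failover_type, failover_method, failed_over
  FROM v$session
 WHERE audsid = USERENV('SESSIONID');
```

FAILED_OVER should read YES once the session has moved to the surviving instance, and FAILOVER_TYPE should read SELECT if the FAILOVER_MODE clause from the TNS entry was applied to the connection.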

Why is this happening, given that I am using the (ENABLE=BROKEN) and
(TYPE=SELECT) options?


Surprisingly, the test succeeds on some other servers with the same DB physical structure and memory parameters: the SELECT query automatically fails over to the surviving node without being halted.

Is there any operating system setting we need to configure for successful
failover?

Thanks in advance,


Best Regards, Anurag
