You might want to look at this:
http://kevinclosson.wordpress.com/2006/12/09/testing-rac-failover-be-evil-make-new-friends/

>>>The only exceptions are the Oracle binaries and the admin
>>>stuff (bdump, udump...)

- I thought you said everything :-)

>>>know... decision was made before my time - it's not Ferrari,
>>>but performance is not my concern at the moment).

Even shared-nothing clustered databases can hold up under failures when they are idle.

>>>
>>>1. FO: Single primary failure [instance abort, SIGKILL
>>>listener, crs, cssd, power off a node, interconnect NIC
>>>failure (not bonded), HBA failure - how? ]

Killing stuff == simple, not a challenge. Power off a node? Simple, not a challenge. Linux RAC without a bonded interconnect is not RAC. That is a single point of failure for the entire cluster. HBA? Pull both paths from their GBICs.

>>>4. FO: Cascading failures [?]

Yes.

>>>
>>>5: DR [srvctl stop database -d <...> -o abort and FSFO to a
>>>target standby under 2min, change the threshold and manual
>>>FO and switchback, FO to a non-target standby]

Good practice!

>>>
>>>6. Load test [basic built-in synthetic DG 5-hour load of
>>>aprox. 3 MB/s, 10 tps - verify LB (server-side TAF, no FAN)]

Thus far all your tests embody the simple cluster testing Oracle already does. This won't go much further either.

>>>
>>>7. Recovery [Loss of database, ocr (single copy and all),
>>>voting disks (single copy and all), spfile, cf (single copy
>>>and all), online redo (non-current/active, active, current -
>>>all multiplexed copies), binary tree, physical and logical
>>>corruptions (e.g. single table block corruptions with bbed), etc.

Good practice.

>>>
>>>8. Application load and stress tests (qaload?)

Good idea.

>>>
>>>
>>>Any evil thoughts (but not evil enough to kill the system
>>>completely, as I have to hand it over at the end of a week)?

Read the blog.

--
//www.freelists.org/webpage/oracle-l