Resend : Fwd: Is a SUSPEND really necessary with EMC SnapView

There has been earlier discussion [with me asking questions about
SnapShot/SnapCopy implementations and later also responding to questions]
about how an Oracle Hot Backup is done with SnapShot/SnapClone mechanisms.
In my organisation I have a few SnapClone implementations on Hitachi and
EMC SANs. I use BEGIN BACKUP and END BACKUP before and after the split but
do not use a SUSPEND.
Recently a colleague of mine tested an EMC SnapView SnapClone of a
production database using these steps:

on primary
    BEGIN BACKUP
    split
    END BACKUP
on secondary
    STARTUP MOUNT  {OPEN fails with Recovery Required, as expected}
    RECOVER DATABASE
    OPEN
    Run "dbv" on all datafiles

However, later, when we started querying the clone data, we found corrupt
indexes.  ANALYZE TABLE VALIDATE STRUCTURE CASCADE failed for a few tables.
That is when I came into the picture.  I found an EMC doc on 8i  [and also
another doc on 9i that the EMC engineer sent me] which specifically states why
a SUSPEND is required.  Both EMC engineers at my site categorically stated
that they use BEGIN BACKUP and END BACKUP but not a SUSPEND at other sites.
Yet the EMC docs state that a SUSPEND is required.
What have your experiences been?
{As for the "corrupt database", I have asked the DBA, SysAdmin and EMC
engineers to schedule another test, still without the SUSPEND as the EMC
engineers swear that it is not required}.
pdf[1] Page 16 "The use of ALTER SYSTEM SUSPEND is often questioned in
backup scenarios where use of different SNAP or mirror-splitting
technologies is leveraged to perform 'instantaneous', or very rapid, data
duplication. With hot backups, the physical data content of the various
Oracle files continue to change even after a tablespace has been placed into
hot backup mode. Oracle relies on the ordering sequence of how various OS
writes to the files are organized to ensure that the logical content
relationship of the files on durable media allow a correct recovery to be
performed in the event of unexpected server or storage system failures. When
the Oracle files are distributed over a number of system disk devices, a
common practice in most Oracle deployments to minimize the impact of single
device failure, and to improve general I/O performance, the different
'devices' have to be duplicated together. However, when we are starting the
SnapView sessions on the different devices, they are not started atomically.
Timing windows may exist as a result. The set of Oracle files being snapped
may appear to have lost the required I/O order sequencing. The ALTER SYSTEM
SUSPEND command suspends physical I/Os to the various Oracle database files
until ALTER SYSTEM RESUME is executed. With I/O suspended to the various
database files, there will be a temporary quiescence of OS level I/O to the
various Oracle files. During this window, the physical content of all the
Oracle files would be content-consistent. When all the required SNAP
sessions are successfully started within this window, everything should then
be working correctly."
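
The ordering that passage describes can be sketched roughly as follows. This is only an illustration, written in Python so that the statements print in sequence; it is not taken from the EMC document. The tablespace names and the split placeholder line are hypothetical, and the actual SnapView commands are deliberately left out.

```python
# Placeholder for the storage-side step; the real SnapView/clone commands are not shown.
SPLIT_PLACEHOLDER = "-- start SnapView sessions / split the clone on ALL devices here"

def split_script(tablespaces, use_suspend=True):
    """Return the ordered statements for a hot-backup split (illustrative only)."""
    # Put every tablespace into hot backup mode first.
    steps = [f"ALTER TABLESPACE {ts} BEGIN BACKUP;" for ts in tablespaces]
    if use_suspend:
        # Quiesce physical I/O so all devices are snapped in the same window.
        steps.append("ALTER SYSTEM SUSPEND;")
    steps.append(SPLIT_PLACEHOLDER)
    if use_suspend:
        # Resume I/O as soon as all sessions have started.
        steps.append("ALTER SYSTEM RESUME;")
    # Take the tablespaces out of hot backup mode.
    steps += [f"ALTER TABLESPACE {ts} END BACKUP;" for ts in tablespaces]
    return steps

if __name__ == "__main__":
    # Hypothetical tablespace names.
    for line in split_script(["USERS", "INDX"]):
        print(line)
```

Calling split_script(..., use_suspend=False) gives the sequence we currently run, with nothing to close the timing window between the individual device splits.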

I realise that my colleague, when attempting the RECOVER DATABASE, probably
used the Online Redo Logs, which were actually 'fuzzy', and that he should
have issued an ALTER SYSTEM and BACKUP CONTROLFILE after the END BACKUP and
then used the ArchiveLog and Controlfile to run an incomplete recovery.
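
A rough sketch of what that sequence might look like, again in Python only to list the statements in order. Several things here are my assumptions: that the ALTER SYSTEM in question is ARCHIVE LOG CURRENT, that the recovery is cancel-based, and the controlfile path, which is hypothetical.

```python
def incomplete_recovery_steps(controlfile_copy):
    """Return (primary, clone) statement lists for recovery from archived logs only."""
    primary = [
        # Assumption: archive the current redo log so the redo generated during
        # backup mode is available as an ArchiveLog.
        "ALTER SYSTEM ARCHIVE LOG CURRENT;",
        f"ALTER DATABASE BACKUP CONTROLFILE TO '{controlfile_copy}';",
    ]
    clone = [
        "-- copy the archived logs and the backup controlfile to the clone host",
        "STARTUP MOUNT",
        # Apply archived logs only; CANCEL after the last one instead of
        # touching the 'fuzzy' online redo logs in the clone.
        "RECOVER DATABASE USING BACKUP CONTROLFILE UNTIL CANCEL",
        # Recovery with a backup controlfile has to be opened with RESETLOGS.
        "ALTER DATABASE OPEN RESETLOGS;",
    ]
    return primary, clone

if __name__ == "__main__":
    # Hypothetical path for the controlfile backup.
    primary, clone = incomplete_recovery_steps("/backup/clone_ctl.bak")
    for line in primary + clone:
        print(line)
```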

Yet I still wonder why the EMC document states that a SUSPEND is necessary
while the EMC engineers say that they don't use a SUSPEND.  I am also setting
up an IBM ESS FlashCopy and all the IBM docs I see use the BEGIN BACKUP and
END BACKUP, not a SUSPEND  {only one uses a SUSPEND to split a PPRC

Hemant K Chitale
Oracle 9i Database Administrator Certified Professional




