Re: creative use of storage snapshots.

  • From: Kyle Hailey <kylelf@xxxxxxxxx>
  • To: niall.litchfield@xxxxxxxxx
  • Date: Mon, 20 Dec 2010 16:04:34 -0800

>>  it sounds to me like one of those great ideas
Yes, it's definitely a great idea whose time has come. This idea is being
harnessed by more and more companies across different industries - we've
implemented this at Shopping.com (part of eBay), Tivo (consumer tech),
Corporate Express (retail), Tagged (social networking), Clearwire (telecom),
KLA Tencor (hardware), and Holland America (travel). With copy-on-write file
systems and the right kind of code to manage the clones (virtual databases),
virtual copies of a database can be made in minutes regardless of the size
of the data, and all the data blocks are shared except those blocks that
have changes made to them. I can make multiple virtual copies of a TB-size
database in minutes and basically only use the amount of space it takes for
each to have its own private redo log files. After the virtual databases are
provisioned and running, each keeps private copies of any data blocks that
it modifies.
Virtualizing databases saves tons of disk space and reduces the time to make
copies from hours, sometimes days, to just minutes.
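
To make the block sharing concrete, here's a toy Python sketch of the
copy-on-write idea (purely illustrative - no product implements it this
way, and the block contents are made up):

    class CowClone:
        def __init__(self, base):
            self.base = base     # shared, read-only base image
            self.private = {}    # only this clone's modified blocks

        def read(self, block_no):
            # reads fall through to the shared base unless modified
            return self.private.get(block_no, self.base[block_no])

        def write(self, block_no, data):
            self.private[block_no] = data  # copy on write

    base = {0: b"system", 1: b"emp rows", 2: b"dept rows"}
    dev1, dev2 = CowClone(base), CowClone(base)
    dev1.write(1, b"emp rows, edited")
    assert dev1.read(1) == b"emp rows, edited"  # private change
    assert dev2.read(1) == b"emp rows"          # still shared
    # dev1's extra space is one block, not a full copy of the database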

I cringe when I talk to customers who don't have development or QA copies of
production. Usually developers work on subsets of production, and query
performance on these subsets often doesn't reflect the performance in
production. Even if development is a full copy of production, there is often
just one copy shared by all the developers and QA. Any change to this copy
has to go through a number of hoops and signoffs, which slows down the
development process and demotivates developers. Imagine instead a copy of
production per developer, where each developer can make and test as many
changes as fast as they want on an exact copy of production - it's awesome
and inspires the developers. Once the developers are confident in their
modifications, they can submit the final change request for review.

Delphix makes all of this easy, with no changes on production; creating a
virtual database takes just a few clicks of a mouse in the GUI.

It can, of course, be done by hand, as a few people have explained. Oracle
explained how to do it in a white paper:
http://www.oracle.com/technetwork/database/features/availability/maa-db-clone-szfssa-172997.pdf

Three ways that come to mind to create these virtual copies:

1. snapshotting the production file system with something like EMC's
SnapView, NetApp's FlexClone, or ZFS (see the sketch after this list)
2. doing the same thing on a Data Guard standby
3. using Delphix
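
To give a flavor of option 1 with ZFS, here's a minimal Python sketch
wrapping the two ZFS commands involved (the pool and dataset names are
hypothetical, and in practice you'd quiesce the database or put it in hot
backup mode before taking the snapshot):

    import subprocess

    def run(cmd):
        # run a shell command, raising on failure
        subprocess.run(cmd, shell=True, check=True)

    # 1. take an atomic, point-in-time snapshot of the datafile filesystem
    run("zfs snapshot pool/proddata@dev1")

    # 2. create a writable clone; it shares every block with the snapshot
    #    until a block is modified (copy on write)
    run("zfs clone pool/proddata@dev1 pool/devdata1")

    # 3. mount pool/devdata1 on the dev host, give the instance its own
    #    control files and redo logs, and recover/open the clone database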

Snapshots on production could be problematic if production, QA, and Dev all
share the same file system, and hence the same I/O. A better approach would
be copying the snapshot clones off to another filesystem, or mirroring the
filesystem and breaking the mirrors.

With Data Guard, the solution as proposed by Oracle is to have the
production database, then a secondary Data Guard standby, then copy the
standby database onto Oracle 7000-series storage with ZFS, and then use ZFS
snapshotting to provision databases. Of course, requiring a special storage
device and an extra Oracle license as well as a Data Guard license is not
cheap.

For Delphix (where I work), we don't modify production at all. We copy the
production database onto a Delphix appliance using an optimized RMAN API;
the appliance typically gets a compression ratio of 2:1 or better (i.e.,
the copy takes 50% of the original size or less). The Delphix box then
exports the database files via NFS so they can be mounted on other machines.
Each machine gets a private view of the datafiles, i.e. a virtual database.
All data is shared unless modifications are made.

Caveats

Space
Since modified blocks are copied and kept separate, the more data blocks
each virtual database changes, and the longer the virtual database lives
before being refreshed from a newer snapshot of production, the more space
each virtual database will take. It's unlikely that the vdbs will ever reach
the size of the base database, but they could get close with enough widely
distributed changes. With Delphix, both the original copy of the database
and all changed blocks are compressed, so the virtual database is always
much smaller than the original.
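
As a rough back-of-the-envelope (every figure below is made up purely to
illustrate the arithmetic, not a Delphix measurement):

    base_size_gb = 1024    # 1 TB production database
    compression  = 0.5     # 2:1 compression
    changed_frac = 0.05    # 5% of blocks modified over the vdb's life
    redo_gb      = 10      # private redo logs per vdb

    shared_base_gb = base_size_gb * compression  # paid once, shared by all
    per_vdb_gb = base_size_gb * changed_frac * compression + redo_gb

    print(f"shared base copy: {shared_base_gb:.0f} GB")  # 512 GB
    print(f"each vdb adds:    {per_vdb_gb:.1f} GB")      # 35.6 GB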

Performance
In the Delphix situation, performance is generally on par with production
given the same hardware. The biggest issue I see is people skimping on the
network connection: since the datafiles are accessed over NFS/TCP, any
skimping on the network will affect the "physical" read performance. Often
what Oracle thinks is a physical read is really a read from the read cache
on Delphix. The read performance on Delphix often shows "super scalability",
since multiple virtual databases are using the same read cache on the
Delphix box, and since most blocks are shared, the Delphix cache gets loaded
with blocks that end up benefiting the other databases sharing that same
underlying database.
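
That cache-sharing effect can be illustrated with a toy simulation
(uniform random reads and a never-evicting cache - nothing here models
Delphix internals):

    import random
    random.seed(1)

    BLOCKS, CACHE_SLOTS, READS = 10_000, 2_000, 50_000

    def hit_rate(num_clones, blocks_shared):
        cache = set()  # one read cache on the (simulated) appliance
        hits = total = 0
        for _ in range(READS):
            for clone in range(num_clones):
                blk = random.randrange(BLOCKS)
                # shared datafiles: one cache entry serves every clone;
                # unshared: each clone's copy is a distinct entry
                key = blk if blocks_shared else (clone, blk)
                total += 1
                if key in cache:
                    hits += 1
                elif len(cache) < CACHE_SLOTS:
                    cache.add(key)  # a miss warms the cache for everyone
        return hits / total

    print("unshared blocks:", round(hit_rate(4, False), 2))  # lower
    print("shared blocks:  ", round(hit_rate(4, True), 2))   # higher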

Management
As with anything, the more manual steps, the more time consuming, brittle,
and prone to errors the process becomes. Eliminating errors usually takes
some time and iterations. It's completely doable by hand, but it's nice to
have it all packaged in an easy UI and have someone else streamline,
automate, and stabilize the process.



- Kyle
http://db-optimizer.blogspot.com/
