Re: ASM rebalance estimates

  • From: Karl Arao <karlarao@xxxxxxxxx>
  • To: Dave.Herring@xxxxxxxxxx
  • Date: Thu, 9 Sep 2010 23:58:18 +0800

Hi Dave,

I had a recent scenario where a client migrated from their old EMC CX500 to a
CX4, and the LUNs of their 3-node RAC had to move to the new array as well.

The initial plan was to do it entirely with an ASM disk rebalance (add & drop
disks in a single command). We went ahead with that plan, and once the command
was executed we all noticed that the estimated time for a 150GB database was
something like 2 days!

So we aborted the rebalance and went with plan B, which was short & sweet: we
made use of a SAN copy of the LUNs (the header and metadata stay the same), and
the copies were recognized instantly. RAC started up without problems.

There is a way to do an incremental SANCopy while the RAC environment is still
running. The only time you need a full shutdown is for the final sync, so that
the dirty blocks get synced to the new devices. The whole activity is just
like restarting the whole RAC environment and pointing the servers to the new
LUNs. The bulk of the work falls on the storage engineer. In our case the OCR
and Voting Disk are on OCFS2, so we just needed to edit the fstab with the new
EMC pseudo device names. For ASM we are using ASMLib, and the new devices,
although they have different names, still carry the same header and metadata,
which are the only two things ASM cares about. So when you boot the machines
up again it's as if nothing happened.
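
As a rough illustration of the fstab edit described above, here is a minimal
Python sketch that swaps old pseudo device names for new ones in the device
column only. The device names, mount points and the `remap_fstab_devices`
helper are all hypothetical, made up for the example; the real mapping would
come from the storage engineer.

```python
from typing import Dict


def remap_fstab_devices(fstab_text: str, device_map: Dict[str, str]) -> str:
    """Return fstab_text with the device field (first column) remapped."""
    out = []
    for line in fstab_text.splitlines():
        stripped = line.strip()
        # Leave blank lines and comments untouched.
        if not stripped or stripped.startswith("#"):
            out.append(line)
            continue
        fields = line.split()
        if fields[0] in device_map:
            # Replace only the first occurrence, which is the device column.
            line = line.replace(fields[0], device_map[fields[0]], 1)
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical entries: OCR and voting disk on OCFS2.
sample_fstab = """\
# OCR and voting disk (example only)
/dev/emcpowera1  /u02/ocr   ocfs2  _netdev,datavolume,nointr  0 0
/dev/emcpowerb1  /u02/vote  ocfs2  _netdev,datavolume,nointr  0 0
"""

# Hypothetical old -> new pseudo device mapping.
old_to_new = {
    "/dev/emcpowera1": "/dev/emcpowerf1",
    "/dev/emcpowerb1": "/dev/emcpowerg1",
}

new_fstab = remap_fstab_devices(sample_fstab, old_to_new)
print(new_fstab)
```

Working on a copy of the file and diffing it against the original before
swapping it in is the safer way to use something like this.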

Note that you should not present the OLD and NEW LUNs together, because ASM
will tell you that it is seeing two instances of the same disk.. :)
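
A simple pre-flight check for this can be sketched in Python. It assumes you
have already collected, by whatever means you normally use, a mapping of
device path to the ASM disk label found in its header. The
`find_duplicate_disks` helper and the sample device names and labels are
hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def find_duplicate_disks(labels_by_device: Dict[str, str]) -> Dict[str, List[str]]:
    """Return the labels that show up on more than one device path."""
    devices_by_label = defaultdict(list)
    for device, label in labels_by_device.items():
        devices_by_label[label].append(device)
    # Keep only labels seen on two or more devices.
    return {
        label: sorted(devices)
        for label, devices in devices_by_label.items()
        if len(devices) > 1
    }


# Hypothetical situation: old and new copies of DATA01 are both presented.
visible = {
    "/dev/emcpowerc1": "DATA01",  # old LUN
    "/dev/emcpowerh1": "DATA01",  # SAN copy of the same LUN
    "/dev/emcpoweri1": "DATA02",
}

duplicates = find_duplicate_disks(visible)
for label, devices in duplicates.items():
    print(f"{label} is visible on: {', '.join(devices)}")
```

An empty result means each label is visible on exactly one device, which is
the state you want before starting ASM.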

This is ideal for a large array migration when the business allows a minimal
downtime window (at a minimum, one restart of the servers). The alternative is
a full array migration using ASM rebalance, which takes longer and leaves you
worrying about whether it has finished or not. If you take that route, the
bottleneck would be your CPUs spending their time on IO (rebalance at full
throttle, power 11). With SANCopy everything passes through the Fiber
(1TB = 1 hour as per the engineer), which is much faster.. :)
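
To put the two figures side by side, here is a back-of-the-envelope
calculation in Python. It uses only the numbers quoted in this post (150GB in
about 2 days for the rebalance estimate, 1TB in about 1 hour for SANCopy) and
assumes binary units. The function names are made up for the example.

```python
# Assumption: binary units (1 GB = 1024 MB, 1 TB = 1024 GB).
MB_PER_GB = 1024
GB_PER_TB = 1024


def throughput_mb_per_s(size_gb: float, hours: float) -> float:
    """Average MB/s needed to move size_gb within the given number of hours."""
    return size_gb * MB_PER_GB / (hours * 3600)


def hours_at_rate(size_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to move size_gb at a steady rate_mb_per_s."""
    return size_gb * MB_PER_GB / rate_mb_per_s / 3600


rebalance_rate = throughput_mb_per_s(150, 48)         # 150GB in ~2 days
sancopy_rate = throughput_mb_per_s(1 * GB_PER_TB, 1)  # 1TB in ~1 hour

print(f"rebalance estimate: {rebalance_rate:.2f} MB/s")
print(f"SANCopy:            {sancopy_rate:.2f} MB/s")
print(f"150GB at SANCopy rate: {hours_at_rate(150, sancopy_rate) * 60:.1f} minutes")
```

That works out to roughly 0.9 MB/s implied by the rebalance estimate against
about 291 MB/s for SANCopy. Keep in mind that SANCopy copies whole LUNs, not
just the 150GB of database data, so the real copy time depends on the LUN
sizes.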

BTW, after the activity the RMAN "backup validate check logical" came back okay.




-- 
Karl Arao
karlarao.wordpress.com
karlarao.tiddlyspot.com
