Re: diff between incremental and archive backups

  • From: Andrew Kerber <andrew.kerber@xxxxxxxxx>
  • To: gogala.mladen@xxxxxxxxx
  • Date: Sat, 14 Nov 2015 07:38:40 -0600

I am not sure where you get the idea that most places don't use incremental
backups. That is the opposite of my experience. Many companies prefer to avoid
the additional licensing costs of replication. And even those that do use
replication really prefer to have multiple recovery options. In addition,
incremental backups are typically fully under the control of the DBA team, while
storage-based methods usually require coordination with the system and storage
teams, which can make recovery much more complicated. In my experience, the
only places that do not use incremental backups are on Standard Edition and
don't have the feature available, though they may refer to their archivelog
backups as incrementals.

Perhaps you are more used to dealing with larger companies, with large staffs
and hardware and software budgets, which can commit the time and personnel to
implement and validate other methods.

Sent from my iPad

On Nov 14, 2015, at 12:21 AM, Mladen Gogala <gogala.mladen@xxxxxxxxx> wrote:

On 11/13/2015 01:59 PM, Zelli, Brian wrote:
So we were having a discussion about RMAN incremental backups, and the question
came up: if I do an RMAN full once a week and then RMAN backups of the archive
logs the rest of the week, that's all I need to do a point-in-time restore. I
don't have to do incremental backups. Is this an accurate assumption?



Brian

Brian, the first thing to ask is what the purpose of the incremental backups
is. Hopefully, you are aware that backup is an expensive operation. To avoid
burdening their users unnecessarily during peak working hours, which usually
fall during the working week, people resorted to incremental backups. The price
to pay for a quick backup during the week was a longer restore and recovery
time. If your database croaks on Thursday, you need to restore the full backup
taken on Saturday and all the incremental backups from Sunday to Wednesday.
After that, you still need to do a recovery.
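The restore chain can be sketched as a toy model (a hypothetical weekly
schedule, purely illustrative; in practice RMAN assembles this chain
automatically from its catalog):

```python
# Toy model of the restore chain described above: with a Saturday full
# backup and daily incrementals, a Thursday failure needs the full plus
# every incremental from Sunday through Wednesday, then recovery from
# the archived redo logs. (Hypothetical schedule, not RMAN itself.)
WEEK = ["Sat", "Sun", "Mon", "Tue", "Wed", "Thu", "Fri"]

def pieces_needed(failure_day):
    """Backups that must be restored before recovery can begin."""
    chain = ["full backup (Sat)"]
    chain += [f"incremental ({d})" for d in WEEK[1:WEEK.index(failure_day)]]
    chain.append("archived redo logs up to the failure point")
    return chain

print(pieces_needed("Thu"))
```

The longer the chain, the longer the restore, which is exactly the trade-off
described above.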
However, today backup is the last line of defence. The foremost mechanism for
ensuring quick recovery is duplication. Typically, a production database will
be protected by some duplicating mechanism, and there are several: storage and
VM snapshots, disk replication like SRDF or HUR (Hitachi Universal Replicator),
SnapVault/SnapMirror, standby databases, or an incrementally updated backup in
the FRA, which is constantly recovered. Most of those duplication methods allow
quick and painless backup. If you back up your standby database, you can
restore it to production; the RMAN catalog will recognize the standby and the
production as the same database, because they have the same DBID. It is also
possible to mount disk snapshots or a disk replica and do either a file system
backup or an RMAN backup. Be aware that there is usually some scripting
involved, or a product which can do that by itself. In the case of a replica,
the concern about slowing down users is eliminated, because users are not using
the standby database. So, if you do a full backup of a replica every day, you
are not obstructing the business process, and you are speeding up the recovery.

The only obstacle left is storage. If you have a 20TB database, which is not
particularly gigantic these days, then keeping 7 full backups on disk would
require 140TB of disk storage, which still costs a pretty penny. That is where
modern deduplication algorithms come into play. They are typically much more
efficient than compression: deduplication achieves around 90% savings, as
opposed to compression, which achieves around 70%. That efficiency comes at a
cost, usually requiring SSD and a 10Gb network, but you will need at least a
10Gb network anyway if you want to back up a 20TB database in less than a
week. So, with 90% deduplication efficiency, your full backup shrinks down to
2TB, which is about the same space as is consumed by an incremental backup.
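Sanity-checking the arithmetic above (sizes and savings ratios are the ones
quoted in the text):

```python
db_size_tb = 20        # database size used in the example above
retention = 7          # daily full backups kept on disk

raw_tb = db_size_tb * retention          # no reduction: 140 TB
compressed_tb = raw_tb * (1 - 0.70)      # ~70% compression savings -> ~42 TB
dedup_full_tb = db_size_tb * (1 - 0.90)  # ~90% dedup savings per full -> ~2 TB

print(raw_tb, round(compressed_tb), round(dedup_full_tb))  # 140 42 2
```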
So, you have now solved the question of strategy: some kind of duplication will
be needed, plus a proxy backup of the replica, with deduplication added into
the mix. The only remaining question is the one of SLA: how much time will your
CIO give you to restore a 20TB database? This database is likely powering a web
application which should be available 24x7x365. Twelve hours offline is usually
unacceptable; if PayPal or eBay ever went offline for a full 12 hours, they
would be destroyed by their competitors. A switchover to standby or a revert
from snapshot will speed things up immensely, but you will still have to
restore 20TB in 4 or 5 hours, which is quite a feat. For that, you will need
storage attached with an FC adapter and a good SAN with a lot of cache. Cheap
NAS storage cannot do the trick.
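The SLA figure implies a sustained restore throughput that is easy to put a
number on (illustrative arithmetic in decimal units; the 4.5-hour window
splits the 4-or-5-hour range mentioned above):

```python
db_bytes = 20 * 10**12   # 20 TB, decimal units
window_s = 4.5 * 3600    # middle of the 4-5 hour restore window

gbit_per_s = db_bytes * 8 / window_s / 10**9
print(round(gbit_per_s, 1))  # ~9.9 Gbit/s sustained
```

That is roughly the line rate of a single 10Gb link, sustained for the whole
window, which is why FC-attached storage and a well-cached SAN come into the
picture.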
Long story short, you will need to spend a lot of money on hardware, SAN
licenses, Oracle licenses, infrastructure and the like. Forget the incremental
backups; almost nobody does that any more. If somebody is doing incremental
backups, that is because of ignorance, not because of some "secret sauce"
which will speed up backups. A good backup strategy is neither simple nor
cheap, and it is highly dependent on your business needs.

--
Mladen Gogala
Oracle DBA
http://mgogala.freehostia.com
