I get this question frequently. It's usually triggered either because the tape device can't hold an entire backup set or because the time required for backup interferes with productive work. Most of the time this can be easily remedied by a larger or faster storage device, but someone is bound to bring up the idea of differential backups.
The idea is that you create a full backup that has everything, and from then on you only back up the files that have changed. Presumably that's a smaller set of files, and therefore this solves the space or time problem. Usually the full backup is refreshed on some schedule and the process starts again. There are variants on the theme: for example, the differential may include all files that have changed since the last full backup rather than just those that have changed since the last differential. That sort of scheme eventually ends up with the differential containing every file that ever changes, no matter how infrequently; the full backup is the source of everything else.
Often the term "Incremental" is used for the scheme that captures only what has changed since the previous partial backup, which is what I call a true differential. I'll use "Incremental" for that scheme for the rest of this article. Remember that a Differential will always have everything that has changed since the last complete backup; an Incremental will only have files that have changed since the previous backup, whether that was the full backup or an earlier Incremental. The first Incremental and the first Differential after a full backup would be exactly the same; after that they will probably contain different files. An Incremental CAN be smaller than a Differential taken at the same time but could never be larger.
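To make the distinction concrete, here is a minimal sketch that selects files by modification time. The `FileInfo` type, the file names, and the timestamps are all hypothetical, and real backup software may detect changes differently (archive bits, checksums, and so on); this only illustrates the two definitions above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    path: str
    mtime: float  # last modification time (arbitrary units for this sketch)


def changed_since(files, cutoff):
    """Return the sorted paths of files modified strictly after `cutoff`."""
    return sorted(f.path for f in files if f.mtime > cutoff)


def differential(files, last_full):
    """Everything changed since the last FULL backup."""
    return changed_since(files, last_full)


def incremental(files, last_backup):
    """Only what changed since the previous backup of any kind."""
    return changed_since(files, last_backup)


# Hypothetical timeline: full backup at t=100, first Incremental at t=200.
files = [
    FileInfo("ledger.db", 150.0),  # changed before the first Incremental
    FileInfo("notes.txt", 250.0),  # changed after the first Incremental
    FileInfo("logo.png", 50.0),    # unchanged since the full backup
]

diff_set = differential(files, last_full=100.0)
incr_set = incremental(files, last_backup=200.0)
print("Differential:", diff_set)
print("Incremental: ", incr_set)
```

In this made-up timeline the second Incremental carries only `notes.txt`, while the Differential taken at the same moment still carries `ledger.db` as well.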
Differential or Incremental backups always seem like a great idea to people who haven't experienced the negative aspects. Admittedly, there can be circumstances where you have no other choice, but consider these points:
Wherever possible, doing a complete, full backup every day is easiest and gives the most data redundancy. If you absolutely cannot do that, then the Differential (everything modified since the last full backup) is better than true Incrementals. However, don't neglect having multiple full backups in either case.
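One way to see the difference is to count how many backup sets a restore has to read. The sketch below assumes a simple oldest-first history whose first entry is the full backup; the labels are hypothetical. Restoring from a Differential in the sense defined earlier (everything changed since the last full backup) needs the full backup plus the newest partial, while restoring from Incrementals needs the full backup plus every Incremental since.

```python
def restore_chain(history, scheme):
    """Return the backup sets needed to restore to the latest state.

    `history` is an oldest-first list of labels. The first entry is the
    full backup; the rest are partial backups of the given `scheme`.
    """
    full, partials = history[0], history[1:]
    if scheme == "differential":
        # The newest Differential already holds every change since the full.
        return [full] + partials[-1:]
    if scheme == "incremental":
        # Each Incremental holds only its own slice of changes.
        return [full] + partials
    raise ValueError(f"unknown scheme: {scheme!r}")


# Hypothetical week: full backup on Monday, partials on the following days.
history = ["full-mon", "tue", "wed", "thu"]
diff_chain = restore_chain(history, "differential")
incr_chain = restore_chain(history, "incremental")
print(diff_chain)
print(incr_chain)
```

Every extra set in the chain is one more piece of media that has to be readable on the day you need it.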
By the way, my aversion to differential or incremental backups is based on many years of painful field experience. Although it is rare nowadays, not too many years ago I would be involved with drive failures about once a month: I have seen these problems for myself. I STRONGLY RECOMMEND FULL BACKUPS IF AT ALL POSSIBLE. Backup media gets larger and faster and cheaper every year, so most people CAN do complete backups, and should.
While attractive in principle, the time element isn't all that good and you also lose several important capabilities:
Consider this also: you have set up rsync or whatever to keep two machines up to date. Now you have a memory or motherboard problem on the main machine that scrambles database data. It's not bad enough to crash instantly, but it is bad enough to damage the database extensively. That bogus data will of course get transferred to the other machine: effectively a hardware problem on one box causes the identical problems on the other.
Sometimes the easiest way to fix such a problem is to go back in time to a point where the data was not corrupted. This may be because it's too corrupt to fix with ordinary tools, but more often it's just because it is too difficult to figure out where all the problems are: the only sure solution is to revert to some previous state. The ONLY way to do that is to have multiple sets of removable backup that extend back in time.
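As a rough illustration of why the history matters, the sketch below picks the newest backup set taken before the corruption began. The function name and the dates are hypothetical, and in practice you may not know the exact moment the corruption started.

```python
from datetime import date


def latest_clean_backup(backup_dates, corrupted_on):
    """Return the newest backup date strictly before `corrupted_on`, or None."""
    candidates = [d for d in backup_dates if d < corrupted_on]
    return max(candidates, default=None)


# Hypothetical rotation of four nightly backup sets.
backups = [date(2012, 7, 9), date(2012, 7, 10), date(2012, 7, 11), date(2012, 7, 12)]
corrupted_on = date(2012, 7, 11)

# With several sets that extend back in time, a clean restore point exists.
print(latest_clean_backup(backups, corrupted_on))

# With only the most recent copy (as with a mirrored machine), none does.
print(latest_clean_backup(backups[-1:], corrupted_on))
```

A mirror that is kept constantly up to date behaves like the second case: it only ever holds the newest state, corrupted or not.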
Remember, I'm not saying that having the backup machine is a bad idea. It's not, and it can be very convenient. But you need removable media SOMEWHERE.
There are now reasonably priced removable hard drives. They are still a little expensive, but you CAN do this.
Removable media is still the intelligent choice for backup and will remain so until solid state, non-volatile disk drives are common, and I'm not even sure if it's a bad idea then.
The problem with Internet backup is two-fold: first, you probably can't back up ALL your data because the connection isn't fast enough; second, you are depending on the Internet being available for restore. I do think Internet backup is a great adjunct to in-house removable media, but that's all it is.
Maybe you have multiple redundant T3 connections and can do this, but even then, I think you should have in-house removable media for utmost safety.
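A quick back-of-the-envelope calculation shows why connection speed is the limiting factor. The data size, the slow uplink speed, and the 80% efficiency figure below are assumptions chosen purely for illustration; the only fixed number is the nominal T3 line rate of 44.736 Mbit/s.

```python
def transfer_hours(data_gb, link_mbit_per_s, efficiency=0.8):
    """Estimate hours needed to move `data_gb` (decimal GB) over a link.

    `efficiency` is an assumed fraction of the nominal line rate that is
    actually usable after protocol overhead and contention.
    """
    bits = data_gb * 8 * 1000**3
    usable_bits_per_s = link_mbit_per_s * 1_000_000 * efficiency
    return bits / usable_bits_per_s / 3600


# Hypothetical 500 GB full backup.
t3_hours = transfer_hours(500, 44.736)  # a single T3 at nominal rate
slow_hours = transfer_hours(500, 1.0)   # an assumed 1 Mbit/s uplink
print(f"T3:          {t3_hours:.1f} hours")
print(f"1 Mbit/s up: {slow_hours:.1f} hours")
```

Under these assumptions even a single T3 needs more than a day for one full copy, and the same arithmetic applies to the restore.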
Your data is critical. Don't put it at risk.
More Articles by Tony Lawrence © 2012-07-14 Tony Lawrence