Backup and Recovery, O'Reilly, 2006
I back up; therefore, I will be.
=========================================================================
|Types of coverage   |Computer backups                                  |
-------------------------------------------------------------------------
|Minimum coverage    |Regular nightly backups (keeps you from losing your job when a disk drive dies) |
-------------------------------------------------------------------------
|Unexpected disasters|Journaling filesystems; uninterruptible power supplies (UPSs) |
-------------------------------------------------------------------------
|Get me driving now  |RAID; mirroring; using hot-swap drives; high-availability (HA) systems |
-------------------------------------------------------------------------
|Major disasters     |Sending copies of your backup volumes to off-site storage, in case both your computer and media library are destroyed; sending your backups via a dedicated network to a large storage system at your off-site storage vendor |
-------------------------------------------------------------------------
|Maximum protection  |Real-time mirroring to a hot-swappable system at another of your sites; sending your backups via either network or courier to a hot-site vendor |
=========================================================================
The biggest part of the problem is misinformation. Most people simply don't know what is available. The six important questions that you have to continually ask yourself and others are why, what, when, where, who and how:
Why are you protecting yourself against disaster? Does it really matter if you lose data? What will the losses be? What is the value of your data?
What are you going to back up? The entire box, or just selected drives or filesystems? What operating systems are you going to back up?
When is the best time to back up your system? How often should you do a full backup? When should you do an incremental backup?
Where will the backup occur? Where is the best place to store the backup volumes?
Who is going to provide the hardware, software, and installation services to put this system together?
How are you going to accomplish it? There are a number of different ways to protect yourself against loss. Investigate the different methods, such as off-site storage, replication, mirroring, RAID, and the various levels of protection each provides.
Make sure you know how many devices you have and how they are distributed. Unix and Mac OS systems record this information in the messages file, and Windows stores it in the registry, so hopefully you're backing those up.
Typically, this partition information is not saved anywhere on the system, so you must do something special to record it. On a Solaris system, for example, you can run prtvtoc on each drive and save the output to a file. Search the Internet for scripts for capturing this information; a number of such free utilities exist.
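Here is a minimal sketch of such a script for Solaris; the device pattern and the destination directory (/var/adm/disk-layouts) are just placeholder assumptions you would adapt to your own systems.

    #!/bin/sh
    # Sketch: record the partition table (VTOC) of every disk so it can be
    # recreated after a failure. Assumes Solaris raw devices under /dev/rdsk;
    # /var/adm/disk-layouts is a hypothetical destination directory.
    DEST=/var/adm/disk-layouts
    mkdir -p "$DEST"
    for disk in /dev/rdsk/c*t*d*s2; do
        name=`basename "$disk"`
        prtvtoc "$disk" > "$DEST/$name.vtoc" 2>/dev/null
    done

Run something like this from cron, and make sure the output directory is swept up by your regular backups.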
I used Logical Volume Manager for months before hearing about the lvmcfgbackup command (it backs up the LVM's configuration information). Sometimes if you have this properly documented, you may not need to restore at all. For example, if the operating system disk crashes, simply put the disks back the way they were and then rebuild the stripe in the same order, and the data should be intact. I've done this several times.
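For the record, on HP-UX the command is run once per volume group; the volume group name below is only an example, and you should check your manpage for the exact location of the saved configuration file.

    # Sketch (HP-UX): save the LVM configuration for volume group vg00.
    # The backup file typically lands under /etc/lvmconf; make sure that
    # directory is included in your regular backups.
    lvmcfgbackup /dev/vg00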
Good documentation is definitely part of the backup plan. It should be regularly updated and available. No one should be standing around saying “I haven't set up NIS/AD/NFS from scratch in years. How do you do that again? Has anyone seen my copy of O'Reilly's book?” Actually, the best way to do this is to automate the creation of new servers. If your operating system supports it, take the time to write scripts that automatically install various services, and configure them for your environment. Put these together in a toolkit that is run every time you create a new server. Better yet, see if your OS vendor has any products that automate new server installations, such as Sun's Jumpstart, HP's Ignite-UX, Linux Kickstart, and Mac OS cloning features.
You need to be very familiar with every box, what it does, and what's on it. This information is vital so that you can include any special backups for that type of system.
The first argument that is typically stated as a plus to the selected-filesystem method is that you back up less data. People of this school recommend having two groups of backups: operating system data and regular data. The idea is that the operating system backups would be performed less often. Some would even recommend that they be performed only when you have a significant change, such as Windows security patches, an operating system upgrade, a patch installation, or a kernel rebuild. You would then back up your “regular” data daily.
The first problem with this argument is that it is outdated; just look at the size of the typical modern system. The operating system/data ratio is now significantly heavier on the data side. You won't be saving much space or network traffic by not backing up the OS even on your full backups. When you consider incremental backups, the ratio gets even smaller. Operating system partitions have almost nothing of size that would be included in an incremental backup, unless it's something important that should be backed up! This includes Unix, Linux, and Mac OS files such as /etc/passwd, /etc/hosts, syslog, /var/adm/messages, and any other files that would be helpful if you lost the operating system. It also includes the Windows registry. Filesystem swap is arguably the only completely worthless information that could be included on the OS disk, and it can be excluded with proper use of an exclude list.
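As one hedged illustration of that approach, GNU tar can read an exclude list from a file; the exclude file path, the patterns it would contain, and the tape device below are placeholders you would adapt.

    # Sketch: back up everything, minus a short exclude list of
    # known-worthless paths. /etc/backup-excludes and /dev/rmt/0 are
    # placeholder names; the exclude file would contain patterns such
    # as ./proc, ./tmp, and any swap files on your systems.
    cd / && gtar --create --file /dev/rmt/0 --exclude-from=/etc/backup-excludes .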
Proponents of piecemeal backup would say that you can include important files such as the preceding ones in a special backup. The problem with that is it is so much more difficult than backing up everything. Assuming you exclude configuration files from most backups, you have to remember to do manual backups every time you change a configuration file or database. That means you have to do something special when you make a change. Special is bad. If you just back up everything, you can administer systems as you need to, without having to remember to back up before you change something.
One of the very few things that could be considered a plus is that if you split up your drives or filesystems into multiple backups, it is easier to split them between multiple volumes. If a backup of your system does not fit on one volume, it is easier to automate it by splitting it into two different include lists. However, in order to take advantage of this, you have to use include lists rather than exclude lists, and then you are subject to the limitations discussed earlier. You should investigate whether your backup utility has a better way to solve this problem.
This one is hard to argue against. However, if you do take the time to do it right the first time, you never need to mess with include lists again. This reminds me of another favorite phrase of mine: “Never time to do it right, always time to do it over.” Take the time to do it right the first time.
In this scenario, the biggest benefits are that you save some time spent scripting up front, as well as a few bytes of network traffic. The worst possible side effect is that you overlook the drive or filesystem with your boss's budget that just got deleted.
Once you go through the trouble of creating a script or program that works, you just need to monitor its logs. You can rest easy at night knowing that all your data is being backed up.
You may increase your network traffic by a few percentage points, and the people looking after the wires might not like that. (That is, of course, until you restore the server where they keep their DNS source database.)
Backing up selected drives or filesystems is one of the most common mistakes that I find when evaluating a backup configuration. It is a very easy trap to fall into because of the time it saves you up front. Until you've been bitten though, you may not know how much danger you are in. If your backup setup uses include lists, I hope that this discussion convinces you to rethink that decision.
Backup products typically use some or all of the following levels and types of backup (a brief example using dump follows the definitions):
Level 0: A full backup.
Level 1: An incremental backup that backs up everything that has changed since the last level 0 backup. Repeated level 1 backups still back up everything since the last full/level 0 backup.
Levels 2-9: Each level backs up whatever has changed since the last backup of the next-lowest level. That is, a level 2 backs up everything that changed since a level 1, or since a level 0 if there is no level 1. With some products, repeated level 9 backups back up only things that have changed since the last level 9 backup, but this is far from universal.
Incremental: Usually, an incremental backup backs up anything that has changed since the last backup of any type.
Differential: Most people refer to a differential as a backup that backs up everything that has changed since the last full backup, but this is not universal. In Windows, a differential is a backup that does not clear the archive bit. Therefore, if you run a full backup followed by several differential backups, they act like differential backups in the traditional sense (each one backs up everything that has changed since the full backup). However, if you run even one incremental backup in Windows, it clears the archive bit, and the next differential backup backs up only those files that have changed since the last incremental backup. That's why a differential backup is not synonymous with a level 1 backup.
Cumulative incremental: I prefer this term to differential; it refers to a backup that backs up all files that have changed since the last full backup.
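To make the level mechanics concrete, here is a hedged sketch using Solaris's ufsdump (the traditional dump utility); the filesystem and tape device are examples only.

    # Sketch: a level 0 (full) followed some nights later by a level 1.
    # The u flag records the dump date in /etc/dumpdates, which is how a
    # later, higher-level dump knows what "since the last lower level" means.
    ufsdump 0uf /dev/rmt/0 /export/home    # level 0: everything
    ufsdump 1uf /dev/rmt/0 /export/home    # level 1: everything changed since the level 0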
The simplest approach is to perform a level 0 backup every day onto a separate volume, as shown in the following table. (Please don't overwrite yesterday's good level 0 backup with today's possibly corrupt level 0 backup!) If your system is really small, this schedule might work for you. If you have systems of any reasonable size, though, this schedule is not very scalable. It's also really not that necessary with today's commercial backup software systems.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|Sunday |Monday |Tuesday |Wednesday |Thursday |Friday |Saturday |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|Full/0 |Full/0 |Full/0  |Full/0    |Full/0   |Full/0 |Full/0   |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Another common schedule is a weekly full backup on Sunday followed by daily differential/level 1 backups, as shown in the following table. The advantage of this schedule is that throughout most of the week, you would need to restore from only two volumes: the level 0 and the most recent differential/level 1. This is because each differential/level 1 backs up all changes since the full backup on Sunday. Another advantage of this type of setup is that you get multiple copies of files that are changed early in the week. This is probably the best schedule to use if you are using simple utilities such as dump, tar, or cpio because they require you to do all the volume management. A two-volume restore is much easier than a six-volume restore, trust me!
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|Sunday |Monday |Tuesday |Wednesday |Thursday |Friday |Saturday |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|Full/0 |Diff/1 |Diff/1  |Diff/1    |Diff/1   |Diff/1 |Diff/1   |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
If your backup product supports multiple levels, you can use the schedule shown in the following table. The advantage of this schedule is that it takes less time and uses less media than the preceding schedule. There are two disadvantages to this plan. First, each changed file gets backed up only once, which leaves you very susceptible to data loss if you have any media failures. Second, you would need six volumes to do a full restore on Friday. If you're using a good open-source or commercial backup utility, though, the latter is really not a problem, because these utilities do all the volume management for you, including swapping tapes with an auto-changer.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|Sunday |Monday |Tuesday |Wednesday |Thursday |Friday |Saturday |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|Full/0 |1      |2       |3         |4        |5      |6        |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
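If you are driving a utility like dump yourself, one hedged way to implement this kind of schedule is from cron; the times, device, and filesystem below are all placeholders.

    # Sketch: root crontab entries for a weekly level 0 on Sunday and
    # rising levels Monday through Saturday (minute hour day month weekday).
    0 2 * * 0 ufsdump 0uf /dev/rmt/0 /export/home
    0 2 * * 1 ufsdump 1uf /dev/rmt/0 /export/home
    0 2 * * 2 ufsdump 2uf /dev/rmt/0 /export/home
    0 2 * * 3 ufsdump 3uf /dev/rmt/0 /export/home
    0 2 * * 4 ufsdump 4uf /dev/rmt/0 /export/home
    0 2 * * 5 ufsdump 5uf /dev/rmt/0 /export/home
    0 2 * * 6 ufsdump 6uf /dev/rmt/0 /export/home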
One of the most interesting ideas that I've seen is called the Tower of Hanoi (TOH) backup plan. It's based on an ancient mathematical progression puzzle by the same name. The game consists of three pegs and a number of different-sized rings inserted onto those pegs. A ring may not be placed on top of a ring with a smaller radius. The goal of the game is to move all of the rings from the first peg to the third peg, using the second peg for temporary storage when needed.
A goal of most backup schedules is to put changed files on more than one volume while reducing total volume usage. The TOH accomplishes this better than any other schedule. If you use a TOH progression for your backup levels, most changed files are backed up twice but only twice.
0 3 2 4 3 5 4 6 5 7 6 8 7 9 8
This mathematical progression is actually pretty easy. It consists of two interleaved series of numbers (e.g., 2 3 4 5 6 7 8 9 interleaved with 3 4 5 6 7 8 9).
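A tiny shell sketch makes the interleaving explicit; it simply prints the progression shown above.

    # Sketch: print the TOH progression by pairing each level i (3 through 9)
    # with i-1, starting from a level 0.
    levels="0"
    for i in 3 4 5 6 7 8 9; do
        levels="$levels $i `expr $i - 1`"
    done
    echo $levels    # prints: 0 3 2 4 3 5 4 6 5 7 6 8 7 9 8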
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|Sunday |Monday |Tuesday |Wednesday |Thursday |Friday |Saturday |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|0      |3      |2       |5         |4        |7      |6        |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
It starts with a level 0 (full) on Sunday. Suppose that a file is changed on Monday. The level 3 on Monday would back up everything since the level 0, so that changed file would be included on Monday's backup. Suppose that on Tuesday we change another file. Then on Tuesday night, the level 2 backup must look for a level that is lower, right? The level 3 on Monday is not lower, so it references the level 0 also. So the file that was changed on Monday, as well as the file that was changed on Tuesday, is backed up again. On Wednesday, the level 5 backs up only what changed that day, because it references the level 2 on Tuesday. But on Thursday, the level 4 does not reference the level 5 on Wednesday; it references the level 2 on Tuesday.
Note that the file that changed on Tuesday was backed up only once. To get around this problem, we use a modified TOH progression, dropping down to a level 1 backup each week, as shown in the following table.
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|Day of the week |Week one |Week two |Week three |Week four |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|Sunday          |0        |1        |1          |1         |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|Monday          |3        |3        |3          |3         |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|Tuesday         |2        |2        |2          |2         |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|Wednesday       |5        |5        |5          |5         |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|Thursday        |4        |4        |4          |4         |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|Friday          |7        |7        |7          |7         |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
|Saturday        |6        |6        |6          |6         |
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
If it doesn't confuse you and your backup methodology, and if your backup system supports it, I recommend the schedule depicted in the preceding table. Each Sunday, you get a complete incremental backup of everything that has changed since the monthly full backup. During the rest of the week, every changed file is backed up twice except for Wednesday's files. This protects you from media failure better than any of the schedules mentioned previously. You will need more than one volume to do a full restore, of course, but this is not a problem if you have a sophisticated backup utility with volume management.
This is always the case for any recommendation in this book. If it confuses you or your backup methodology, it's not good! If your backups confuse you, you don't even want to try to restore! Always keep it simple, system administrator (K.I.S.S.).
Unix, Linux, and Mac OS systems record three different times for each file. The first is mtime, or modification time. The mtime value is changed whenever the contents of the file have changed, such as when you add lines to a logfile. The second is atime, or access time. The atime value is changed whenever the file is accessed, such as when a script is run or a document is read. The last is ctime, or change time. The ctime value is updated whenever the attributes of the file, such as its permissions or ownership, are changed.
Administrators use ctime to look for hackers because they may change permissions of a file to try to exploit your system. Administrators also monitor atime to look for large files that have not been accessed for a long time. (Such files can be archived and deleted.)
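As a quick command-line illustration (the filename is only an example), standard ls options display each of the three times:

    # Sketch: inspect the three timestamps of a file.
    ls -l  /var/adm/messages    # long listing shows mtime
    ls -lu /var/adm/messages    # -u shows atime instead
    ls -lc /var/adm/messages    # -c shows ctime instead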
You may be wondering what this has to do with backups. You need to understand that any backup utility that backs up using the filesystem modifies atime as it reads the file to back it up. Almost all commercial utilities, as well as tar, cpio, and dd, have this feature. dump reads the filesystem via the raw device, so it does not change atime.
dd has this feature when you're using it to copy an individual file in a filesystem, of course. When using dd to copy a raw device, you will not change the access times of files in the filesystem.
A backup program can look at a file's atime before it backs it up. After it backs up the file, the atime obviously has changed. It can then use the utime system call to reset atime to its original value. However, changing atime is considered an attribute change, which means that it changes ctime. This means that when you use a utility such as cpio or gtar that can reset atime, you change ctime on every file that it backs up. If you have a system that is watching for ctime changes, it will think that it's found a hacker for sure!
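As a hedged example (check the manpages for your versions), both GNU tar and GNU cpio expose options that reset atime after reading each file, with exactly the ctime side effect just described:

    # Sketch: ask the archiver to restore atime after reading each file.
    # Doing so updates ctime as a side effect.
    gtar --create --atime-preserve --file /dev/rmt/0 /home    # GNU tar
    find /home | cpio -o -a > /backup/home.cpio               # cpio's -a resets atime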
Make sure that you understand how your utility handles this issue.
For all versions of Windows since NT, ntbackup is the only native choice for a traditional backup application, although you should also be familiar with System Restore. Mac OS X users running a version greater than 10.4 have a number of Unix-based backup tools available to them, including cpio, tar, rsync, and ditto. For commercial Unix systems, dump and restore are quite popular, but they're not considered a viable option on Linux. dump is available on Mac OS, but it doesn't support HFS+. After dump and restore, the native backup utility with the most features is cpio, but it is less user friendly than its cousin tar. tar is incredibly easy to use and is more portable than either dump or cpio. The GNU versions of tar and cpio have much more functionality than either of the native versions. If you have to back up raw devices or perform remote backups with tar or cpio, dd will be your new best friend. Finally, rsync can be used to copy data between filesystems on Windows, Mac OS, Linux, and Unix.
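To close with something concrete, here are hedged one-line examples of a few of the tools just mentioned; every path, device, and hostname is a placeholder, and you should confirm the exact options on your platform.

    # Sketch: minimal invocations of some of the native tools above.
    tar -cf /dev/rmt/0 /home                         # simple tar backup to tape
    find /home | cpio -o > /backup/home.cpio         # cpio archive of a directory tree
    dd if=/dev/rdsk/c0t0d0s7 of=/dev/rmt/0 bs=64k    # raw-device copy with dd
    rsync -a /home/ backuphost:/backups/home/        # copy a tree to another system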