Sometimes people from the Windows world think of rsync as just a tool for synchronizing laptop files, and although it can be used for that (see "Using rsync to update laptop" by Dirk Hart), it's also a general-purpose copying tool that is worth learning about.
Important: rsync is not quite analogous to Windows Briefcase. You can get similar functionality by some careful double invocation, but there are conditions that really can't be handled well. Of course that's true for Briefcase also, and for the same reasons.
You can learn quite a bit about rsync and how it works right on your own machine: no network necessary. That's actually a good way to learn: it's quick, and you can easily see the results.
Like so many other Unix tools, rsync is often used at a very basic level without taking advantage of its more powerful features. Of course, because it is powerful and complex, people sometimes make the opposite mistake: thinking rsync is going to do something that it doesn't do. Trust me: I've made both those errors.
But simple is of course the place to start. So let's create a couple of directories to work with. I'll put mine in /tmp, but you can do whatever you find convenient.
cd /tmp
mkdir a b
date > a/froma
date > b/inb
rsync a/* b
The "b" directory now has been updated with files from a. That's as simple as you can get, but it's really no different than a copy in this case. Notice that a has NOT been updated with files from b. That's just like copy, in spite of the "sync" in the name.
But if we were doing this across a network, with either a or b on a different machine, even this simple invocation has advantages over rcp or scp. On a local (same machine) copy rsync simply copies whole files by default, but over a network it transfers only the parts of a file that have actually changed. This is powerful stuff for large files and slower connections: when a giant log file has only had a few lines appended, little more than those new lines crosses the wire. The algorithm that accomplishes this uses a "rolling checksum" and is well described at https://olstrans.sourceforge.net/release/OLS2000-rsync/OLS2000-rsync.html
If you were transferring to a remote machine, your syntax would be:
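rsync -e ssh a/* user@otherbox:/tmp/b
or
rsync -e ssh 'user@otherbox:/tmp/a/*' b
That uses ssh for the actual copying (the quotes in the second form keep the local shell from trying to expand the remote wildcard). Current versions of rsync use ssh by default, so the -e ssh is usually optional; very old versions defaulted to rsh, which is what you got if you left off the -e.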
Let's make some more files in a:
ln a/froma a/lnfroma
ln -s a/froma a/symlnfroma
rsync a/* b
We get an error message saying that the "symlnfroma" was skipped, but it looks like the other one copied. If you look more closely though, you'll see that it may not have done what you wanted: the files "froma" and "lnfroma" over in b are not hard links any more. Let's try again:
rsync -lH a/* b
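That copied the symlink (-l) and fixed the hard link (-H). Notice that the symlink in b does NOT refer to the file in b: rsync copies the link text verbatim, so it still reads "a/froma". If you were copying to a directory of the same name in some other hierarchy or on another machine, that would be exactly what you would want. (Strictly speaking, a relative target is resolved from the directory that holds the link, so this particular link dangles in both a and b; "ln -s froma a/symlnfroma" would have made a working one.)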
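Most of the time, rsync is used with -a (archive), which combines a number of other options (it is equivalent to -rlptgoD):
-r: recurse into directories
-l: copy symlinks as symlinks
-p: preserve permissions
-t: preserve modification times
-g: preserve group
-o: preserve owner
-D: preserve device and special files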
Note that -H is not included, and that's because it can be time consuming to figure out hard links. Expect rsync to run longer if you need to use -H. Also realize that ownership and group changes, as well as device file copying, need root permissions.
Another often used flag not included in -a is -u, which says not to overwrite files that are newer at the destination. If we want a Briefcase-style synchronization, we need -u, and we need to run rsync in both directions:
rsync -aHu a/* b
rsync -aHu b/* a
and that is no good if files were updated in both places (that's not an easy problem for any such program though).
Less often used is the -b flag, which creates backups of the files it replaces:
rsync -aHub a/* b
If you have followed along with the examples as given, that should have made no apparent change in a or b. But try this:
date > a/froma
rsync -aHub a/* b
Now b will have its older "froma" backed up to froma~. You can change the suffix (--suffix), and you can have the backup files put in a different directory (--backup-dir) if you like.
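Sometimes you want to delete files in the destination that no longer exist in the source. For --delete to work you have to give rsync the directory itself (a/) rather than a shell wildcard (a/*); with a wildcard, rsync only sees a list of individual files and has no directory to compare against:
rm a/inb
rsync -aHubv --delete a/ b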
That would probably be desirable for our Briefcase synchronization also.
A very useful flag when testing rsync is "-n". This just shows you what would be done (add -v if you want to see actual file names) without actually transferring anything:
rm b/*
rsync -aHubnv --delete a/ b
Nothing will actually be transferred to b, but you will see what it would do. That can also be useful for other things: it can be used to verify the integrity of files, providing a function similar to Tripwire. Let's say you copy files from one machine to another. You could later run rsync with the -n option from the "safe" machine (add -c so that file contents are compared by checksum rather than just by size and modification time). If rsync reports nothing to transfer, the files are still untouched.
This is NOT really equivalent to products like Tripwire but it can be useful in some situations.
If you have large files and slow links, -z will compress the data as it is transferred.
There are even more esoteric flags; check the man page if you have more unusual needs. Whatever you need, it's probably there. Rsync is a powerful tool that is well worth learning.
© 2009-11-06 Tony Lawrence