rsync is a command-line utility for UNIX and OS X that copies files, and it does so well. In fact, I have it set as an alias for the regular cp command in my shell configuration:
alias cp='rsync -ae ssh'
This means that whenever I type cp file other_file, it gets translated to rsync -ae ssh file other_file.
In its most basic form, rsync behaves just like cp. However, it comes with a huge list of command-line options (read man rsync if you get the chance) that unlock neat features. Here are my favorites:
rsync --progress -h
With these flags, rsync shows a progress indicator during the transfer (-h makes the numbers human-readable), so you get an estimate of when the transfer of a big file will finish.
rsync -e ssh
rsync will use SSH to copy files across machines! For example:
rsync -e ssh files_to_copy email@example.com:where/you/want/it/
or if you have configured your SSH properly (see a previous post) it simply becomes:
rsync -e ssh files_to_copy remote:where/you/want/it/
Of course, it can also download files from a remote machine:
rsync -e ssh remote:files_to_copy ./where/you/want/it/
A nice feature of rsync is that it detects whether two files are identical (by default, by comparing their size and modification time), so it will not bother to re-copy them. This means you can usually be lazy and just copy whole directories.
Another neat flag is:
rsync --append huge_file.dat some/other/place/huge_file.dat
The --append flag makes rsync detect whether part of the file has been copied earlier and continue where it left off. This is great when downloading large files over a shaky wifi connection. More than once I’ve started a download during a meeting/lecture/talk, closed my laptop when the talk was over, and later resumed the download in my office.
And finally, this is an important one as well:
rsync -a -e ssh files_to_copy remote:where/you/want/it
The -a flag means that rsync will do its best to set the correct owner and group for the remote files and to preserve their permissions. This is almost always what you want.
By default, rsync detects whether the destination file already exists. When you transfer a file over a network, rsync tries to be clever and transmits only the differences between the source and destination files. This can speed things up tremendously. However, when transferring large (multi-GB) files over a very fast network connection, computing those differences becomes the performance bottleneck and you will notice very low transfer speeds (only several MB/s). In this case, tell rsync not to bother being clever and to transfer the entire file with the -W flag:
rsync -W -e ssh big_file.dat firstname.lastname@example.org:big_file.dat
Set it as an alias
Nobody would expect you to actually remember and type out the above commands. Just set aliases for the scenarios that occur often by appending the following to your shell configuration file:
alias cp='rsync -ae ssh'
alias cpv='rsync -vhae ssh --progress'
alias cpa='rsync -vhae ssh --progress --append'
With this in place, you can just treat rsync as an improved version of the cp command:
$ cpa remote:/data/big_recording.fif .
receiving incremental file list
big_recording.fif
1,138,341,699 100% 81.11MB/s 0:00:13 (xfr#1, to-chk=0/1)
sent 30 bytes received 1,138,480,768 bytes 84,331,910.96 bytes/sec
total size is 1,138,341,699 speedup is 1.00