fluidBlog

August 24, 2007

Website Backup Strategy, Part III

Filed under: Backups, Deployment, Drupal, Fedora, Linux — trekr @ 7:08 am

Web Accessible Files

In Part II, Revision Control, I mentioned my preference for git over other revision control systems. One reason is that git names every object by its SHA-1 hash, so it notices problems like disk corruption instead of silently handing back altered content. Recall that in the section on svn I suggested mirroring your svn repository with svnsync. That only addresses corruption that occurs after a commit to the repository; it doesn’t do anything for you if the corruption occurs during the commit. The other nice aspect of git is that it is distributed, so every repository is in a sense a backup, provided its hashes match. For these reasons, the next part of our backup strategy will back up the entire set of files accessible to the web, including those under revision control.

As I mentioned in Part I, Website Backup Strategies, the snapback technique is ideally suited for website backups because the files are not expected to change very often. The technique is well documented on Mike Rubel’s site. It uses hard links so that files that don’t change between snapshots take up disk space only once, and rsync so that only changed files cross the network. The method depends on having a /usr/bin/cp that can make hard links, like GNU cp -al does.
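To make the hard-link idea concrete, here is a minimal sketch that runs entirely in a temporary directory. The directory names hourly.0 and hourly.1 and the two files are made up for illustration, and the rename step only imitates what rsync does by default when a file has changed; it is not how snapback2 itself is implemented. It assumes GNU cp.

```shell
# Scratch area with one "current" snapshot holding two files (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/hourly.0"
echo "unchanged" > "$work/hourly.0/index.html"
echo "old" > "$work/hourly.0/style.css"

# Rotate: hourly.1 becomes a hard-linked copy of hourly.0, so no file data is duplicated.
cp -al "$work/hourly.0" "$work/hourly.1"

# Imitate rsync updating a changed file: write a new file, then rename it into place.
# The rename breaks the link rather than overwriting the data shared with hourly.1.
echo "new" > "$work/hourly.0/style.css.tmp"
mv "$work/hourly.0/style.css.tmp" "$work/hourly.0/style.css"

# -ef is true when both names refer to the same inode.
if [ "$work/hourly.0/index.html" -ef "$work/hourly.1/index.html" ]; then shared=yes; else shared=no; fi
if [ "$work/hourly.0/style.css" -ef "$work/hourly.1/style.css" ]; then changed_shared=yes; else changed_shared=no; fi
old_copy=$(cat "$work/hourly.1/style.css")

echo "index.html shared: $shared, style.css shared: $changed_shared, old copy: $old_copy"
```

The unchanged file exists once on disk under two names, while the older snapshot still holds the previous version of the file that changed.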

As I mentioned in part I, there is an excellent implementation called snapback2 from Perusion.

In this post, I’m going to walk through an example in which a local machine acts as the backup server and connects via ssh to the remote web server hosting the site we want to back up. If you don’t have ssh set up, see my post Securing a new Fedora 6 Slice. The local server needs to have Perl installed, and snapback2 depends on the Config::ApacheFormat module. The snapback2 README covers installation, and the Perusion site also has online documentation. The documentation is also installed as a manpage.

$ cd /usr/local/src

$ /usr/bin/wget \
> http://search.cpan.org/CPAN/authors/id/M/MI/MIKEH/Snapback2-0.913.tar.gz

$ /bin/tar xvf Snapback2-0.913.tar.gz

$ cd Snapback2-0.913

$ /usr/bin/perl Makefile.PL

$ /usr/bin/make

$ /usr/bin/make test

$ /usr/bin/make install

On Fedora, you can set up Perl to get modules from CPAN:

$ /usr/bin/yum install perl-libwww-perl
$ /usr/bin/perl -MCPAN -e shell
> install Bundle::CPAN

Then installing a new module can be as simple as typing

$ /usr/bin/perl -MCPAN -e 'install Config::ApacheFormat'

To configure snapback2, you use an Apache-like configuration file, /etc/snapback/snapback.conf:

    Hourlies    6
    Dailies     7
    Weeklies    4
    Monthlies  12
    AutoTime   Yes

    AdminEmail webmaster@mysite.com
    LogFile    /home/myuser/var/log/snapback.log
    Exclude *debug
    Exclude core.*
    SnapbackRoot /etc/snapback
    RsyncShell '/usr/bin/ssh -i /home/myuser/.ssh/id_dsa -p 2222'

    Destination /home/myuser/mnt/backup1

    <Backup mysite.com>
      Directory /var/www/html/mysite/
    </Backup>

Note the RsyncShell directive used to set up an ssh connection with a key and port number.
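Before handing things over to snapback2, it can help to try the same ssh settings with a plain rsync dry run. The sketch below only assembles and prints such a command. The scratch target directory is a made-up path, and the flags are an assumption about what a reasonable manual test looks like, not a record of how snapback2 calls rsync.

```shell
# Values copied from the configuration above, except local_dir, which is a
# hypothetical scratch target for a manual test.
remote_host="mysite.com"
remote_dir="/var/www/html/mysite/"
local_dir="/home/myuser/mnt/backup1/manual-test/"
rsync_shell="/usr/bin/ssh -i /home/myuser/.ssh/id_dsa -p 2222"

# Build the command in the positional parameters so the quoted -e argument
# stays a single word. --dry-run makes rsync report what it would transfer
# without copying anything.
set -- rsync -a --dry-run \
  --exclude='*debug' --exclude='core.*' \
  -e "$rsync_shell" \
  "$remote_host:$remote_dir" "$local_dir"

# Print one argument per line for inspection; nothing is executed here.
printf '%s\n' "$@"
```

If the printed command looks right, running "$@" in the same shell performs the dry run against the remote server.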

Next we need to set up a cron job. See this intro if you are unfamiliar with cron.

    $ /usr/bin/crontab -e

Add the line

10 */4 * * * /usr/bin/snapback2

Note that the configuration file specifies Hourlies 6 and cron is set up to run every 4 hours, so the six hourly backups, taken four hours apart, cover 24 (6 * 4) hours.
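The arithmetic is easy to recheck if you pick a different schedule. In this small sketch the variable names are my own; only the two numbers come from the configuration file and the crontab line.

```shell
hourlies=6          # Hourlies value from snapback.conf
interval_hours=4    # hour step in the cron entry (*/4)

# Time span covered by the hourly snapshots, and how often cron fires per day.
coverage=$((hourlies * interval_hours))
runs_per_day=$((24 / interval_hours))

echo "hourly snapshots span ${coverage} hours; cron fires ${runs_per_day} times a day"
```

With a cron step of 2 and the same Hourlies 6, the hourly snapshots would span only 12 hours.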

Because cron runs the job unattended, with no one there to type a passphrase, the ssh key should not have a passphrase unless you are using something like keychain. If you do use a passphrase, be aware that the process will not survive a reboot automatically: you will have to reenter the passphrase after each reboot. Obviously, keys without a passphrase must be safeguarded.
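One basic safeguard is to make sure only the owning user can read the key. The sketch below uses a temporary directory and an empty stand-in file rather than a real ~/.ssh, so it is safe to run as is. It assumes GNU stat, as found on Fedora.

```shell
# Stand-in directory and key file so the sketch does not touch a real ~/.ssh.
demo_ssh_dir=$(mktemp -d)
key="$demo_ssh_dir/id_dsa"
: > "$key"

chmod 700 "$demo_ssh_dir"   # only the owner may enter the directory
chmod 600 "$key"            # only the owner may read or write the key

# Read the permissions back in octal to confirm them.
key_mode=$(stat -c '%a' "$key")
dir_mode=$(stat -c '%a' "$demo_ssh_dir")
echo "key mode: $key_mode, dir mode: $dir_mode"
```

The same two chmod commands apply to the real key named in the RsyncShell directive.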

In the next part, I’ll detail how to set up something very similar to snapback to back up a MySQL database using bash scripts on the local and remote servers.

Part I Part II Part IV
