BackerUpper

Written by Kevin L. Ellis

Download current release


        BackerUpper was designed to be a backup system that is very simple to set up and maintain.  I wanted a backup program that was easy to configure, flexible in what was and wasn't backed up, and would also perform incremental backups.  I couldn't find what I was looking for, so I created my own.

Features:

        BackerUpper is intended to be run by crond a little after midnight every day, and it backs up the previous day's changes.  This is how the program performs incremental backups: by finding files that were changed the day before.  For example, if it's 12:10AM on March 22nd, the program will zip up all the files that were changed on March 21st.  The program performs a full backup of all files on the first of every month.  So at 12:10AM on March 2nd BackerUpper is started by crond, sees that the day before was the 1st of the month, and performs a full backup.  The first time BackerUpper is run it will also perform a full backup.
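
        The schedule described above can be sketched in a few lines of shell.  This is only an illustration of the rule, not BackerUpper's actual code: backup_type is a hypothetical helper, it assumes GNU date (for the -d option), and it ignores the "first run is always full" case.

```shell
#!/bin/sh
# Hypothetical helper: report which kind of backup a run on RUN_DATE would make.
# RUN_DATE is a YYYY-MM-DD string. Assumes GNU date for the -d option.
backup_type() {
    run_date=$1
    # The backup covers the day before the run, so look at that day's number.
    target_day=$(date -d "$run_date 1 day ago" +%d)
    if [ "$target_day" = "01" ]; then
        echo full
    else
        echo incremental
    fi
}

backup_type 2002-03-22   # prints "incremental" (covers March 21st)
backup_type 2002-03-02   # prints "full" (covers March 1st)
```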

Programs needed by BackerUpper to operate properly:

        All three of these should already be installed on your system.  If not you can find them on freshmeat.net.


Setup:

        BackerUpper is easily set up by changing its configuration file, backerupper.conf.  This file is normally located in the /etc directory.  Each portion of the configuration file determines how BackerUpper operates.  There are only 3 configuration settings and 4 sections that you need to change in order to tell the program what files to back up.

BACKUP_DIR=/tmp

        This line tells BackerUpper where to create the backup file.  The backup file will be named for the month, day, and year that the backup relates to.  A full backup will be indicated as such.  For example, Feb.21.2002.full.zip is a full backup created on 2/21/2002.  A backup file named Feb.22.2002.zip is an incremental backup that contains files changed on 2/22/2002.  The default is /tmp, but if you want a special directory just for backups, set it to that (e.g. /backups).
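
        As a rough illustration of the naming scheme, the sketch below builds the same style of file name from a date.  backup_name is a hypothetical helper and assumes GNU date; whether BackerUpper zero-pads single-digit days is not stated here, so the %d padding is an assumption.

```shell
#!/bin/sh
# Hypothetical helper: build a backup file name in the Mon.DD.YYYY style.
# Usage: backup_name YYYY-MM-DD full|incremental   (assumes GNU date)
backup_name() {
    day=$1
    kind=$2
    # LC_ALL=C keeps the month abbreviation in English (Feb, Mar, ...).
    stamp=$(LC_ALL=C date -d "$day" +%b.%d.%Y)
    if [ "$kind" = "full" ]; then
        echo "$stamp.full.zip"
    else
        echo "$stamp.zip"
    fi
}

backup_name 2002-02-21 full          # prints Feb.21.2002.full.zip
backup_name 2002-02-22 incremental   # prints Feb.22.2002.zip
```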
 

TMP_DIR=/tmp

        This line tells BackerUpper what directory it should use for its temporary files.  If you don't have any other users accessing your machine then the /tmp directory is fine.  If it's a machine that other users can access and you don't want them to see your directory structure and files, this should point to a directory that only root can access.  In addition, if you use multiple configuration files, you should have a different TMP_DIR for each configuration.
 

FINAL_CMD=scp %s machine2:/backups/machine1

        This line is a final command that BackerUpper should run once the backup has been created.  The %s is the name of the backup file.  This command can be any shell command, or a script file if you'd like to do more complex post-backup processing.  The example above copies the backup across a network to another machine using scp.  If you want to make a duplicate copy of the backup on another hard drive that's mounted at /drive2, you could use this configuration:

FINAL_CMD=cp %s /drive2

        You should not put the directory indicated in BACKUP_DIR here, just use %s and BackerUpper will know where the backup file is located.
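
        To picture what the %s substitution amounts to, here is a minimal sketch using the shell's printf.  The variable values are made up for the example, and it assumes %s is replaced with the full path of the backup file; the program's real substitution code may differ.

```shell
#!/bin/sh
# Hypothetical values, as they might appear in backerupper.conf.
BACKUP_DIR=/tmp
FINAL_CMD='cp %s /drive2'

# Assumed: %s stands for the backup file including its BACKUP_DIR path.
backup_file="$BACKUP_DIR/Feb.22.2002.zip"

# printf treats the %s in FINAL_CMD as a placeholder and fills in the name.
final=$(printf "$FINAL_CMD" "$backup_file")
echo "$final"    # prints: cp /tmp/Feb.22.2002.zip /drive2

# A real run would then execute the result, for example: sh -c "$final"
```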
 

INCLUDE DIRECTORIES

        Lines after this heading should be directories you want backed-up. Not only will each directory be backed-up, but every directory below it will be backed-up too. So if you put in:

/home

        Any files in the /home directory will be backed-up, and every user's directory and its sub-directories will be backed-up too.
 

INCLUDE FILES

        Lines after this heading are single files that you want included in the backup. This allows you to back up a single file in any directory without backing-up every file in that directory. For instance, to back up the configuration file for BackerUpper you could put in:

/etc/backerupper.conf

        This would back up the configuration file without backing-up everything in /etc.
 

EXCLUDE DIRECTORIES

        Lines after this heading should be directories that you DO NOT want backed-up. Not only will each directory not be backed-up, but none of its sub-directories will be backed-up either. This is used with the INCLUDE DIRECTORIES section so you can back up just part of a directory hierarchy.

        For example, I want to back up all the users' directories, so I would put /home in the INCLUDE DIRECTORIES section, but I don't want to back up Netscape's cache directory, so I could add:

/home/user1/.netscape/cache
 

EXCLUDE FILES

        Lines after this heading are single files that you DO NOT want backed-up. This can be used to exclude a single file from a directory that you have indicated for backup in the INCLUDE DIRECTORIES section.
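
        The combined effect of the include and exclude sections can be approximated with find.  This is a sketch of the idea only, not necessarily how BackerUpper builds its file list; list_files is a hypothetical helper that handles one include directory, one excluded directory, and one excluded file.

```shell
#!/bin/sh
# Hypothetical helper: list_files INCLUDE_DIR EXCLUDE_DIR EXCLUDE_FILE
# Prints every regular file under INCLUDE_DIR, skipping the whole
# EXCLUDE_DIR subtree and the single EXCLUDE_FILE.
list_files() {
    # -prune stops find from descending into the excluded directory;
    # "! -path" drops the one excluded file from what is printed.
    find "$1" -path "$2" -prune -o -type f ! -path "$3" -print
}

# Demonstration on a throwaway tree that mimics the examples above.
demo=$(mktemp -d)
mkdir -p "$demo/home/user1/.netscape/cache"
touch "$demo/home/user1/notes.txt" \
      "$demo/home/user1/huge.mp3" \
      "$demo/home/user1/.netscape/cache/page.html"
# Lists only the notes.txt file.
list_files "$demo/home" "$demo/home/user1/.netscape/cache" "$demo/home/user1/huge.mp3"
rm -rf "$demo"
```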
 

        You can put comments in the configuration file by placing a # at the beginning of the line.  Also blank lines are recognized and skipped by BackerUpper when reading in the configuration file.


        Here's a sample configuration file:

BACKUP_DIR=/tmp 
TMP_DIR=/tmp
FINAL_CMD=scp %s machine2:/backups/machine1

INCLUDE DIRECTORIES
/home/user1

INCLUDE FILES
/etc/backerupper.conf

EXCLUDE DIRECTORIES
/home/user1/.netscape/cache

EXCLUDE FILES
/home/user1/huge.mp3

END
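
        To make the file layout concrete, here is a small sketch of a reader for this format.  It is not BackerUpper's parser; parse_conf is a hypothetical helper that skips comments and blank lines, stops at END, and labels every other line with the setting or section it belongs to.

```shell
#!/bin/sh
# Hypothetical helper: print each meaningful line of a backerupper.conf-style
# file, tagged with what kind of line it is.
parse_conf() {
    section=unknown
    # Strip trailing whitespace first, then classify each line.
    sed 's/[[:space:]]*$//' "$1" | while IFS= read -r line; do
        case $line in
            ''|'#'*) ;;                     # blank line or comment: skip
            END) break ;;                   # END closes the file
            'INCLUDE DIRECTORIES'|'INCLUDE FILES'|'EXCLUDE DIRECTORIES'|'EXCLUDE FILES')
                section=$line ;;            # remember the current section
            USE_BZIP2|USE_ZIP_UPDATE) printf 'flag: %s\n' "$line" ;;
            [A-Z]*=*) printf 'setting: %s\n' "$line" ;;
            *) printf '%s: %s\n' "$section" "$line" ;;
        esac
    done
}

# Demonstration on a temporary copy of a short configuration.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# sample
BACKUP_DIR=/tmp
FINAL_CMD=scp %s machine2:/backups/machine1

INCLUDE DIRECTORIES
/home/user1
EXCLUDE FILES
/home/user1/huge.mp3
END
EOF
parse_conf "$conf"
rm -f "$conf"
```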

        The configuration file, backerupper.conf, should be placed in /etc - this is where backerupper will look for the file.  If you'd like to place the file somewhere else or have multiple configurations, then just use the -d command-line option.  


Command-Line Options:

-h, -H, -?    Prints out a quick help screen

-F            Force a full backup to be performed

-I            Force an incremental backup to be performed all the time, even for 1st of the month

-i            Force an incremental backup of files that changed today

-d conf_dir            Directory where configuration files are located

        The -d conf_dir command-line option lets you have multiple backup configuration settings.  The conf_dir should be the directory where your configuration files are located.  Normally this defaults to /etc, but by using -d you can create multiple configurations.  Just create the directory where you want to put the configuration file and then create the backerupper.conf configuration file.

-t M/D/YYYY      Fake last time a backup occurred

        Used to create a backerupper.last file in the conf_dir directory.  See Advanced Features for info on its use.


        NOTE:  All command-line options MUST be specified separately!  You can't group them; each one must have its own dash '-' sign.  For example, you can't do: backerupper -di /etc/backups.  Instead use: backerupper -i -d /etc/backups.


Quick Start:

        BackerUpper is meant to be run as a cron job.  Here are the steps to quickly start using BackerUpper.
 

        1.        Decide what you want backed-up and modify the file backerupper.conf

        2.        Run the shell script backerupper.start which will perform the following steps:

                    1.    Move the executable backerupper to /usr/local/bin
                    2.    Move the configuration file backerupper.conf to /etc
                    3.    It will execute this command:

    echo "10 0 * * * /usr/local/bin/backerupper 1> /dev/null 2> /dev/null" >> /var/spool/cron/crontabs/root

                        This will add the appropriate statement to root's crontab file so that cron will run BackerUpper at 10 minutes after midnight every night.

                    4.    And lastly, it will execute this command:

         echo root >> /var/spool/cron/crontabs/cron.update

                     cron.update is the file cron looks for when there are changes to the crontab files.  This will tell cron that there are changes to the crontab file.
 

        After this everything should work without you having to do anything.  If your cron files are located elsewhere you might have to modify the script (this script works for Slackware Linux).  At 10 minutes after midnight BackerUpper will start up and realize that it hasn't been run before.  It will therefore perform a full backup of the directories and files you indicated.  Then every night after this it will perform an incremental backup of just those files that changed the day before.  At 10 minutes after the midnight that ends the 1st of every month (that is, at 12:10AM on the 2nd) BackerUpper will perform a full backup again.

        If you ever want to stop BackerUpper then edit the /var/spool/cron/crontabs/root  file and remove the line for BackerUpper.  Once that's done execute this at the command line:

echo root >> /var/spool/cron/crontabs/cron.update
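
        If you would rather not edit the crontab by hand, the removal can be sketched with grep.  remove_backerupper_line is a hypothetical helper, not part of the BackerUpper distribution; try it on a copy of the crontab first.

```shell
#!/bin/sh
# Hypothetical helper: delete every line that runs backerupper from the
# crontab file given as the first argument.
remove_backerupper_line() {
    crontab_file=$1
    tmp=$(mktemp)
    # grep -v keeps every line that does NOT mention the backerupper binary.
    # "|| true" covers the case where no lines are left over.
    grep -v '/usr/local/bin/backerupper' "$crontab_file" > "$tmp" || true
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$crontab_file"
    rm -f "$tmp"
}

# Demonstration on a scratch file rather than root's real crontab.
scratch=$(mktemp)
printf '%s\n' '0 4 * * * /usr/bin/other-job' \
              '10 0 * * * /usr/local/bin/backerupper 1> /dev/null 2> /dev/null' > "$scratch"
remove_backerupper_line "$scratch"
cat "$scratch"    # prints only the other-job line
rm -f "$scratch"
```

        On the real system you would pass /var/spool/cron/crontabs/root and then run the cron.update command shown above.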
 


Detailed Setup:

        The best way to set up BackerUpper is to first determine what directories/files you want to back up.  Use the du command to determine if there are any sub-directories containing lots of data that you didn't expect to be backed-up so that you can exclude them.  Sometimes "hidden" directories (directories that start with a dot ".") will contain lots of data that you really aren't interested in backing-up.  For example, Netscape's cache is inside of the .netscape directory in your home directory.
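
        One possible way to do this survey with du is sketched below.  biggest_dirs is a hypothetical helper; it lists the largest entries directly under a directory, including hidden ones, with the biggest first.

```shell
#!/bin/sh
# Hypothetical helper: biggest_dirs DIR [COUNT]
# Shows the COUNT largest entries (default 10) directly under DIR,
# with sizes in kilobytes, largest first.
biggest_dirs() {
    # The .[!.]* pattern picks up hidden entries such as .netscape;
    # errors from patterns that match nothing are discarded.
    du -sk "$1"/* "$1"/.[!.]* 2>/dev/null | sort -rn | head -n "${2:-10}"
}

# Example: biggest_dirs /home/user1 5
```

        Anything large and unwanted that turns up can then go into the EXCLUDE DIRECTORIES section.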

        Once you've decided on what to back up and configured backerupper.conf, go ahead and run the backerupper.start script.  Once this is done BackerUpper will start operating that night, but it's best to run backerupper for the first time from the command line.  The reason I suggest this is that there might still be some strange files that BackerUpper can't handle.  For example, the program dosemu sets up some pipe files in a hidden .dosemu directory.  Zip can't zip these files up, and when it tries it just "hangs" and doesn't do anything else.  When BackerUpper is run from the command line you can see if the program completes normally or stops on certain files.  If it exits properly then it shouldn't have any problems running from cron.
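
        Pipe files like the dosemu ones can also be tracked down ahead of time with find.  The sketch below uses a hypothetical helper, find_pipes, to list every named pipe under a directory so the results can be added to the EXCLUDE FILES section.

```shell
#!/bin/sh
# Hypothetical helper: list named pipes (FIFOs) under the given directory.
find_pipes() {
    # -type p matches named pipes, the kind of file zip can hang on.
    find "$1" -type p -print
}

# Demonstration on a throwaway directory.
demo=$(mktemp -d)
mkdir "$demo/.dosemu"
mkfifo "$demo/.dosemu/run-pipe"
touch "$demo/ordinary.txt"
find_pipes "$demo"    # lists only the run-pipe entry
rm -rf "$demo"
```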



Advanced Features:

USE_BZIP2

        This feature is commented out in the configuration file; to use it, uncomment the line.  When it is uncommented, BackerUpper will use zip to store the files in an uncompressed format (option -0) and will then compress that stored file with bzip2.  Most of the time this gives better compression than standard zip.
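
        Done by hand, the two steps would look roughly like this.  store_then_bzip2 is a hypothetical helper and the exact options BackerUpper passes, beyond -0, are an assumption; it needs zip and bzip2 installed.

```shell
#!/bin/sh
# Hypothetical helper: store_then_bzip2 ARCHIVE.zip PATH...
# Leaves ARCHIVE.zip.bz2 behind.
store_then_bzip2() {
    archive=$1
    shift
    # -0 stores files without compression, -r recurses into directories,
    # -q keeps zip quiet. bzip2 then compresses the whole archive and
    # replaces it with ARCHIVE.zip.bz2 (-f overwrites an older copy).
    zip -0 -q -r "$archive" "$@" && bzip2 -f "$archive"
}

# Example: store_then_bzip2 /backups/Feb.21.2002.full.zip /home/user1
```

        To restore, run bunzip2 on the .bz2 file first and then unzip the result.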

USE_ZIP_UPDATE

        This feature is commented out in the configuration file; to use it, uncomment the line.  When it is uncommented, BackerUpper will use zip with the -u option to update a single incremental file.  Instead of having an incremental backup file for each day of the month, BackerUpper will create a single incremental backup file and continually add files to it or update the files in it that have changed.  This is useful for backing up mail directories or log files.  Since these files change every day, they would otherwise be included in EVERY incremental file for the month.  With zip's -u option there is only a single incremental file for the month; every day, files already in the zip file are replaced with their new versions and newly created files are added.


-t M/D/YYYY

        When BackerUpper is run with this command-line option it will create a backerupper.last file in either the standard conf_dir or the one specified by -d.  This option could be used with the -I option to perform only incremental backups from a certain time until the present.  One use for this would be to keep a backup of all the changes made to a system, to assist in returning the system to an operational state very quickly.  For example, you want to build a firewall box, so you install the system software from a CD.  You also install other software packages after you've installed the system.  These files can be spread throughout the system, so it would be difficult to create a configuration file for them.  You could use the -t option and -I to capture only those files added to the system after the initial installation.




Tips & Tricks for Setting up BackerUpper:


        While BackerUpper is easy to set up, it is also very flexible.  To save the most space when doing backups it's best to have multiple configuration files for different directories, tailored to the data being backed up.  As an example I'll use my system's configuration to show how I have it set up.  Currently I use two configuration files, a main and a mail configuration.  The backerupper.conf for main looks like:

#main configuration
BACKUP_DIR=/root/backerupper/main
TMP_DIR=/etc/backerupper/main
USE_BZIP2
INCLUDE DIRECTORIES
/root/webserver
/root/.ssh
/home
/usr/local/apache2/conf
/usr/local/apache2/etc
/usr/local/apache2/cgi-bin
/usr/local/apache2/htdocs
/usr/local/src
/usr/local/bin
/etc
INCLUDE FILES
EXCLUDE DIRECTORIES
/home/pics  
/home/pvr
/home/g29115/coxbook.log
/home/g29115/w-lcrew.org.log
EXCLUDE FILES
END

        And the crontab entry looks like:

10 0 * * * /usr/local/bin/backerupper -d /etc/backerupper/main 1> /dev/null 2> /dev/null
           
        The crontab entry will run BackerUpper at 12:10AM every night and it tells BackerUpper to look in /etc/backerupper/main for the configuration file.  The configuration file places the backup files in /root/backerupper/main and puts the temporary files in /etc/backerupper/main - the same place the configuration file is located.  I use bzip2 for compressing the backups because it gives better compression, thus saving me space.


        I also have a separate backup configuration for the mail spools.  The backerupper.conf for mail looks like:


#mail backups
BACKUP_DIR=/root/backerupper/mail
TMP_DIR=/etc/backerupper/mail
USE_ZIP_UPDATE
INCLUDE DIRECTORIES
# mailboxes
/var/spool/mail
/var/spool/virtual
INCLUDE FILES
EXCLUDE DIRECTORIES
EXCLUDE FILES
END

        And the crontab entry looks like:

55 23 * * * /usr/local/bin/backerupper -i -d /etc/backerupper/mail 1> /dev/null 2> /dev/null
       
        The crontab entry here is different than for the main backups.  Here I'm running BackerUpper at 11:55PM every night, just before midnight.  I'm also telling BackerUpper to just perform an incremental backup of data that changed today and to look in /etc/backerupper/mail for the configuration data.  In the configuration file I have set up BACKUP_DIR and TMP_DIR similar to main, but here I'm using zip's -u update feature.  Now let me explain why I have it set up this way, since normally BackerUpper is supposed to run just after midnight, not before.

        I'm using the USE_ZIP_UPDATE feature to save a substantial amount of space on the backups.  Most people will get some e-mail every day, either good or spam - so the file will change every day.  If I did a normal incremental backup every day, then each day's incremental backup would contain a copy of the mail spool.  That is unnecessary when all I'm interested in is the very last backup.  So I use USE_ZIP_UPDATE, and this way the incremental backup file is just updated each day.

        I also use the -i option and the timing of the cron job for a specific reason.  There is a slight race condition that can occur if you don't take this approach.  Normally BackerUpper will back up files that have a last-modified date of the day before; that's why it's best to run the program just after midnight.  The race condition comes in if someone receives an e-mail between midnight and the time BackerUpper is run.  Imagine it's March 25th, 1 minute past midnight, and a new e-mail arrives.  It is written to the e-mail spool and the date of the file is changed to March 25th, the date of the last update.  BackerUpper is then run and determines that an incremental backup should take place and that any file with a date of March 24th (the day before) should be backed-up.  The e-mail spool won't be copied, because the new e-mail changed its date to March 25th.  The -i option, on the other hand, tells BackerUpper to perform an incremental backup of any files that have changed today.  Since I'm running it just before midnight (on March 24th) I will get a backup of any files that have changed today.
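
        The race can be reproduced with touch and find.  This is only a model of the date test, not BackerUpper's own code: changed_on is a hypothetical helper, and the -newermt test it relies on is an extension found in GNU and BSD find.

```shell
#!/bin/sh
# Hypothetical helper: changed_on DIR DAY NEXT_DAY
# Lists files under DIR last modified on DAY (dates as YYYY-MM-DD).
changed_on() {
    find "$1" -type f -newermt "$2" ! -newermt "$3" -print
}

demo=$(mktemp -d)

# The spool was last written at 11:30PM on March 24th.
touch -t 200203242330 "$demo/spool"
changed_on "$demo" 2002-03-24 2002-03-25    # lists the spool

# A new e-mail arrives at 12:01AM on March 25th, before the 12:10AM run.
touch -t 200203250001 "$demo/spool"
changed_on "$demo" 2002-03-24 2002-03-25    # lists nothing: the spool is missed

# Asking for "today" on the 25th (the -i style of backup) picks it up.
changed_on "$demo" 2002-03-25 2002-03-26    # lists the spool again

rm -rf "$demo"
```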


        If you have a certain situation and you aren't sure how to set up the program, let me know and we can figure something out.  By using the right combination of options, configuration files, and timings for when the backups occur, you can make sure you have a pretty up-to-date backup.




Last modified: 7/22/2002