README
- Name
- Version
- Synopsis
- Description
- Prerequisites
- Motivation
- How it works
- Example
- Backup steps
- Task work-daily
- Task work-weekly
- Remark
- Backup repositories
- Daily rotation
- Daily backup
- Weekly rotation
- Weekly backup
- Installation
- Configuration
- Configuration file
- Global section
- Paths to commands
- parameter: tasks
- parameter: exclude_file (optional)
- parameter: rsync_options (optional)
- Task sections
- parameter: mode
- parameter: source
- parameter: destination
- parameter: rotate
- parameter: rsync_options (optional)
- parameter: exclude_file (optional)
- Exclude files
- Usage
- Changes
- References
- Availability
- Author
Name
rsback -- Program to back up file trees in rotating archives on Unix-based hosts
Version
$Id: rsback,v 0.4.2 2002/10/26 00:04:02 hjb Exp $
Synopsis
To start one or more backup tasks:
rsback [options] list-of-tasks
To get help:
rsback -h
Description
rsback makes rotating backups using the common rsync program (http://rsync.samba.org) and some standard file utilities on Unix-based backup hosts. Its purpose is to mirror certain file trees from a remote host or from the local system and to store them as rotating archives in backup repositories on the local backup host. The file structure, permissions, ownerships and time stamps of the mirrored data are the same as in the original sources.
rsback is a front end to rsync, written in Perl, which allows a system administrator to configure and execute backups of different file trees located on remote hosts or on the local system (e.g. tasks for hourly, daily, weekly, or monthly backups).
If rsback is executed at regular intervals (preferably scheduled by cron jobs), it maintains rotating backup archives. To restore files from the backup repository no special restore procedure is necessary. To recover files or directories, you just copy them from the archive tree back to the original location or wherever you want to place them.
The combination of rsync's powerful capabilities and the extensive use of hard links for copying archives within the local file system results in a backup technique that is fast and saves disk space.
Prerequisites
rsback runs on Unix-based hosts. I tested it on some Linux boxes running RedHat 6.x and 7.x. It should also run on other Unix-based systems if the following programs and utilities are installed:
- rsync: I recommend the most recent version [1]. If you want to mirror file trees from remote Windows boxes you also need Cygwin [2].
- Perl 5.005 (or higher version) [3]
- The common file utilities cp, rm, mv, and mkdir [4]
- cron or similar program to execute scheduled commands (recommended)
- You should have some knowledge about rsync [5]
- You need root privileges to install and run rsback on a backup host
Motivation
I was looking for a backup solution suitable for a workgroup server (Linux box) where some project folders (10 to 20 Gigabytes) have to be mirrored daily.
For a while I tried several different backup techniques. But I was not really happy with any of them.
By accident I found rsync on my disk and tried to find out what it could be used for ... looks good ;)
Searching the net for a ready-to-use backup solution based on rsync, I found Mike Rubel's examples of rotating rsync snapshots [6]. It seemed to be a solution for my problem. To handle configurations of different backup tasks more comfortably I finally made a kind of front end or wrapper based on Mike's sample scripts:
The result was rsback (RSync BACKup ... hmm)
How it works
Example
The explanations below refer to a typical example like this:
We want to maintain a rotating backup repository of a file tree which resides on a remote host workbox. The remote host runs rsync in daemon mode on TCP port 873.
The file tree on workbox consists of all subdirectories and files of /var/projects, which is accessible via workbox::work. The corresponding entry in the rsyncd configuration file workbox:/etc/rsyncd.conf may look like this (the simplest case):
[work]
path = /var/projects
comment = project directories
Backup steps
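The backup concept of rsback is based on two steps: the rotation of the existing archives, followed by the backup of the source tree into the most recent archive.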
The repetitive combination of rotation and backup results in backup archives which are comparable to classic combinations of full and incremental backups with respect to the content of the archives.
Task work-daily
All files and directories under workbox::work (or workbox:/var/projects respectively) should be saved every workday night to our local machine backbox. The five latest daily backup sets should be kept in the backup repository on backbox.
Task work-weekly
Additionally, a weekly backup of the most recent local archive should be made on Saturday. The four latest weekly backup sets should also be kept in the repository on backbox.
That way we should have the data of the last five working days and the weekly snapshots of the last four weeks (each reflecting the state of the Friday night backup) in our backup repository.
Remark
Tasks are not restricted to be processed at daily or weekly intervals as in this example. It's up to you how often you perform backups and how many archives you keep in your repositories.
Backup repositories
Let us assume that our local host backbox has a large disk which is mounted on /backup. The directory /backup will hold our local backup repositories.
Archive structure
The backup repository on backbox in our example looks like this:
/backup
 +--/work
 |    history.work-daily      history of task work-daily
 |    history.work-weekly     history of task work-weekly
 |    +--/daily.0             most recent daily archive tree
 |    +--/daily.1  \
 |    +--/daily.2   |
 |    +--/daily.3   |-previous daily archives
 |    +--/daily.4   |
 |    +--/daily.5  /
 |    +--/weekly.0            most recent weekly archive tree
 |    +--/weekly.1 \
 |    +--/weekly.2  |-previous weekly archives
 |    +--/weekly.3  |
 |    +--/weekly.4 /
The directories ../daily.0 to ../daily.5 contain copies of the original data of the most recent daily backup run (daily.0), of the backup run one day before (daily.1), ..., and of the backup run five days ago (daily.5), respectively. The directories ../weekly.0 to ../weekly.4 are the archives of the most recent weekly task and of the previous weekly tasks, respectively.
History file
A history file for each backup task keeps track of the time stamps of the archives. A history file consists of a table of two (tab-separated) columns. For each consecutive backup run there is a row with the backup number in column one and the date and time in ISO format in column two:
# rsback-0.4.0 (hjb -- 2002-07-16)
0	2002-07-17 22:24:05
1	2002-07-16 22:24:13
2	2002-07-15 22:24:30
3	2002-07-12 22:25:28
4	2002-07-11 22:24:20
5	2002-07-10 22:24:16
6	2002-07-09 20:15:37
The history file is read before a backup task is processed. If no history file exists it will be created using the time stamps of the existing archive tree (if there is any). After the backup task has finished, the recent history will be written to the history file.
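The two-column layout is easy to process with standard tools. The following is a minimal sketch, not rsback's own Perl code: it writes a small file in the format described above and looks up the time stamp of a given backup number. The helper name history_lookup, the header text and the scratch file are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: mimics the two-column history format described above.
# The helper name and the scratch file are illustrative assumptions.
set -eu

hist=$(mktemp)

# Write a header comment plus three rows: backup number, TAB, ISO time stamp.
{
    printf '# example history\n'
    printf '0\t2002-07-17 22:24:05\n'
    printf '1\t2002-07-16 22:24:13\n'
    printf '2\t2002-07-15 22:24:30\n'
} > "$hist"

# Print the time stamp recorded for a given backup number.
# Comment lines (starting with "#") are skipped.
history_lookup() {
    awk -F '\t' -v n="$1" '!/^#/ && $1 == n { print $2 }' "$hist"
}

newest=$(history_lookup 0)
oldest=$(history_lookup 2)
echo "archive 0 was written at $newest"
echo "archive 2 was written at $oldest"
```

A lookup like this can help to decide which archive directory to restore from when you know roughly when a file was still intact.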
Daily rotation
When a backup task is executed, the previous backup archives in the repository are rotated first: the oldest archive is removed and the remaining ones are shifted by one position. In our example:
rm -rf daily.5
mv daily.4 daily.5
mv daily.3 daily.4
mv daily.2 daily.3
mv daily.1 daily.2
The backup set daily.1 is then recreated as a tree of hard links to the most recent backup set daily.0:
cp -al daily.0 daily.1
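The rotation can be tried out safely in a scratch directory. The sketch below is not taken from rsback itself; it only replays the commands above in a loop on dummy archives. It assumes GNU cp (for the -l option) and a file system that supports hard links. The variable names and the marker files are made up for this illustration.

```shell
#!/bin/sh
# Sketch of the rotation step in a scratch directory; not rsback's own code.
# Assumes GNU cp (for -l) and a file system that supports hard links.
set -eu

repo=$(mktemp -d)
keep=5    # number of previous archives to keep, as in "rotate = daily 5"

# Create dummy archives daily.0 .. daily.5, each holding one marker file.
i=0
while [ "$i" -le "$keep" ]; do
    mkdir "$repo/daily.$i"
    echo "backup $i" > "$repo/daily.$i/marker"
    i=$((i + 1))
done

# Drop the oldest archive, then shift the others up by one position.
rm -rf "$repo/daily.$keep"
i=$keep
while [ "$i" -gt 1 ]; do
    mv "$repo/daily.$((i - 1))" "$repo/daily.$i"
    i=$((i - 1))
done

# Recreate daily.1 as a tree of hard links to daily.0.
cp -al "$repo/daily.0" "$repo/daily.1"

# Print the inode number of a file (same inode = same data on disk).
inode() { ls -di "$1" | awk '{ print $1 }'; }

echo "daily.5 now holds: $(cat "$repo/daily.5/marker")"
echo "inode of daily.0/marker: $(inode "$repo/daily.0/marker")"
echo "inode of daily.1/marker: $(inode "$repo/daily.1/marker")"
```

After the run, daily.0/marker and daily.1/marker share one inode, so the second copy occupies no additional data blocks until the file is replaced in daily.0.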
Daily backup
Using rsync, the source tree is mirrored from a remote or local file system to the local backup repository. By default, only those files and directories are copied that differ from their counterparts in the backup repository. "Different" means that the size, time stamp, or ownership of a file or directory has changed since the last backup to the same repository, or that the file or directory does not (yet) exist in the repository. Items in the backup repository that do not exist in the source tree are removed from the repository.
This action is launched by invoking rsync like
rsync -al --delete <source> <destination>
Weekly rotation
This is done in the same manner as the daily rotation, except that (in our example) the archives from weekly.0 to weekly.4 are rotated.
Weekly backup
We want to make a snapshot of the most recent daily backup archive in our backup repository. Both the source and the destination are local directories. Therefore this backup is executed by hard-linking daily.0 to weekly.0:
rm -rf weekly.0
cp -alf daily.0 weekly.0
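A hard-linked snapshot keeps its content even when the daily archive changes later, as long as changed files are replaced rather than overwritten in place. The sketch below illustrates this in a scratch directory. It is an assumption-laden illustration, not rsback's code: it assumes GNU cp and hard link support, and it simulates rsync's default update behaviour (write a new file, then rename it into place) with plain shell commands. The file names are invented.

```shell
#!/bin/sh
# Sketch of a link-mode snapshot in a scratch directory; not rsback's own code.
# Assumes GNU cp and a file system that supports hard links.
set -eu

repo=$(mktemp -d)
mkdir "$repo/daily.0"
echo "version 1" > "$repo/daily.0/report.txt"

# Snapshot: weekly.0 becomes a tree of hard links to daily.0.
rm -rf "$repo/weekly.0"
cp -alf "$repo/daily.0" "$repo/weekly.0"

# Simulate a later update of daily.0 the way rsync does it by default:
# write a new file next to the old one, then rename it into place.
echo "version 2" > "$repo/daily.0/.report.txt.new"
mv "$repo/daily.0/.report.txt.new" "$repo/daily.0/report.txt"

daily=$(cat "$repo/daily.0/report.txt")
weekly=$(cat "$repo/weekly.0/report.txt")
echo "daily.0:  $daily"
echo "weekly.0: $weekly"
```

Editing a file inside an archive in place (for example with a text editor that does not create a new file) would change every archive that links to it, which is one more reason to treat the repository as read-only.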
Installation
To install rsback on a backup host, log in as root and proceed as follows.
Copy the downloaded archive rsback-x.y.z.tar.gz (x.y.z is the actual version) to an installation directory, e.g. /usr/local/src. Change to this directory and unpack the archive:
# cd /usr/local/src
# tar zxvf rsback-x.y.z.tar.gz
Copy rsback to a bin directory in root's path, e.g.
# cp rsback-x.y.z/bin/rsback /root/bin
Make sure that rsback is executable only by root:
# chmod 700 /root/bin/rsback
Create a configuration directory and copy the sample configuration files from ../rsback-x.y.z/etc into it:
# mkdir /etc/rsback
# cp rsback-x.y.z/etc/* /etc/rsback
Make sure that only root has access to rsback.conf:
# chown root:root /etc/rsback/rsback.conf
# chmod 600 /etc/rsback/rsback.conf
Now you may delete the archive:
# rm rsback-x.y.z.tar.gz
Configuration
Some configuration parameters are simply passed as options to rsync. It is therefore strongly recommended that you consult the rsync documentation [5] and the man pages (rsync(1), rsyncd.conf(5)) if you are not sure what rsback does.
Before you run your configuration with production data, make some tests with dummy data first. Compare the results carefully with what you expected.
You should consider some general precautions if your machines can be accessed by anyone other than you.
- Don't allow data transfers to and from remote machines without authentication or other access restrictions.
- Don't transfer clear text passwords.
- Don't transfer unencrypted sensitive data.
- Don't give write access to the backup repositories to anyone other than root@backupbox.
- Don't give read access to backup repositories to anyone other than the owner of the original data.
- Don't give read or even write access to rsback.conf to anybody other than root@backupbox:
# chown root:root /etc/rsback/rsback.conf
# chmod 600 /etc/rsback/rsback.conf
Configuration file
Edit rsback.conf to customize rsback and to define your backup tasks.
Default location
If you want to have the default configuration file somewhere other than /etc/rsback/rsback.conf, edit the variable $rsback_conf in rsback to match your preferences. Or use option -c to tell rsback where to find the configuration file (see Usage).
File format
The file format is similar to that of rsyncd.conf(5).
The file is line-based, that is, each newline-terminated line represents either a comment, a section name or a parameter. Any line beginning with a hash (#) or a semicolon (;) is ignored, as are lines containing only whitespace. The file consists of sections and parameters. A section begins with the name of the section in square brackets and continues until the next section begins. Sections contain parameters of the form name = list-of-values, where list-of-values is a list of one or more strings.
Global section
In the section [global] some general configuration parameters are defined. Unless explicitly noted as optional, all parameters are mandatory.
Paths to commands
rsback needs to know where to find some programs. Set the paths with the parameters rsync_cmd, cp_cmd, mv_cmd, rm_cmd, and mkdir_cmd according to your system. The default settings in the sample configuration file that comes with rsback are:
- parameter: rsync_cmd
  rsync_cmd = /usr/bin/rsync
- parameter: cp_cmd
  cp_cmd = /bin/cp
- parameter: mv_cmd
  mv_cmd = /bin/mv
- parameter: rm_cmd
  rm_cmd = /bin/rm
- parameter: mkdir_cmd
  mkdir_cmd = /bin/mkdir
parameter: tasks
tasks is a list of all backup tasks you want to execute. A backup task in this context is just an arbitrary word that denotes a certain backup job. The specific parameters of each backup task listed in tasks have to be defined in a separate task section.
parameter: exclude_file (optional)
exclude_file points to a file containing global exclude patterns for rsync. "Global" means that these patterns are applied to all backup tasks which are executed with mode=rsync (see Task sections). Please refer to the rsync documentation (look for "exclude patterns") or to the man page (rsync(1)). The value given here is passed unchanged to rsync with the command option --exclude-from.
parameter: rsync_options (optional)
The optional parameter rsync_options defines additional options which will be passed to rsync. For example, you may choose
rsync_options = --stats
to tell rsync to report some statistics on the file transfer. This parameter applies to all backup tasks. You can also define additional options which will only be applied to certain tasks within the task sections.
Task sections
Parameters specific to certain backup tasks are declared within corresponding task sections. There should be one task section for each backup task listed with the global parameter tasks (see Global section). E.g., if you have declared
[global]
tasks = work-daily work-weekly misc
the task sections
[work-daily]
.
.
[work-weekly]
.
.
[misc]
.
.
must be present.
parameter: mode
This parameter controls which backup mode is used to execute this task. Use mode=rsync if you want to back up the original source tree, either from a remote host or from the local machine, using rsync. mode=link is intended to be used for local copies on the backup host. This only makes sense if both the source and the destination reside on the same physical partition, because hard links will be used.
parameter: source
source designates the location of the source data to be saved. The format depends on the backup mode and the location of the source files. This parameter will be passed as the source to rsync if mode=rsync is selected, or to cp if mode=link is selected. Please refer to the man pages rsync(1) and cp(1) to select the right one for your purpose.
E.g. if the source data resides on the remote host workbox which is running rsync in daemon mode (as in our example above), then source is something like this:
source = workbox::work/
If mode=link, the parameter source designates the source directory on the local host. The task work-weekly in our example above needs a line like
source = /backup/work/daily.0
in its task section.
parameter: destination
destination is the directory within the local backup repository. It is a good idea to use directory names in the destination path which can easily be related to a backup task (or vice versa). E.g. if we refer to the task work-daily of our example, then it is something like
destination = /backup/work
The definition for the task work-weekly of our example is also
destination = /backup/work
This may be confusing, but keep in mind that the final archive directory will always be a subdirectory of this path, named according to the first value of the rotate parameter (see below).
parameter: rotate
This parameter consists of a list of two values: the first value is an arbitrary name to designate the archive directory in the local repository. The second value is a positive integer which defines how many backup sets have to be kept in the repository.
Example:
rotate = daily 5
parameter: rsync_options (optional)
Same as parameter rsync_options in the [global] section, but applies only to this task.
parameter: exclude_file (optional)
This parameter has the same purpose as in the global section. The only difference is that it is applied to this task only (see also below).
Example:
exclude_file = /etc/rsback/work-daily.exclude
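Putting the parameters together, a configuration for the two example tasks might look like the file written by the sketch below. This is an illustration assembled from the examples in this README, not the sample file shipped with rsback. In particular, "rotate = weekly 4" is inferred from the weekly.0 to weekly.4 layout shown earlier, and the spacing around "=" follows the other examples. The small check at the end is likewise only an illustration of the rule that every task listed in tasks needs its own section.

```shell
#!/bin/sh
# Sketch: writes a sample rsback.conf to a scratch file and checks that every
# task named in "tasks" has its own section. Values are taken from the examples
# in this README; "rotate = weekly 4" is an inferred assumption.
set -eu

conf=$(mktemp)
cat > "$conf" <<'EOF'
[global]
rsync_cmd = /usr/bin/rsync
cp_cmd = /bin/cp
mv_cmd = /bin/mv
rm_cmd = /bin/rm
mkdir_cmd = /bin/mkdir
tasks = work-daily work-weekly
rsync_options = --stats

[work-daily]
mode = rsync
source = workbox::work/
destination = /backup/work
rotate = daily 5
exclude_file = /etc/rsback/work-daily.exclude

[work-weekly]
mode = link
source = /backup/work/daily.0
destination = /backup/work
rotate = weekly 4
EOF

# Extract the value of the global "tasks" parameter.
tasks=$(awk -F ' *= *' '$1 == "tasks" { print $2 }' "$conf")

# Collect tasks that have no matching [section] line.
missing=""
for t in $tasks; do
    grep -q "^\[$t\]\$" "$conf" || missing="$missing $t"
done
echo "tasks: $tasks"
echo "missing sections:${missing:- none}"
```

Before using such a file with real data, run the tasks against dummy data first, as recommended in the Configuration section.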
Exclude files
Patterns to exclude files or directories from being rsync'd are collected in separate files, see parameter exclude_file above. Because these exclude files are passed directly to rsync with the option --exclude-from=FILE, they must have the format that rsync expects. Please consult the section "EXCLUDE PATTERNS" in rsync(1).
Global and task-specific exclude files are cumulative: both the exclude patterns in the global exclude file and the patterns in the exclude file defined in a task section will be applied to the source tree when a backup task is processed.
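As an illustration of cumulative exclude files, the sketch below writes a global and a task-specific exclude file and prints an rsync command line that uses both. The file names, patterns and paths are invented for this example, and the command line only shows how the same effect can be obtained with plain rsync; how rsback assembles its own command line internally is not shown here. The command is printed, not executed.

```shell
#!/bin/sh
# Sketch: two exclude files and a matching rsync command line.
# File names, patterns and paths are illustrative assumptions.
set -eu

dir=$(mktemp -d)

# Global patterns, meant for every task with mode=rsync.
cat > "$dir/global.exclude" <<'EOF'
*~
*.tmp
EOF

# Patterns for one task only; a leading "/" anchors the pattern at the
# transfer root, a trailing "/" restricts the match to directories.
cat > "$dir/work-daily.exclude" <<'EOF'
/cache/
EOF

# rsync accepts --exclude-from more than once; the pattern lists add up.
cmd="rsync -al --delete --exclude-from=$dir/global.exclude --exclude-from=$dir/work-daily.exclude workbox::work/ /backup/work/daily.0"
echo "$cmd"
```

Adding rsync's --dry-run option to such a command is a convenient way to check which files the patterns actually exclude before any data is copied.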
Usage
Starting backup tasks
To start a backup task invoke
# rsback [options] task-list
where task-list is a list of one or more backup tasks as defined in the configuration file.
The possible options are:
- -h: Display a help message (usage).
- -v: Be verbose.
- -d: Run rsync with option --dry-run (simulation mode). That means: rsync does not copy anything, it just displays what it would do.
- -i: Initialize the backup repositories to be used for the specified tasks. This isn't really necessary, because rsback will try to create the necessary directories when a backup task is processed and its backup repository does not yet exist.
- -c configuration-file: Use the given configuration file instead of the default one.
Example:
rsback -vc /etc/rsback/test.conf work-daily misc
Scheduling backup tasks
rsback is meant to be executed by cron jobs at regular intervals. The crontab entries for our example may look like this:
0 22 * * 1-5 /root/bin/rsback -v work-daily >>/var/log/rsback/work-daily.log
0 22 * * 6 /root/bin/rsback -v work-weekly >>/var/log/rsback/work-weekly.log
The daily backup task work-daily will be executed every workday night at 22:00. The weekly backup task work-weekly is processed on Saturday night at 22:00.
Changes
see CHANGELOG
References
- [1] rsync: http://rsync.samba.org
- [2] Cygwin: http://cygwin.com
- [3] Perl: http://www.perl.com
- [4] file utilities: http://www.gnu.org/software/fileutils/fileutils.html
- [5] rsync documentation: http://rsync.samba.org/documentation.html
- [6] Mike Rubel's examples of "rotating rsync snapshots": http://www.mikerubel.org/computers/rsync_snapshots
Availability
http://www.pollux.franken.de/hjb/rsback
Author
Copyright (C) 2002 by Hans-Juergen Beie <hjb@pollux.franken.de>
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.