Bacula
Overview
Bacula has four main components:
1. Director: the main server, which controls all the backup operations. Its configuration file is usually stored in /etc/bacula/bacula-dir.conf.
2. Storage Daemon: the service that writes the backups to the storage media. The Storage Daemon can be installed on the same machine as the Director. Its configuration file is /etc/bacula/bacula-sd.conf.
3. File Daemon: in a few words, this is the client on the machine we want to back up. It has to be able to communicate with the Director and the Storage Daemon.
4. Console, or Bacula Administration Tool (bat): the tool that we are going to use to administer Bacula instead of the Director's bconsole program. It needs to communicate with the Director.
Director Configuration
A backup for a client is defined as a job. A job combines a client definition, a FileSet, a Schedule and a Storage.
The Director configuration has to start with a Director resource, where we define the ports, the paths and the password that any console or Bacula Administration Tool must use to control the Director.
Director {                            # define myself
  Name = bacula-dir
  DIRport = 9101                      # where we listen for UA connections
  QueryFile = "/etc/bacula/scripts/query.sql"
  WorkingDirectory = "/var/lib/bacula"
  PidDirectory = "/var/run/bacula"
  Maximum Concurrent Jobs = 1
  Password = "sWaZRqTS1VX2uacyTZNXlI4Ax9Oj5MK1ANC62QBta1"   # Console password
  Messages = Daemon
}
From here we can start configuring jobs (which is the same as configuring a backup).
Here we define a job name and declare which Client and FileSet the job uses. We also tell the job to take the rest of its settings from a set of defaults.
Job {
  Name = "windows-xp-01"
  Client = "windows-xp-01"
  Fileset = "Full Windows"
  JobDefs = "DefaultJob"
  Write Bootstrap = "/var/lib/bacula/Client2.bsr"
}
These are the defaults that we are going to use for this set of jobs:
- a default Level, used if the job does not specify one;
- a default FileSet, used if the job does not specify one;
- a Schedule;
- the default Pools in which the Full, Incremental and Differential backups are stored.
JobDefs {
  Name = "DefaultJob"
  Type = Backup
  Level = Incremental
  Client = bacula-fd
  FileSet = "Full Set"
  Schedule = "10minutes"
  Storage = File
  Messages = Standard
  Pool = Default
  Priority = 10
  Full Backup Pool = Full-Pool
  Incremental Backup Pool = Inc-Pool
  Differential Backup Pool = Diff-Pool
}
So let's analyze the important things here: FileSet, Schedule, Storage and Pools.
The FileSet is the set of files we want to back up.
This is an example of a FileSet. The most important thing here is to specify the files to include and the files to exclude.
FileSet {
  Name = "Full Windows"
  Include {
    Options {
      signature = MD5
    }
    File = "C:/Documents and Settings/laura/Desktop"
    File = "C:/Documents and Settings/laura/My Documents"
    File = "C:/Documents and Settings/laura/Local Settings/Application Data/Microsoft"
  }
#
# If you backup the root directory, the following two excluded
#   files can be useful
#
  Exclude {
    file = "C:/Documents and Settings/laura/Local Settings/Temp"
  }
}
In the Schedule we say when to run the job. The following example makes an incremental backup every 10 minutes, a differential every hour and a full backup every 4 hours. It is only a test schedule and should not be used in production.
Schedule {
  Name = "10minutes"
  Run = Incremental hourly at 0:15
  Run = Incremental hourly at 0:25
  Run = Incremental hourly at 0:35
  Run = Incremental hourly at 0:45
  Run = Incremental hourly at 0:55
  Run = Differential hourly at 0:05
  Run = Full daily at 0:05
  Run = Full daily at 4:15
  Run = Full daily at 8:15
  Run = Full daily at 12:15
  Run = Full daily at 16:15
  Run = Full daily at 20:15
}
For production we should instead use an incremental at 17:05 from Monday to Saturday, a differential every Sunday night and a monthly full on the first Sunday of the month.
Schedule {
  Name = "WeeklyCycle"
  Run = Full 1st sun at 23:05
  Run = Differential 2nd-5th sun at 23:05
  Run = Incremental mon-sat at 17:05
}
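To double check which level the WeeklyCycle schedule picks on a given day, the three Run lines can be modeled in a few lines of Python. This is only a sketch of the calendar logic as described above; the function name backup_level is ours and is not part of Bacula.

```python
from datetime import date


def backup_level(day: date) -> str:
    """Return the level the WeeklyCycle schedule above would run on `day`."""
    if day.weekday() == 6:  # Sunday (Monday is 0 in Python)
        # The first Sunday of a month always falls on days 1 to 7;
        # any later Sunday is the 2nd to 5th one.
        return "Full" if day.day <= 7 else "Differential"
    return "Incremental"  # Monday to Saturday
```

For example, backup_level(date(2024, 9, 1)) falls on the first Sunday of September 2024, so it maps to the Full run.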
Finally we specify the Client. The most important things here are the IP address and the Catalog (the database where all operations are logged, so that the files can be restored later). In this case it is a MySQL database.
File Retention is how long the file records are kept in the catalog. Note that the file records may actually be retained for a shorter period than you specify on this directive if you specify either a shorter Job Retention or a shorter Volume Retention period.
AutoPrune automatically removes the expired records from the catalog once the retention period is over.
Client {
  Name = windows-xp-01
  Address = 172.16.121.129
  FDPort = 9102
  Catalog = MySQL
  Password = "ARm7qyF3lJSBKJ9NRICUj+Ry3PRJ3EZN92EnilHFVb8/"   # password for FileDaemon
  File Retention = 30 days            # 30 days
  Job Retention = 6 months            # six months
  AutoPrune = yes                     # Prune expired Jobs/Files
}
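The note about retention periods boils down to "the shortest period wins". A minimal sketch of that arithmetic, with all periods given in days; the helper name is hypothetical, and treating 6 months as 180 days is our simplification:

```python
def effective_file_retention(file_days: int, job_days: int, volume_days: int) -> int:
    """File records cannot outlive the job or the volume they belong to,
    so the shortest of the three retention periods is the effective one."""
    return min(file_days, job_days, volume_days)
```

With the Client above (30 days, 6 months) and the test pool that keeps volumes for only 1 day, effective_file_retention(30, 180, 1) gives 1 day, which is one more reason not to use the test pool in production.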
Then we come to the most important part of the Director configuration: the definition of the Storage Daemon that the job is going to use to store its files. We need to specify the password used to contact the Storage Daemon and the device on the SD that we are going to use.
# Definition of file storage device
Storage {
  Name = File
  Address = 172.16.121.138            # N.B. Use a fully qualified name here
  SDPort = 9103
  Password = "sL0Pjc3Nl1Je8zgVW3u8+0cI+vdIl5XdizCLU0v5dotr"
  Device = FileStorage
  Media Type = File
}
Then comes the Pool. A pool is the set of media or volumes that we are going to use for a job. The important thing here is the Volume Retention, which specifies how long we keep the data on the media. Recycle and AutoPrune are set to yes, so a volume is recycled once the Volume Retention has expired for all the jobs stored on it.
A new volume is started every 1000000 bytes (Maximum Volume Bytes), and the volumes are labeled "Full-xxxx" if automatic labeling is set up in the Storage Daemon. This is only an example and shouldn't be used in production.
Pool {
  Name = Full-Pool
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 1 day
  Maximum Volume Bytes = 1000000
  Label Format = Full-
}
In production, a good pool for full backups would instead have a 6 month retention and a 100 MB maximum volume size:
Pool {
  Name = Full-Pool
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 6 months
  Maximum Volume Bytes = 100000000
  Label Format = Full-
}
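To get a feeling for how many volume files a given Maximum Volume Bytes produces, we can estimate it with a ceiling division. This is a rough sketch only: it ignores the label and block overhead that every volume carries, so the real number may be slightly higher, and the function name is ours.

```python
def volumes_needed(backup_bytes: int, max_volume_bytes: int) -> int:
    """Estimate how many volumes a backup of `backup_bytes` will span."""
    if max_volume_bytes <= 0:
        raise ValueError("max_volume_bytes must be positive")
    # Ceiling division using integers only (no floating point rounding).
    return -(-backup_bytes // max_volume_bytes)
```

A 2.5 GB full backup written to the production pool above, volumes_needed(2_500_000_000, 100_000_000), spans about 25 volumes.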
Then we can configure to send alerts via email:
to be done
Storage Daemon Configuration
The Storage Daemon configuration is stored in /etc/bacula/bacula-sd.conf.
The SD (Storage Daemon) can run on the same machine as the Director or on a different one. The SD and the clients must also be able to reach each other over the network.
So let's start the configuration by defining the Storage Daemon:
Storage {                             # definition of myself
  Name = bacula-sd
  SDPort = 9103                       # Director's port
  WorkingDirectory = "/var/lib/bacula"
  Pid Directory = "/var/run/bacula"
  Maximum Concurrent Jobs = 20
}
Then we need to define which Directors are allowed to communicate with the SD. Note that here we must specify the same password that we have in the Storage definition of the Director configuration.
Director {
  Name = bacula-dir
  Password = "sL0Pjc3Nl1Je8zgVW3u8+0cI+vdIl5XdizCLU0v5dotr"
}
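A single wrong character in one of the two passwords is enough to make the Director fail to authenticate against the SD, and it is hard to spot by eye. The following is a small sketch of a check we could run over the two resources. It is not a Bacula tool: the regular expression only understands the simple Password = "..." form used on this page, and the sample passwords are placeholders.

```python
import re

# Matches a directive such as:  Password = "secret"
PASSWORD_RE = re.compile(r'\bPassword\s*=\s*"([^"]*)"', re.IGNORECASE)


def first_password(resource_text: str) -> str:
    """Return the value of the first Password directive in a resource."""
    match = PASSWORD_RE.search(resource_text)
    if match is None:
        raise ValueError("no Password directive found")
    return match.group(1)


def passwords_match(dir_storage_resource: str, sd_director_resource: str) -> bool:
    """Compare the Storage password on the Director with the Director password on the SD."""
    return first_password(dir_storage_resource) == first_password(sd_director_resource)


# Placeholder resources, standing in for bacula-dir.conf and bacula-sd.conf.
DIR_STORAGE = 'Storage {\n  Name = File\n  Password = "example-secret"\n}'
SD_DIRECTOR = 'Director {\n  Name = bacula-dir\n  Password = "example-secret"\n}'
SD_DIRECTOR_TYPO = 'Director {\n  Name = bacula-dir\n  Password = "example-secres"\n}'
```

Here passwords_match(DIR_STORAGE, SD_DIRECTOR) is true, while the variant with the typo is reported as a mismatch.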
Finally we need to define the Device. In this case we are going to back up to files in /var/bakula. When we reach the maximum volume size (Maximum Volume Bytes, defined in the Director's Pool) and no existing volume can be recycled because of its Volume Retention, a new volume (file) is created and automatically labeled using the Pool's Label Format.
Device {
  Name = FileStorage
  Media Type = File
  Archive Device = /var/bakula
  LabelMedia = yes;                   # lets Bacula label unlabeled media
  Random Access = Yes;
  AutomaticMount = yes;               # when device opened, read it
  RemovableMedia = no;
  AlwaysOpen = no;
}
Client (File Daemon) Configuration
The Bacula client is also known as the File Daemon. There are versions for Windows and Linux.
We need to make sure that the client (bacula-fd) we install has the same version as the Director it will communicate with.
In this setup we used the default Debian lenny Director, which is Bacula 2.4. So we need to install Bacula 2.4 on the clients.
On Linux, the easiest way is to search the repositories for the appropriate Bacula client (bacula-fd):
aptitude search bacula-fd
aptitude install bacula-fd
On Windows we can download the 2.4 installer from:
http://sourceforge.net/projects/bacula/files/Win32_64/2.4.5/winbacula-2.4.5.exe/download
The configuration is the same on Linux and on Windows.
The configuration file on Linux is stored in: /etc/bacula/bacula-fd.conf
The configuration file on Windows is stored in: C:\Documents and Settings\All Users\Application Data\Bacula\bacula-fd.conf
On Windows it can also be opened by going to "Start -> Run -> Bacula -> Configuration -> Edit Client Configuration".
The main thing here is to define a FileDaemon resource whose Name is the same as the Name of the Client resource in the Director configuration.
FileDaemon {                          # this is me
  Name = windows-xp-01
  FDport = 9102                       # where we listen for the director
  WorkingDirectory = "C:\\Documents and Settings\\All Users\\Application Data\\Bacula\\Work"
  Pid Directory = "C:\\Documents and Settings\\All Users\\Application Data\\Bacula\\Work"
  #Plugin Directory = "C:\\Program Files\\Bacula\\bin\\fdplugins"
  Maximum Concurrent Jobs = 5
}
The second important thing is to say which Director is allowed to access the File Daemon (client). The password here must be the same as the one in the Client resource of the Director configuration:
Director {
  Name = bacula-dir
  Password = "ARm7qyF3lJSBKJ9NRICUj+Ry3PRJ3EZN92EnilHFVb8/"
}
The Catalog
The catalog is the database where all the operations are stored. It allows Bacula to identify which volumes were used for each job, so it is important to keep this information in a safe place: if we lose it, our backup is useless.
Bacula works with several database backends. In my case I prefer to use MySQL because it is easy to maintain with a mysqldump every night.
So, to be able to recover the backup, we have to back up the following to a separate place:
- /etc/bacula/bacula-dir.conf
- /etc/bacula/bacula-sd.conf
- a dump of the bacula database from MySQL
On Debian it is easy to choose which database we want to use:
i   bacula-director-mysql     MySQL storage for Director
p   bacula-director-pgsql     PostgreSQL storage for Director
p   bacula-director-sqlite    SQLite storage for Director
c   bacula-director-sqlite3
By default a schema for the bacula database will be created.
To define how the Director connects to it, we put the following credentials in /etc/bacula/bacula-dir.conf:
Catalog {
  Name = MySQL
  dbname = bacula
  user = bacula
  password = "password"
  DB Address = 127.0.0.1
}
So, to maintain a backup of the catalog, we only need a cron job that runs the following dump and, for example, sends the result by email:
mysqldump -u bacula bacula > /tmp/bacula.mysql
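If we prefer to keep one dump per day instead of overwriting /tmp/bacula.mysql, the same command can be wrapped in a small script and called from cron. This is a sketch under assumptions: the function names, the target directory and the file naming are ours, and the MySQL credentials are expected to come from somewhere like ~/.my.cnf, since the command above passes no password.

```python
import subprocess
from datetime import date
from pathlib import Path


def dump_command(user: str = "bacula", database: str = "bacula") -> list[str]:
    """Build the same mysqldump call that is shown above."""
    return ["mysqldump", "-u", user, database]


def dump_catalog(target_dir: Path, today: date) -> Path:
    """Write the catalog dump to a dated file and return its path."""
    target = target_dir / f"bacula-{today.isoformat()}.mysql"
    with target.open("wb") as out:
        # check=True raises CalledProcessError if mysqldump fails,
        # so cron reports the error instead of keeping a silent empty dump.
        subprocess.run(dump_command(), stdout=out, check=True)
    return target
```

A cron entry would then call something like dump_catalog(Path("/var/backups"), date.today()), where /var/backups is just an example location that should live outside the Bacula server's own disks.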
Using the bat program to control Bacula
The Bacula Administration Tool (bat) is the easiest way to run jobs (backups and restores) and to check the Bacula settings. We cannot configure new jobs with it; this can only be done by editing bacula-dir.conf.
We run bat on a Linux machine with a graphical interface installed. There is also a client for Windows, but the Linux bat is more complete.
Installing
To install it on Debian or Ubuntu:
aptitude install bacula-console-qt
There is a configuration file where we need to set which Director we want to connect to and the password to connect with (the same as in the Director resource of bacula-dir.conf).
#
# Bacula Administration Tool (bat) configuration file
#
Director {
  Name = bacula-dir
  DIRport = 9101
  address = 172.16.121.138
  Password = "sWaZRqTS1VX2uacyTZNXlI4Ax9Oj5MK1ANC62QBta1"
}
To run the bat we just need to run:
bat -c /etc/bacula/bat.conf
To run a backup
By default, backups are run by a job that is scheduled to run at a certain date and time. But we can also force a manual backup.
To do this, just run bat from a Linux machine with a graphical user interface:
bat -c /etc/bacula/bat.conf
From there, click the icon with a blue wheel labeled "run job".
Then we can select the job, the client, the pool and the level of the backup we want to run. Please make sure that the Level matches the Pool. In a scheduled job the Level and the Pool are determined by the configuration (Schedule and JobDefs), but for a manual run we have to pay special attention to the pool that we select.
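The Level to Pool mapping to respect is the one from the DefaultJob JobDefs shown earlier. As a memory aid it can be written down as a small lookup; the helper name below is ours and has nothing to do with bat itself.

```python
# Level -> Pool, as configured in the "DefaultJob" JobDefs.
POOL_FOR_LEVEL = {
    "Full": "Full-Pool",
    "Incremental": "Inc-Pool",
    "Differential": "Diff-Pool",
}


def selection_is_consistent(level: str, pool: str) -> bool:
    """True if the pool chosen for a manual run is the one configured for the level."""
    return POOL_FOR_LEVEL.get(level) == pool
```

For instance, selection_is_consistent("Full", "Inc-Pool") is false: that run would write a full backup into the pool meant for incrementals.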
To restore only a file from a backup
Sometimes we only want to restore one small file or folder, for example after a user deleted it by mistake or the media got corrupted. To do this, just run bat from a Linux machine with a graphical user interface:
bat -c /etc/bacula/bat.conf
From there we go to the version browser, select a job and the appropriate client, set the file list to "any", and press the refresh button.
Then, in the jobs window, we will see the jobs that have been run. We select the first one and navigate to the file or folder we want to restore.
When we have selected everything we want to restore, we go to restore and select the client we want to restore to and the location. I prefer to restore to a different location by default, so that nothing is overwritten.
So if I want to restore
c:\Documents and Settings\user\Desktop
and I set the location (where) to restore the files to
c://bacula-restore/
it will create a folder c:\bacula-restore containing the appropriate restore tree, so the restore will look like:
c:\bacula-restore\C\documents and Settings\user\Desktop
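The resulting layout (restore location, then the drive letter as a folder, then the original path) can be reproduced with Python's pathlib to predict where a file will end up. This only models the behaviour described above; restored_path is our own helper, not a Bacula function.

```python
from pathlib import PureWindowsPath


def restored_path(original: str, where: str) -> PureWindowsPath:
    """Predict where `original` lands when restored under `where`."""
    src = PureWindowsPath(original)
    drive_letter = src.drive.rstrip(":").upper()  # "c:" becomes "C"
    # src.parts[0] is the anchor ("c:\\"); the rest are the folders below it.
    return PureWindowsPath(where, drive_letter, *src.parts[1:])
```

Calling restored_path(r"c:\Documents and Settings\user\Desktop", "c:/bacula-restore") gives the same tree as in the example above.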
To restore a whole set of files
bat -c /etc/bacula/bat.conf
Go into the console of bat and type restore.
A selection dialog will be shown. An easy option is 'select the latest backup for a client', but we can also choose 'select a backup of a client before a specified time'.
We select OK, and on the next screen we select the client and the appropriate 'File Set'.
Then we can mark the whole C drive and specify the location to restore to. By default the restore will overwrite anything, but we can also set a separate location such as c://bacula-restore/
We press OK.
Using the Director to Query and Start Jobs
To communicate with the director and to query the state of Bacula or run jobs, from the top level directory, simply enter:
./bconsole
Type help to see a list of available commands:
*help
  Command      Description
  =======      ===========
  add          add media to a pool
  autodisplay  autodisplay [on|off] -- console messages
  automount    automount [on|off] -- after label
  cancel       cancel [<jobid=nnn> | <job=name>] -- cancel a job
  create       create DB Pool from resource
  delete       delete [pool=<pool-name> | media volume=<volume-name>]
  disable      disable <job=name> -- disable a job
  enable       enable <job=name> -- enable a job
  estimate     performs FileSet estimate, listing gives full listing
  exit         exit = quit
  gui          gui [on|off] -- non-interactive gui mode
  help         print this command
  list         list [pools | jobs | jobtotals | media <pool=pool-name> | files <jobid=nn>]; from catalog
  label        label a tape
  llist        full or long list like list command
  memory       print current memory usage
  messages     messages
  mount        mount <storage-name>
  prune        prune expired records from catalog
  purge        purge records from catalog
  python       python control commands
  quit         quit
  query        query catalog
  restore      restore files
  relabel      relabel a tape
  release      release <storage-name>
  reload       reload conf file
  run          run <job-name>
  status       status [[slots] storage | dir | client]=<name>
  setdebug     sets debug level
  setip        sets new client address -- if authorized
  show         show (resource records) [jobs | pools | ... | all]
  sqlquery     use SQL to query catalog
  time         print current time
  trace        turn on/off trace to file
  unmount      unmount <storage-name>
  umount       umount <storage-name> for old-time Unix guys
  update       update Volume, Pool or slots
  use          use catalog xxx
  var          does variable expansion
  version      print Director version
  wait         wait until no jobs are running [<jobname=name> | <jobid=nnn> | <ujobid=complete_name>]
We can inspect the configuration of the Director with any of the show keywords:
*show help
Keywords for the show command are:
  directors
  clients
  counters
  devices
  jobs
  storages
  catalogs
  schedules
  filesets
  pools
  messages
  all
  help
File Set Examples
The following example shows how to back up the My Documents directory, the Desktop directory and the Outlook PST files for all users in C:/Documents and Settings, i.e. everything matching the Wild patterns in the FileSet at the end of this section.
To understand how this can be achieved, there are two important points to remember:
Firstly, Bacula walks over the filesystem depth-first starting from the "File =" lines. It stops descending when a directory is excluded, so you must include all ancestor directories of each directory containing files to be included.
Secondly, each directory and file is compared to the Options clauses in the order they appear in the FileSet. When a match is found, no further clauses are compared and the directory or file is either included or excluded.
The FileSet resource definition below implements this. First it excludes unwanted file types (mp3, exe, iso, ...).
Then it lists the directories that Bacula is allowed to descend into by including specific directories and files, and finally it excludes everything else.
FileSet {
  Name = "Full Windows2"
  Include {
    File = "C:/Documents and Settings"
    Options {
      ignoreCase = yes
      Exclude = yes
      WildFile = "*.tmp"
      WildFile = "*.mp3"
      WildFile = "*.mpeg"
      WildFile = "*.wma"
      WildFile = "*.exe"
      WildFile = "*.cda"
      WildFile = "*.avi"
      WildFile = "*.iso"
      WildFile = "*.lnk"
      WildFile = "*.vmdk"
      WildFile = "*.vmem"
      WildFile = "*.vmsn"
    }
    Options {
      signature = SHA1
      verify = s1
      IgnoreCase = yes
      RegExDir = "^C:/Documents and Settings/[^/]+$"
      WildDir = "C:/Documents and Settings/*/My Documents"
      Wild = "C:/Documents and Settings/*/My Documents/*"
      WildDir = "C:/Documents and Settings/*/Desktop"
      Wild = "C:/Documents and Settings/*/Desktop/*"
      WildDir = "C:/Documents and Settings/*/Local Settings"
      WildDir = "C:/Documents and Settings/*/Local Settings/Application Data"
      WildDir = "C:/Documents and Settings/*/Local Settings/Application Data/Microsoft"
      WildDir = "C:/Documents and Settings/*/Local Settings/Application Data/Microsoft/Outlook"
      Wild = "C:/Documents and Settings/*/Local Settings/Application Data/Microsoft/Outlook/*"
    }
    Options {
      ignoreCase = yes
      Exclude = yes
      Wild = "C:/Documents and Settings/*"
    }
  }
}
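The "first matching Options clause wins" rule is easier to follow with a small simulation. The sketch below is a deliberately simplified model, not Bacula's matcher: it uses Python's fnmatch, only a few of the patterns, treats WildFile, WildDir and Wild alike, and assumes that a path matching no clause is included.

```python
from fnmatch import fnmatchcase

# (exclude?, patterns), in the same order as the Options clauses above.
CLAUSES = [
    (True, ["*.tmp", "*.mp3", "*.exe", "*.iso"]),
    (False, [
        "C:/Documents and Settings/*/My Documents/*",
        "C:/Documents and Settings/*/Desktop/*",
    ]),
    (True, ["C:/Documents and Settings/*"]),
]


def is_backed_up(path: str) -> bool:
    """Apply the clauses in order; the first clause that matches decides."""
    lowered = path.lower()  # mimics ignoreCase = yes
    for exclude, patterns in CLAUSES:
        if any(fnmatchcase(lowered, pattern.lower()) for pattern in patterns):
            return not exclude
    return True  # assumption: nothing matched, so the path is included
```

With this model, a song.mp3 inside My Documents is rejected by the first clause, a report.doc in the same folder is accepted by the second, and anything else under C:/Documents and Settings falls through to the final exclude.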