Destinations

After reading this page, see links at the bottom for details on each destination type.
 
HashBackup uses a local backup directory to store backup metadata (file names, file sizes, etc.) in an encrypted database, hb.db.  The local backup directory is specified by -c, or ~/hashbackup is used if -c is omitted.  The encryption key is stored in key.conf in the backup directory.  Both files are created by the init command.

The backup command updates the database and creates one or more archive files in the backup directory, named arc.v.n, where v is the backup number and n is a counter starting at zero.  The default size for arc files is 100MB, controlled by config option arc-size-limit.
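
The arc.v.n naming scheme can be illustrated with a short Python sketch.  The helper parse_arc_name is hypothetical and not part of HashBackup; it assumes only the naming pattern described above.

```python
import re

# Hypothetical helper, not part of HashBackup: splits an archive file name
# of the form arc.v.n into (backup number v, counter n).
ARC_NAME = re.compile(r"arc\.(\d+)\.(\d+)")

def parse_arc_name(name):
    m = ARC_NAME.fullmatch(name)
    if m is None:
        raise ValueError(f"not an arc file name: {name!r}")
    return int(m.group(1)), int(m.group(2))

# n is a counter starting at zero, so this is the first arc file of backup 3.
print(parse_arc_name("arc.3.0"))
```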

There are 3 basic storage setups for HashBackup:

1. backup data is kept only in the local backup directory
2. backup data is kept locally and sent to remote storage
3. backup data is kept mostly on remote storage with a local cache

In the default configuration, backup data is kept only in the local backup directory.  If that's your goal, you're done and don't need to read on.

You may want to keep a copy of your backup locally, but not inside the local backup directory.  For example, your backup directory may be on a fast local SSD for good database performance, but you want backup data written to a directory on an NFS server.  This requires setting up a Dir destination, using the instructions below.  It is set up like remote storage but accessed through an ordinary mounted directory.

Backup to Remote Storage

It's a good idea to send backup data to one or more remote storage destinations to protect against loss of the local backup directory from theft or an on-site disaster.  Remote backup storage is set up by creating a dest.conf file in the local backup directory.  Backup data is sent to all remote storage services listed in dest.conf using multiple worker processes for each destination.  Transfers occur while the backup is running to minimize total backup time.

Keeping a Local Copy of Backup Data

When backup data is sent to remote storage, you have the option to keep a complete copy in the local backup directory, or only a partial cache.  Keeping the backup data local makes it much easier for HashBackup to manage remote storage, minimizes your remote storage costs, eliminates "stalls" during backup because of slower remote transfers, and makes restores much quicker.

Keeping a Local Cache of Backup Data

If it is not feasible to keep a complete copy of your backup in the local backup directory, HashBackup can also operate with a partial cache of backup data in the local backup directory.  One config option, cache-size-limit, controls the size of this local cache.  The default is -1, meaning to keep a copy of all backup data in the local backup directory.  Setting cache-size-limit to 5GB will limit the size of local backup data to 5GB.  It is recommended to set cache-size-limit as high as reasonable, because keeping more backup data locally allows HashBackup to better optimize remote storage costs and prevents backup "stalls" if files cannot be transferred to remote storage as fast as backup creates new arc files.  The setting is easily changed at any time, so this can be decided later.

The dest.conf file

The dest.conf text file describes a list of destinations, usually offsite, to receive copies of the backup.  The dest.conf file is created in the local backup directory (the -c directory) with a text editor, using these notes as a guide.  The dest.conf file is set up the same way whether you plan to keep a complete local copy or a partial cache.

IMPORTANT SECURITY PRECAUTIONS

1. Because the key.conf and dest.conf files contain password and key information, they are never copied to any remote destination.  Therefore, the HashBackup executable, key.conf, and dest.conf files should be copied to a safe place (or several safe places) in case your backup drive becomes inaccessible and you lose these critical files.

2. When HB creates the key.conf file, it sets permissions to read-only for the owner, with no rights for everyone else.  It is important for dest.conf to also have restrictive permissions because it contains passwords necessary to access remote services.  You can do this with the chown and chmod commands:

  $ chown root dest.conf    # or whatever id runs HashBackup
  $ chmod 600 dest.conf     # owner read/write only; a config file needs no execute bit
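
To confirm the result, a small Python check along these lines may help.  This is only a sketch: the function name is_private is made up for illustration, and it tests nothing more than whether group and others have any permission bits set.

```python
import os
import stat

def is_private(path):
    """Return True if group and others have no permissions on path."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0

if __name__ == "__main__":
    # ~/hashbackup is the default backup directory when -c is omitted.
    path = os.path.expanduser("~/hashbackup/dest.conf")
    if os.path.exists(path):
        print(path, "is private" if is_private(path) else "is accessible to others")
```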

General Concepts

As backup runs, the backup files created (arc files) are copied to every destination.  You can specify more than one destination in dest.conf.  Destinations should be listed in the order you want them used for a restore.  So for example, a local FTP server should be listed before Amazon S3.  The same type of destination can also be specified more than once, for example, two FTP servers could be listed.  Each destination must have a unique destname keyword, as this is how HashBackup tracks which files are on which destination.  Creating two destinations for the same physical server or remote storage service is fine so long as the storage itself does not overlap (other keywords control this).  HashBackup manages unique ID tags for each destination to prevent accidental overwriting of backup storage.

Unavailable or Failing Destinations

If a destination is not available for some reason during a backup, for example, a USB drive is not plugged in, an error is displayed and files will not be copied there.  The next time you do a backup and the destination is available, any missing or updated files will be copied to "catch up".  Review the onfail keyword in this situation.  If you have set up a limited cache with cache-size-limit, a failed destination may cause the local cache to fill up, which then causes backup to halt.

Adding Destinations to an Existing Backup

If you add a new destination to dest.conf, the next backup command will copy all backup data to the new destination, including data from previous backups.  If not all backup data is stored locally, because cache-size-limit is set, old backup data may have to be fetched first from remote destinations before it can be copied to new destinations.  The actual backup is stalled until the remote-to-remote copy is finished.  If cache-size-limit is -1 (all data kept in the backup directory), this "catch up" synchronization will occur during the backup.

Removing a Destination

To remove a destination (you no longer want data there), first use this command to clear all files from the destination:

  $ hb dest -c backupdir clear destname

Then immediately remove the destination's info from the dest.conf file, or add the off keyword.  If you run another hb command before editing dest.conf, HB may try to copy all the files back to the destination you just cleared.

Creating dest.conf

To set up remote destinations, use a text editor to copy example destinations to the dest.conf file in your -c directory.  Keep in mind that HashBackup downloads files from destinations based on their order in dest.conf, with the first destination having the highest download priority.

Conventions

The dest.conf text file has lines with a keyword and value.  The keyword is case-independent.  Comment lines begin with a # and are ignored, as are blank lines.  As explained below, some keywords are common to all destinations, some keywords are unique to only certain types of destinations, some keywords are required, and some are optional.
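
These conventions can be sketched as a tiny parser.  This is an illustration, not HashBackup's actual parser: the function name parse_dest_conf is invented, the sample values are placeholders, destination-specific keywords are left out, and rejecting keywords that appear before the first destname is an assumption.  It relies on the rule, described under destname, that destname begins a new destination.

```python
def parse_dest_conf(text):
    """Parse dest.conf-style text into a list of destination dicts."""
    dests = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                      # blank lines and comments are ignored
        parts = line.split(None, 1)
        keyword = parts[0].lower()        # keywords are case-independent
        value = parts[1].strip() if len(parts) > 1 else ""
        if keyword == "destname":
            dests.append({})              # destname begins a new destination
        if not dests:
            # Assumption: a keyword must belong to some destination.
            raise ValueError(f"{keyword!r} appears before any destname")
        dests[-1][keyword] = value
    return dests

sample = """
# two destinations; names and values are placeholders
destname nas
type dir
workers 1

DestName offsite
Type s3
off
"""
for dest in parse_dest_conf(sample):
    print(dest)
```

Destinations come back in file order, which matters because the first destination listed has the highest download priority.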

Keywords For All Destinations

destname
This keyword begins a new destination.  HB tracks destination contents using only this name.  Because of this, it is possible to create a "seed" backup to a USB hard drive plugged in to your computer, take that drive to a remote site, and change the other keywords to switch from a Dir type (local directory) to FTP for example.

IMPORTANT NOTE: do not change destname once you have backed up files.  If you do this with a limited cache (cache-size-limit is set >= 0), your backup will immediately become inaccessible, and the next backup command will fail because it tries to synchronize archives.  Change the name back to what it was when you made the backups.  If you change destname with cache-size-limit set to -1, which is the default, it will cause all of your previous archives to be uploaded again during the next backup.

off
Add this line to dest.conf to disable a destination.  To enable it, remove this line or comment it out, like #off.

type
Specifies the HB driver used to access the destination.  Every destination must have the type keyword.  Use the links at the bottom of the page for details and examples, but read this entire page first.

b2    - Backblaze B2
cf    - Rackspace Cloud Files
dav   - WebDAV
dir   - a directory (could be local, remote, USB stick, etc.)
do    - DreamHost DreamObjects (S3 clone)
ftp   - FTP server
ftps  - FTP with TLS (SSL); userid, password & commands are encrypted
gmail - Gmail email account (automatically sets server & port)
gs    - Google Storage (S3 clone)
imap  - IMAP (email) server
os    - OpenStack
rsync - rsync server
s3    - Amazon S3
shell - user-written driver, including Rclone
ssh   - sftp server over ssh

workers
Specifies the number of worker processes for a destination.  The default of 2 is usually fine, but you may want to specify 1 worker to decrease HB's network load, or may want to use more workers if you have a fast connection and want to transfer a lot of data quickly.

Setting workers too high is usually counterproductive, so it's recommended to increase it by 2 or 4, then measure your throughput to make sure it is actually beneficial.

Setting workers to 1 is recommended when debug is enabled.  Otherwise, workers' debug output overlaps and gets very confusing.  Setting workers to 1 may help prevent memory allocation problems in very low memory environments.

retry
Specifies the number of retries, the initial delay before a retry, and the delay factor for each retry.  If omitted, destinations will retry 8 times on errors (9 attempts altogether), delaying 5 seconds the first time, then multiplying the delay by 2 for each retry.  This is equivalent to retry 8,5,2 and gives a total retry period of around 20 minutes.  The delay between retries is limited to 20 minutes, so retry 10 retries for a total of about an hour.  Add 3 retries for every additional hour: retry 13 would retry for 2 hours, etc.  Up to 3 integers can be specified, with the defaults used for missing values.  Some destination types have an internal retry loop, usually only a minute or two, that may increase the total retry time.
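
The arithmetic behind these totals can be checked with a short Python sketch.  The function name retry_delays is made up for illustration, and the sketch assumes the 20-minute limit is applied to each individual delay; any internal retry loop in a destination driver is ignored.

```python
def retry_delays(retries=8, delay=5, factor=2, max_delay=20 * 60):
    """Return the delay in seconds before each retry."""
    delays = []
    for _ in range(retries):
        delays.append(min(delay, max_delay))   # assumed: cap applies per delay
        delay *= factor
    return delays

for n in (8, 10, 13):
    total = sum(retry_delays(n))
    print(f"retry {n}: {total} seconds, about {total / 60:.0f} minutes")
# retry 8: 1275 seconds, about 21 minutes
# retry 10: 3675 seconds, about 61 minutes
# retry 13: 7275 seconds, about 121 minutes
```

With the defaults, the delays are 5, 10, 20, 40, 80, 160, 320, and 640 seconds.  From the ninth retry on, each delay is capped at 1200 seconds, which is why three more retries add an hour.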

debug
This keyword takes an integer value, with higher numbers usually causing more debug output.  It can be useful when a destination is "acting up", but should not be used in production.  The special value 99 means that an error in the destination will cause a traceback rather than a retry.  This can be used to track down the cause of a difficult error.  It's advisable to set workers to 1 when enabling debug output. If workers is > 1, workers' debug output is interwoven and hard to follow.

rate
Specifies the maximum upload rate (outgoing bandwidth) for each worker, in bytes per second.  The minimum rate is 1024, since lower rates are probably typos.  A suffix can be used, for example, 500k or 500Kb.  If the rate keyword is not used (the default), there is no upload rate limit.  Setting workers to 1 may be useful when limiting upload rates.  If this is not done, a rate of 500k means each of the n workers can upload at this rate, so if they are all active, the max upload rate would be n times 500k.  If you have a low speed upload connection and want your network to be usable during uploads, set the rate at 25% of your max upload speed, run some tests, try 50% of your max, etc.
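
Because the limit applies per worker, it can help to work out the combined rate before choosing a value.  The Python sketch below uses invented function names.  Dividing the 25% starting point evenly among workers is one interpretation of the advice above, not something HashBackup does for you.

```python
def max_total_upload(rate_per_worker, workers=2):
    """Worst-case combined upload rate (bytes/sec) for one destination."""
    return rate_per_worker * workers

def starting_rate(link_bytes_per_sec, workers=2, fraction=0.25):
    """Per-worker rate so all workers together use `fraction` of the link.

    Assumption: the share is divided evenly among workers.  The result is
    never below 1024, the minimum rate accepted by the rate keyword.
    """
    return max(1024, int(link_bytes_per_sec * fraction / workers))

link = 1_250_000                      # a 10 Mbit/s uplink, in bytes per second
rate = starting_rate(link, workers=2)
print(rate, max_total_upload(rate, workers=2))
```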

A better alternative to using the rate keyword is to set up QoS (Quality of Service) on your router.  By telling your router your maximum upload bandwidth, it is able to allocate upload bandwidth fairly among all active connections.  The advantage is that if HB is the only process using the network, it can upload at full speed.  If other network connections become active, the router will make sure that each connection gets a share of the upload bandwidth.

maxsize
While the arc-size-limit config keyword can be used to limit the size of archive files, there is no way to limit the size of the hb.db.n files HB might create.  Some destinations, e.g., IMAP or WebDAV servers, have small limits like 25MB per file, and larger files cannot be uploaded.

The maxsize keyword can be used on a destination to put a hard limit on the size of files uploaded to that destination.  Any file exceeding this limit will be split into parts and each part uploaded as a file.  On retrieval, the parts are fetched and reassembled to create the original file.  The value can be an integer, meaning bytes, or can be a number (with optional decimal point) with a suffix like K, M, G, T, P, with optional B.  So 1.5KB means 1.5 times 1024, or 1536.
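
As a rough illustration of the value format and of splitting, here is a Python sketch.  Both function names are hypothetical.  The text above confirms only that K means 1024; treating the other suffixes as powers of 1024, accepting lower-case suffixes, and assuming each part is at most maxsize bytes are assumptions.

```python
import re

# Assumed multipliers: powers of 1024 for every suffix.
UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3,
         "T": 1024**4, "P": 1024**5}

def parse_size(text):
    """Convert a value such as '1536', '1.5KB', or '25M' to bytes."""
    m = re.fullmatch(r"(\d+(?:\.\d+)?)\s*([KMGTP]?)B?", text.strip(),
                     re.IGNORECASE)
    if m is None:
        raise ValueError(f"bad size: {text!r}")
    return int(float(m.group(1)) * UNITS[m.group(2).upper()])

def part_count(file_size, maxsize):
    """Number of uploaded parts, assuming each part is at most maxsize bytes."""
    return max(1, (file_size + maxsize - 1) // maxsize)

print(parse_size("1.5KB"))                               # 1536
print(part_count(60 * 1024**2, parse_size("25MB")))      # 3
```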

Ideally, the config option arc-size-limit would be smaller than maxsize to avoid splitting arc files into parts.  arc-size-limit may need to be quite a bit smaller than maxsize to avoid splitting, because backup might create arc files larger than arc-size-limit.  If the hard limit is 128MB for example, arc-size-limit should be set to something like 120MB to avoid splitting arc files unnecessarily.

onfail
If this keyword is not present on a destination and it fails, the backup will continue, an error will be counted, and the exit status will be non-zero.

If onfail ignore is used and a destination fails, the backup will continue, no error will be counted, and the exit status is not affected.  This is useful with two destinations used in rotation, for example, two USB drives are used but only one is present during the backup.

If onfail stop is used and a destination fails, the backup will stop immediately and the exit status will be non-zero.  This can be used when it is critical that the backup data is sent to the remote destination.
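
The three cases can be summarized in a small Python lookup.  This only restates the behavior described above; the names OUTCOMES and on_destination_failure are invented for illustration and are not part of HashBackup.

```python
# Maps the onfail setting (None when the keyword is absent) to
# (backup continues?, exit status becomes non-zero?).
OUTCOMES = {
    None:     (True,  True),    # default: continue, count an error
    "ignore": (True,  False),   # continue, exit status not affected
    "stop":   (False, True),    # stop immediately, non-zero exit status
}

def on_destination_failure(onfail=None):
    try:
        return OUTCOMES[onfail]
    except KeyError:
        raise ValueError(f"unknown onfail value: {onfail!r}") from None

for setting in (None, "ignore", "stop"):
    continues, nonzero = on_destination_failure(setting)
    print(f"onfail={setting}: continues={continues}, nonzero exit={nonzero}")
```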

randfail (testing only)
This keyword can be used to simulate remote failures.  The value is an integer 0-100 representing the percentage of requests that should fail: 25 means 1 out of 4 requests will fail, 50 means 1 of 2 will fail, 75 means 3 of 4 will fail, 100 means every request will fail.  Simulated failures do not generate remote traffic.  Destination threads will stop if all requests fail for one file.  Randfail is for testing HB's error recovery and should not be used in normal operation.

randwait (testing only)
This keyword can be used to simulate remote delays.  The value is an integer representing the maximum delay in seconds for a request.  A message is printed with the actual delay.  The delay occurs all at once at the beginning of the request.  The rate keyword can be used to simulate delays throughout a request.  Randwait is for testing HB and should not be used in normal operation.

timeout
Some destination types have a timeout setting, and this keyword sets the timeout.  The default is 300 (seconds), or 5 minutes.  Destinations can override this default.  For example, the default timeout for the rsync destination is 3600 (1 hour), because extremely slow rsync servers, like an NSLU2, may take a long time to do a checksum verification.

Destination-Specific Keywords

Each type of destination has keywords peculiar to that type.  These are documented in the examples for each destination type, listed in the links below.