Config

Sets and displays configuration options. A history of all options used for each backup is stored in the backup database.

$ hb config [-c backupdir] [-r version] [key] [value]

Examples

Display current configuration for the next backup:
$ hb config -c backupdir

Display the configuration used for backup version 5:
$ hb config -c backupdir -r5

Set arc-size-limit to 1gb (used for the next backup):
$ hb config -c backupdir arc-size-limit 1gb

Display arc-size-limit:
$ hb config -c backupdir arc-size-limit

Display arc-size-limit used for backup version 5:
$ hb config -c backupdir -r5 arc-size-limit

Config changes take effect on the next backup. To send config options to remotes immediately, use:
$ hb dest -c backupdir sync
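
The two steps above (set an option, then push it to remotes) can be combined in a small wrapper. This is a minimal sketch, not part of HashBackup: the function name set_and_sync and the HB variable (which program to run, defaulting to hb) are assumptions made for illustration.

```shell
# Hypothetical wrapper: set one config option, then send config to remotes.
# HB is an assumed override for the program name; it defaults to "hb".
set_and_sync() (
    backupdir=$1
    key=$2
    value=$3
    # Only sync if the config change itself succeeded.
    "${HB:-hb}" config -c "$backupdir" "$key" "$value" &&
        "${HB:-hb}" dest -c "$backupdir" sync
)
```

For example, set_and_sync backupdir arc-size-limit 1gb would run the config command followed by the dest sync command.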

Config Options (Keywords)

admin-passphrase - an admin passphrase can restrict certain actions unless the passphrase is entered. This option is used to protect the config data from changes, to protect dest.conf data stored in the database (see Dest command), and with the enable-commands and disable-commands options. The new admin passphrase cannot be listed on the command line. Instead, wait for the prompt to enter the new passphrase twice. After an admin passphrase is set, the environment variable HB_ADMIN_PASSPHRASE can be used to pass the admin passphrase to HashBackup for automated operations.

It is possible to specify a new admin passphrase with environment
variables. For example, to set a passphrase on a backup that doesn't
currently have one, use:
    $ HB_NEW_ADMIN_PASSPHRASE=newp hb config -c backupdir admin-passphrase
To change a passphrase, use:
    $ HB_ADMIN_PASSPHRASE=oldp HB_NEW_ADMIN_PASSPHRASE=newp hb config -c backupdir admin-passphrase
WARNING: environment variables are useful for automation but may
         have security risks that could expose sensitive data to
         other locally running processes.

IMPORTANT: do not put the admin passphrase on a command line as it can be viewed by other processes.

arc-size-limit - specifies the maximum size of archive files. The default is 100mb. The minimum size is 10k, used for internal testing. HashBackup may go slightly over the limit specified, so if you need a hard limit for some reason, specify 5-10MB less. At least 2 x arc-size-limit bytes of free disk space will be required in the local backup directory.
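
The free-space rule above can be checked before a backup with a short script. This is only a sketch: both function names are hypothetical, the limit is passed in plain bytes, and treating 100mb as 100,000,000 bytes is an assumption.

```shell
# Free space needed in the local backup directory: 2 x arc-size-limit (bytes).
required_free_bytes() (
    echo $(( 2 * $1 ))
)

# Hypothetical check: succeeds if directory $1 has room for limit $2 (bytes).
# Uses POSIX df: -P for one line per filesystem, -k for 1024-byte units.
has_room_for_arcs() (
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ $(( avail_kb * 1024 )) -ge "$(required_free_bytes "$2")" ]
)
```

With the default limit, required_free_bytes 100000000 prints 200000000, and has_room_for_arcs backupdir 100000000 succeeds only when that much space is available.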

For multi-TB backups with mostly large files that do not change often, a large archive size may be more efficient. A practical limit is around 4GB because many online storage services do not allow file sizes over 5GB without special handling, and a limit higher than 4GB will double RAM usage when creating restore plans if cache-size-limit is set because 64-bit file offsets are required. Be aware that some destinations have a maximum upload file size.

For VM image backups with a lot of dedup, a high rate of change, and/or small block sizes, a smaller arc size may be more efficient because it can be packed more often. A smaller arc size may reduce disk space cache requirements during a restore, if local disk space is tight. This is only a concern when cache-size-limit is >= 0 and the destination does not support selective downloads (retrieving parts of arc files), so entire arc files are downloaded.

As user files are removed from older backups with the rm or retain commands, "holes" are created in arc files. If enough data is removed, more than pack-percent-free, archives are automatically packed to remove empty space (see pack-* options). If all data from an archive is removed, the archive is deleted without a pack operation. This is why smaller archive files can be more space efficient than larger files: when data is removed, smaller arc files increase the probability that whole arc files can be removed without a pack.

A disadvantage of very small arc files, besides creating many arc files for large backups, is that very small arc files may defeat download optimizations. For example, with 1MB arc files, the maximum request size to remote storage services is limited to 1MB. A backup of a 1GB file will create 1000 arc files and downloading this will require 1000 x 1MB remote requests. If the storage service has high latency (delay for each request) it can cause performance problems. Using arc-size-limit below 4MB is not recommended, and in general, arc-size-limit should be somewhat proportional to your backup size.
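
The request count in the example above is a ceiling division. The sketch below reproduces that arithmetic; the function name is hypothetical, sizes are in decimal bytes, and compression and dedup (which would shrink the stored data) are ignored.

```shell
# Rough number of arc files (and whole-file remote requests) for one file.
# Assumes the file's data is stored uncompressed with no dedup savings.
arc_requests() (
    file_bytes=$1
    arc_bytes=$2
    # Round up so a partial final arc file still counts.
    echo $(( (file_bytes + arc_bytes - 1) / arc_bytes ))
)
```

For a 1GB file, arc_requests 1000000000 1000000 gives 1000, while the default 100MB arc size gives 10.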

block-size - the variable block size, default is auto. Can be set to 16K, 32K, 64K, 128K, 256K, 512K or 1M. The effective average block size will be 1.5 times larger because of the variable size. If this is changed on an existing backup at rev r, any files changed after rev r will be saved with the new block size and cannot dedup against files changed before rev r. A block size warning is displayed during backup when this happens. Once files are saved with the new block size, dedup will occur the next time the file is changed. This option cannot be used to set a large fixed block size. To do that, use -B4M for example on the backup command line.

block-size-ext - sets the backup block size for specific filename extensions (suffixes), overriding backup’s -B command line option. For example, with '-B4M mov,avi -V128K .sql -B16K ibd -B23K xyz', backup will use large fixed-size 4M blocks for movie and video files, variable 128K blocks for text dumps of SQL databases with the .sql extension, fixed 16K blocks for InnoDB database files, and fixed 23K blocks for .xyz files. Commas and periods are optional. The value must be quoted since it contains spaces. Unlike most config options, this option uses 1024 as a multiplier, so 8K means 8192 bytes.
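
To make the value syntax concrete, the sketch below looks up which block-size flag a given extension would receive. It only illustrates the format described above (a flag followed by its extensions, with commas and periods optional); it is not HashBackup's parser, and the function name is hypothetical.

```shell
# Hypothetical helper: print the flag that a block-size-ext value ($1)
# assigns to an extension ($2). Fails if the extension is not listed.
block_opt_for_ext() (
    want=${2#.}                      # leading period is optional
    current=""
    # Commas are optional separators, so treat them like spaces.
    for word in $(printf '%s' "$1" | tr ',' ' '); do
        case $word in
            -B*|-V*) current=$word ;;    # a new block-size flag starts
            *)  if [ "${word#.}" = "$want" ]; then
                    printf '%s\n' "$current"
                    exit 0
                fi ;;
        esac
    done
    exit 1
)
```

With the example value from the text, looking up avi prints -B4M and looking up sql prints -V128K.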

backup-linux-attrs - when True, backup will save Linux file attributes, also called flags in BSD Unix. On Linux, file attributes are set with the chattr command and displayed with lsattr. They are little used and poorly implemented on Linux, requiring an open file descriptor and an ioctl call. This can cause permission problems, especially in shared hosting environments, so the default is False.

File attributes are not the same as extended attributes, also called xattrs. Extended attributes are always backed up if present. Extended attributes are handled by the attr, getfattr, and setfattr commands on Linux, as well as the Linux ACL commands.

cache-size-limit - specifies the maximum amount of space to use in the local backup directory to cache archive files. By default, this is -1, meaning to keep local copies of all archive files. Setting cache-size-limit to any other value will restrict the number of archives kept locally. Remote destinations must be set up in dest.conf or this config option is ignored. The cache size limit is more of a desired goal than a hard limit. The limit may often be exceeded by the size of 1 archive file, and may need to be greatly exceeded for restores, to prevent downloading the same archive file more than once. The mount command respects the limit but might require downloading the same file more than once. A reasonable value for this limit is the average size of your incremental backups. A reasonable minimum is the highest workers setting for any destination plus 1. The default number of workers is 2, so the minimum would be 3, but a more reasonable limit would be 1GB (for 100MB arc files). If you plan to use the mount command extensively, a larger cache will prevent multiple downloads of the same data.

If you can afford the disk space, keeping a copy of archives locally has many advantages:

  1. backups won’t stall

  2. mount does not need to download data and can run concurrently with backups

  3. restores will not have to wait for lengthy downloads

  4. archive pack operations are faster and less expensive because only uploads are required - no downloads

  5. less remote storage space is used because pack operations occur more frequently

  6. you have a redundant copy of your backup.

Thanks to compression and dedup, total backup space is usually about 50%-100% the size of the original data, even when multiple versions are maintained.

Examples of cache size limits:

  • -1 means copies of all archives are kept in the local backup directory. This is the default setting.

  • 0 means no archives should be kept locally. During backup, up to 2 archives are kept to ensure overlap between backup and network transmission. After the backup completes, all local arc files that have been sent to all destinations are deleted.

  • 1-1000 means to size the cache to hold this many full archives. If the archive size is 100MB and the limit is set to 10, the effective cache size would be 1gb. A guideline is to use workers (in dest.conf) + 1 as a minimum.

  • 10gb specifies the cache size directly. Other suffixes can be used: mb, etc.
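
The sizing guidelines above reduce to two small calculations. The sketch below is illustrative only: both function names are hypothetical, and sizes are plain decimal bytes.

```shell
# Minimum archive-count limit: highest workers setting in dest.conf, plus 1.
min_cache_archives() (
    workers=$1
    echo $(( workers + 1 ))
)

# Effective cache size when the limit is an archive count (1-1000).
cache_bytes() (
    archive_count=$1
    arc_size_bytes=$2
    echo $(( archive_count * arc_size_bytes ))
)
```

With the default of 2 workers, min_cache_archives 2 prints 3, and cache_bytes 10 100000000 prints 1000000000, the 1gb figure from the example above.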

copy-executable - when True, copy the hb program to remote destinations. The default is False. The hb program is always copied to the local backup directory as hb#b, where b is the build number. The last 3 versions are kept in the local backup directory. If you use HB only occasionally, more as an archive, it is important to set this to True so that you will have a copy of HB stored with your archive.

db-check-integrity - controls when a database integrity check occurs. The default is selftest. This is fine when backups are run and stored on enterprise-class hardware: non-removable drives, ECC RAM, and hardwired networks.

For extra safety, set this option to upload on consumer-grade hardware using removable drives, non-ECC RAM, and wireless networks where there is more of a possibility of database damage. This will verify database integrity before each upload to destinations, preventing a damaged local database from overwriting a good version stored remotely.

db-history-days - keep incremental database backup files (hb.db.N) for this many days before the latest backup. The default is 3. Combined with recover --check, this allows recovering earlier versions of the main backup database. The minimum value is 1 so there is always at least one older version of the database stored on remotes.

dbid - a read-only config option that is unique for each backup database.

dbrev - a read-only config option describing the database revision level.

dedup-mem - the amount of RAM to be used for the dedup table, similar to the -D backup option. The -D option overrides this config option. See Dedup Info for more detailed information about sizing the dedup table. The default is 0, meaning dedup is disabled. A reasonable value is 1gb. The dedup table does not immediately use all memory specified; it starts small and doubles in size when necessary, up to this limit. The backup command shows how full the current dedup table is, and how full it is compared to the maximum table size. When the dedup table is full and cannot be expanded, backups will continue to work correctly; dedup is still very effective even with a full table. Unlike most config options, this option uses 1024 as a multiplier so 100M means 104,857,600 bytes.
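
The 1024 multiplier can be checked with a small converter. This is a sketch with a hypothetical function name; it handles only single-letter K, M, and G suffixes, which is an assumption about the accepted input rather than a statement of what HashBackup parses.

```shell
# Convert a size such as 100M to bytes using a 1024 multiplier.
to_bytes_1024() (
    num=${1%[KkMmGg]}                # strip a trailing K, M, or G if present
    case $1 in
        *[Kk]) mult=1024 ;;
        *[Mm]) mult=$(( 1024 * 1024 )) ;;
        *[Gg]) mult=$(( 1024 * 1024 * 1024 )) ;;
        *)     mult=1 ;;             # no suffix: already bytes
    esac
    echo $(( num * mult ))
)
```

Here to_bytes_1024 100M prints 104857600, matching the figure above.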

disable-commands - list of commands that should be disabled, separated by commas; all other commands are enabled. If the admin-passphrase option is set, these commands will ask for the passphrase and run only if it is correctly entered. If there is no admin-passphrase set, these commands refuse to run. The upgrade and init commands cannot be disabled because when these commands execute there is no database to check whether they should be enabled or disabled. To cancel this option, use the value '' (two single quotes).

SECURITY NOTE: Disabling commands without an admin passphrase is a poor configuration since commands can easily be re-enabled.

enable-commands - list of commands that should be enabled, separated by commas; all other commands are disabled. An example would be to enable only the backup command. Then it would not be possible to list, restore, or remove files without the admin passphrase. To cancel this option, use the value '' (two single quotes).

SECURITY NOTE: Enabling some commands without an admin passphrase is a poor configuration since more commands can easily be enabled.
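
As an illustration of the backup-only setup described above, the sketch below wraps the two config calls involved. The function names and the HB variable (program to run, defaulting to hb) are hypothetical. Setting an admin passphrase first is what makes the restriction meaningful.

```shell
# Hypothetical helper: enable only the commands in $2 (comma-separated)
# for the backup in directory $1; all other commands become disabled.
restrict_to() (
    "${HB:-hb}" config -c "$1" enable-commands "$2"
)

# Hypothetical helper: cancel enable-commands by passing the empty value ''.
unrestrict() (
    "${HB:-hb}" config -c "$1" enable-commands ''
)
```

For example, restrict_to backupdir backup leaves only the backup command enabled, and unrestrict backupdir removes the restriction.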

no-backup-ext - list of file extensions, with or without dots, separated by spaces or commas. Files with these extensions will not be backed up.

no-backup-tag - list of filenames separated by spaces or commas. Directories containing any of these files will not have their contents backed up, other than the tag file(s). Typical values are .nobackup and CACHEDIR.TAG. Each entry in this list causes an extra stat system call for every directory in the backup, so keep the size of this list to a minimum.

no-compress-ext - list of file extensions of files that should not be compressed. Most common extensions are already programmed into the backup program, but you can list less common extensions here. Extensions may or may not have dots, and are separated by spaces or commas.

no-dedup-ext - list of file extensions of files with data that does not dedup well, for example, photos. If exact copies of files are backed up, they will still be deduped if dedup is enabled.

pack-age-days - the minimum age of an archive before it is packed. Some storage services charge penalties for early removal of files, for example, Amazon S3 Infrequent Access and Google Nearline. This setting prevents packing archives until they have aged, avoiding the delete penalty charge (but you are still paying more for storage of data that perhaps could have been deleted earlier). All other conditions must be satisfied before an archive is packed. The default for pack-age-days is 30. To disable this feature, set it to zero. Higher numbers will reduce the number of pack operations but may increase the backup storage requirements. If you do not pay delete penalties and have low or zero download fees (or a free allowance) or have cache-size-limit set to -1 (local copy of arc files), setting this to zero will cause more frequent packing to lower storage costs, improve restore times, and lower restore cache requirements.

This option also controls the frequency of packing operations. If set to 30 days, packing will only be attempted every 30 days. However, if a packing operation cannot complete because of a download or time limit, it will resume the next time rm or retain is run. If set to zero, rm and retain will try to pack archives on every run.

pack-bytes-free - the minimum free space in an archive before it is packed. This setting prevents packing very small archives when data is removed, i.e., once an archive gets down to 1MB (the default setting), it is not packed until it is empty, then it is deleted. All other conditions such as pack-percent-free must also be satisfied before the archive is packed. The default for pack-bytes-free is 1MB. To disable this feature, set it to 4096 (the minimum value). Higher numbers will increase the backup storage requirements and reduce the number of pack operations.

pack-combine-min - the minimum size of an arc file before it is merged into a larger arc file. The default is 1MB, meaning that when possible, consecutive arc files smaller than 1MB will be merged together. For combining to occur, packing must occur (see the other pack-* config settings). For technical reasons, only consecutive arc files can be merged into larger arc files.

pack-download-limit - the maximum amount of data that can be downloaded during a single run of rm or retain for packing. The default is 950MB. To prevent any downloading of remote arc files for packing, set this limit to 0 (but see below). For "unlimited" downloading, set the limit to a high value like 1TB. This limit only applies when cache-size-limit is >= 0, meaning that not all arc files are stored locally; local arc files are packed without a download.

It is recommended that pack-download-limit not be set to zero. When files are removed from the backup with rm and retain, "holes" are created in arc files. Over time, this causes data to be stored inefficiently, making restores slower. Packing reorganizes arc files into more efficient storage. To reduce the amount of packing, raise pack-percent-free to a high number like 95, meaning that 95% of an arc file must be free before it is packed. This will prevent nearly all downloading except tiny files smaller than pack-combine-min and very inefficient arc files with mostly empty space.

If pack-download-limit is set to a value smaller than the largest archive in the backup, a warning may be displayed that this archive cannot be packed. If more space frees up over time however, the archive will eventually be packed if selective download is available on the destination.

pack-percent-free - when archive files have this percentage or more as free space, the archive is packed to recover disk space. This is used by rm and retain after they have deleted backup data. The default is 50. If cache-size-limit is >= 0 and you pay for downloaded data, consider raising this percentage to avoid frequent packing. Most storage services have very high download charges compared to their storage charge, often 10-17x higher. If cache-size-limit is not set (the default), packing more often is fine because no download is required. Higher numbers will increase the backup storage requirements but reduce the number of pack operations and reduce the number of downloads when cache-size-limit is >= 0.
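
One way to see how the pack-* conditions combine is the rough decision sketch below. It is not HashBackup's actual logic. It assumes that pack-bytes-free is a minimum number of free bytes, that an empty archive is deleted regardless of age, and that the defaults are 50 percent, 1,000,000 bytes, and 30 days. It ignores pack-combine-min and pack-download-limit, and the function name is hypothetical.

```shell
# Hypothetical decision: print delete, pack, or keep for one archive.
# Args: size_bytes free_bytes age_days [percent_free] [bytes_free] [age_days_min]
pack_decision() (
    size=$1
    free=$2
    age_days=$3
    pct=${4:-50}
    min_free=${5:-1000000}
    min_age=${6:-30}
    if [ "$free" -ge "$size" ]; then
        echo delete                  # all data removed: no pack needed
    elif [ "$age_days" -lt "$min_age" ]; then
        echo keep                    # too young: avoid early-delete charges
    elif [ $(( free * 100 )) -ge $(( pct * size )) ] &&
         [ "$free" -ge "$min_free" ]; then
        echo pack                    # enough free space, all conditions met
    else
        echo keep
    fi
)
```

Under these assumptions, a 100MB archive that is 60% free and 45 days old is packed, while the same archive at 10 days old is kept.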

remote-update - when set to normal (the default), HB sends new data before deleting old data. This preserves the integrity of remote backup areas. However if a remote disk becomes full it may be necessary to delete the old data before sending new data. To enable this behavior, set to unsafe.

When set to unsafe, an interrupted HB command may leave the remote backup area temporarily inconsistent. The next successful HB command should correct it.

retain-extra-versions - adds an extra period to each -s interval as a retention safety cushion, so -s7d4w becomes -s8d5w. The default is True.
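
The adjustment can be pictured as adding 1 to every count in the -s value. The sketch below does that for simple values made of a number followed by a single lowercase unit letter. The function name is hypothetical, and this only mimics the effect described above, not HashBackup's own code.

```shell
# Hypothetical helper: add one extra period to each interval in a -s value.
add_extra_period() (
    spec=${1#-s}                     # accept the value with or without -s
    out=""
    while [ -n "$spec" ]; do
        rest=${spec#*[a-z]}          # text after the first unit letter
        [ "$rest" = "$spec" ] && break   # no unit letter left: stop
        item=${spec%"$rest"}         # first count and unit, e.g. 7d
        count=${item%[a-z]}
        unit=${item#"$count"}
        out="$out$(( count + 1 ))$unit"
        spec=$rest
    done
    echo "-s$out"
)
```

Running add_extra_period -s7d4w prints -s8d5w.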

shard-output-days - the number of days to keep shard output in the sout subdirectory. The default is 30 days. Set to 0 to keep all shard output.

simulated-backup - if set to true before the first backup, no arc files are created by the backup command. This allows modeling backup options such as -B (block size) and -D (dedup table size), even for very large backups, without using a lot of disk space. Simulated backups also run faster because there is less I/O. Incremental backups, rm, and retain all work correctly, and the stats command can be used to view statistics such as the backup space that would be used by a real backup. The pack-* keywords control whether the simulated arc files are packed when files are removed from the backup with rm or retain.

Differences for a simulated backup:

  • must be set before the initial backup

  • cannot be changed after the initial backup

  • no arc files are created (not a real backup)

  • no files are sent to remote destinations

  • selftest is limited to -v2

  • get & mount will fail with "No archive" errors
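
A simulated run might be scripted as in the sketch below. The function name, the HB variable (program to run, defaulting to hb), and the directory arguments are hypothetical. The init, backup, and stats invocations are assumed forms that are not shown in this section, so check them against those commands' own documentation.

```shell
# Hypothetical workflow: model a backup of $2 in a scratch backup dir $1.
simulate_backup() (
    simdir=$1
    source_path=$2
    hb_cmd=${HB:-hb}
    "$hb_cmd" init -c "$simdir" &&                            # assumed form
    "$hb_cmd" config -c "$simdir" simulated-backup true &&    # before 1st backup
    "$hb_cmd" config -c "$simdir" dedup-mem 1gb &&            # option under test
    "$hb_cmd" backup -c "$simdir" -B4M "$source_path" &&      # assumed form
    "$hb_cmd" stats -c "$simdir"                              # assumed form
)
```

Because simulated-backup cannot be changed after the initial backup, a separate scratch backup directory is used here rather than a real one.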