Init

Initializes a backup area:

$ hb init -c backupdir [-k key] [-p ask/env] [--shards n]

The init command creates and initializes the backup directory. If the directory already exists, it must be empty or init will complain to avoid overwriting a directory by mistake.
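The empty-directory rule above can be checked in a script before calling init. The following is a minimal sketch, not part of HashBackup: can_init_here is a hypothetical helper name, and hb init still performs its own check.

```shell
# Hypothetical helper (not part of HashBackup): succeeds if the path is
# safe to hand to "hb init -c", i.e. it does not exist yet or is an
# existing, empty directory.
can_init_here() {
    dir=$1
    [ ! -e "$dir" ] && return 0      # does not exist: init will create it
    [ -d "$dir" ] || return 1        # exists but is not a directory
    # A directory is empty if ls -A (all entries except . and ..) prints nothing
    [ -z "$(ls -A "$dir")" ]
}

# Usage (not run here):
#   can_init_here "$HOME/hb" && hb init -c "$HOME/hb"
```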

During initialization, an inex.conf file will be created, listing files that should be excluded from the backup. You may want to review and modify this before your first backup.

The -k option specifies your own encryption key. Normally a random key is generated by init, and -k is not recommended. But in some environments you may want a common key for several backups, a key that’s easier to remember, or a blank key. These are all less secure than a random key, but they also make it less likely that the key will be lost. A key with special characters requires quotes, for example: -k 'my special key'. Spaces are always ignored in keys, so the key abc def is equivalent to abcdef. To set a blank key, use -k '' (two single quotes). The key can be changed later with the rekey command.
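To illustrate the rule that spaces are ignored in keys, here is a small sketch that models the rule with standard tools. It does not call HashBackup, and normalize_key is a hypothetical name used only for this example.

```shell
# Illustration only: models the documented rule that spaces in a key are
# ignored. This is not HashBackup code.
normalize_key() {
    printf '%s' "$1" | tr -d ' '
}

normalize_key 'abc def'; echo     # prints abcdef
normalize_key 'abcdef'; echo      # prints abcdef
```

By the same rule, a key displayed in groups of four characters is the same key when typed without the spaces.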

The -p option adds a passphrase to protect your key. -p ask means every HB command, including init, will ask for the passphrase from the keyboard. -p env means every HB command, including init, will read the passphrase from a shell or environment variable named HBPASS. This environment variable must be set before running the init command.
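Because HBPASS must be set before init runs with -p env, a script can check for it first. This is a minimal sketch: init_with_env_passphrase is a hypothetical wrapper, and treating an empty HBPASS as missing is an assumption of the sketch, not documented HashBackup behavior.

```shell
# Hypothetical wrapper (not part of HashBackup): refuse to run
# "hb init -p env" unless HBPASS is set.
# Assumption: an empty HBPASS is treated here the same as an unset one.
init_with_env_passphrase() {
    if [ -z "${HBPASS:-}" ]; then
        echo "HBPASS must be set before running hb init -p env" >&2
        return 1
    fi
    hb init -c "$1" -p env
}

# Usage (not run here); exporting makes the variable visible to hb:
#   export HBPASS='your passphrase'
#   init_with_env_passphrase "$HOME/hb"
```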

Because HashBackup uses a database that requires fsyncs to avoid corruption and locks to prevent unexpected concurrent access, it is recommended to use a local directory with -c. These features sometimes do not work correctly with USB devices and network storage. If you are tight on disk space, set up a destination with dest.conf and use the cache-size-limit config option to conserve space in the local backup directory. For example, a common configuration is to put the local backup directory on an SSD for fast database access, even though the SSD may not have room for all of your backup data. If you still want a copy of the backup on a local disk, which is a good idea for many reasons, use a dir destination in dest.conf with the destination directory on a large spinning disk or network file server. The database then remains on fast media, and the large files containing your backup data are stored on slower media.
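As a rough sketch of the SSD plus large-disk configuration described above, the helper below writes a one-entry dest.conf. The keywords destname, type, and dir, the hb config syntax in the usage comment, the name bigdisk, and all paths and sizes are assumptions for illustration; check them against the destination documentation for your HashBackup version.

```shell
# Hypothetical helper (not part of HashBackup): write a minimal "dir"
# destination into BACKUPDIR/dest.conf.
# Assumption: the keywords destname, type, and dir are the dest.conf
# keywords for a dir destination. "bigdisk" is a made-up name.
write_dir_dest() {
    backupdir=$1
    target=$2
    cat > "$backupdir/dest.conf" <<EOF
destname bigdisk
type dir
dir $target
EOF
}

# Usage (not run here); the config syntax and the 10g value are illustrative:
#   write_dir_dest "$HOME/hb" /mnt/bigdisk/hb
#   hb config -c "$HOME/hb" cache-size-limit 10g
```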

If you back up directly to remote storage (the -c backup directory is on remote storage), the encryption key is also stored there. If you don’t own and control the remote storage, for example, if you are backing up directly to Google Drive with -c, it is important to use a passphrase to protect your backup.

Choosing a Passphrase

All of HashBackup’s security comes from your key. This is why hb init creates a random key by default: it is next to impossible for someone to guess a long random key. Here are some suggestions for creating a strong passphrase to further protect your key:

  1. Make up a sentence that you will remember and use this as your passphrase. A sentence is easier to type than a password like Wjd0$p2^! and is stronger because it is longer. Length wins over weird, hard to type, hard to remember passwords. Example: the fat green Martian landed his shiny silver spacecraft

  2. Make up a sentence that you will remember and use the first letter of each word as your passphrase. For the Martian sentence above, that would be: tfgMlhsss

  3. Including special symbols (other than spaces) will increase the passphrase strength. One easy way to do this is to use special symbols before, after, and/or between words, for example: .,.this.,.is.,.a.,.decent.,.passphrase.,.! But make up your own special symbol rule. Even adding just one special symbol, especially in the middle, will increase your passphrase strength.

  4. Including a number, especially in the middle, will increase your passphrase strength.

  5. Use a password manager program. These store lists of passwords and passphrases in an encrypted file, protected by a master passphrase. They often have password generators built in and you can cut and paste a passphrase when needed.

  6. To learn more about the importance and methods of choosing a good passphrase, do a search for: strong password / passphrase, password / passphrase strength, or password / passphrase entropy
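The first-letter method in suggestion 2 can be tried out with a few lines of shell. This is only an illustration of the rule, using the example sentence from suggestion 1; first_letters is a hypothetical name.

```shell
# Illustration only: build a passphrase candidate from the first letter
# of each word in a sentence.
first_letters() {
    out=
    # $1 is deliberately unquoted so the shell splits it into words
    for word in $1; do
        # ${word#?} is the word minus its first character; stripping that
        # suffix from the word leaves only the first character
        out=$out${word%"${word#?}"}
    done
    printf '%s\n' "$out"
}

first_letters 'the fat green Martian landed his shiny silver spacecraft'
# prints tfgMlhsss
```

Avoid running this with a real passphrase sentence, since command arguments are usually saved in shell history.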

Creating Backups With a Common Key

You may want several backup directories with the same key. Here is the procedure:

Create a backup directory with hb init. If you already have a backup directory, skip this step.

$ hb init -c hb
HashBackup build #1597 Copyright 2009-2016 HashBackup, LLC
Backup directory: /Users/jim/hb
Permissions set for owner access only
Created key file /Users/jim/hb/key.conf
Key file set to read-only
Setting include/exclude defaults: /Users/jim/hb/inex.conf

VERY IMPORTANT: your backup is encrypted and can only be accessed with
the encryption key, stored in the file:
    /Users/jim/hb/key.conf
You MUST make copies of this file and store them in a secure location,
separate from your computer and backup data.  If your hard drive fails,
you will need this key to restore your files.  If you setup any
remote destinations in dest.conf, that file should be copied too.

Backup directory initialized

Display the key file:

$ cat hb/key.conf
# HashBackup Key File - DO NOT EDIT!
Version 1
Build 1597
Created Sat Jul 30 14:59:10 2016 1469905150.7
Host Darwin | mb | 10.8.0 | Darwin Kernel Version 10.8.0: Tue Jun  7 16:33:36 PDT 2011; root:xnu-1504.15.3~1/RELEASE_I386 | i386
Keyfrom random
Key 3b93 2b0b bf19 3f3d 925a 0f71 d5cf 3034 6b0b a14b e9f3 540e 6f1c 86aa 1687 3ae7

For all of the backups you want to have the same key, including the one you just displayed, use the hb rekey command. The backup can have data or be empty. You can copy and paste the key value using Ctrl-C (Command-C on Mac) and Ctrl-V (Command-V on Mac). The command displays a warning that -k is less secure, but the warning does not really apply in this case: instead of choosing a key value yourself, you are reusing a random key that HashBackup picked.

$ hb rekey -c hb -k '3b93 2b0b bf19 3f3d 925a 0f71 d5cf 3034 6b0b a14b e9f3 540e 6f1c 86aa 1687 3ae7'
HashBackup build #1597 Copyright 2009-2016 HashBackup, LLC
Backup directory: /Users/jim/hb
Backing up databases and key
Generating new key
Encrypting databases with new key
Installing new key
Copied old key to /Users/jim/hb/key.conf.orig
Created key file /Users/jim/hb/key.conf
Key file set to read-only
Deleting rekey backup files
Rekey complete

Warning: you have specified the key, which is less secure than a
random key.  Omit the -k option for more security.

VERY IMPORTANT: your backup is encrypted and can only be accessed with
the encryption key, stored in the file:
    /Users/jim/hb/key.conf
You MUST make copies of this file and store them in a secure location,
separate from your computer and backup data.  If your hard drive fails,
you will need this key to restore your files.  If you setup any
remote destinations in dest.conf, that file should be copied too.

Now what does the key file look like?

$ cat hb/key.conf
# HashBackup Key File - DO NOT EDIT!
Version 1
Build 1597
Created Sat Jul 30 14:59:48 2016 1469905188.28
Host Darwin | mb | 10.8.0 | Darwin Kernel Version 10.8.0: Tue Jun  7 16:33:36 PDT 2011; root:xnu-1504.15.3~1/RELEASE_I386 | i386
Keyfrom user
Key 3b93 2b0b bf19 3f3d 925a 0f71 d5cf 3034 6b0b a14b e9f3 540e 6f1c 86aa 1687 3ae7

Repeat the rekey operation on every backup you want to have identical keys. After rekey, all of the key.conf files will have the same Keyfrom and Key fields. In the future, create new backup directories with hb init and then use hb rekey with -k to make the key identical to your other backups. If you prefer, you can choose your own common key to use with rekey, but as the warning says, this will be less secure than HashBackup creating a random key. You cannot copy key files to different backup directories and cannot use an editor to change Keyfrom from "random" to "user" in the key.conf file.
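If you have many backup directories, the copy-and-paste step can be scripted. The sketch below reads the Key line from one key.conf and passes it to hb rekey for each directory, using only the rekey options shown above. The helper names are hypothetical, and the sketch assumes rekey runs without needing any interactive input.

```shell
# Print the value of the Key line from a key.conf file.
# "Keyfrom ..." lines do not match because the pattern requires "Key ".
key_from_conf() {
    sed -n 's/^Key //p' "$1"
}

# Hypothetical helper (not part of HashBackup): rekey every listed backup
# directory to the key stored in the reference directory's key.conf.
rekey_to_common() {
    ref=$1
    shift
    key=$(key_from_conf "$ref/key.conf") || return 1
    [ -n "$key" ] || return 1            # no Key line found
    for dir in "$@"; do
        hb rekey -c "$dir" -k "$key" || return 1
    done
}

# Usage (not run here); the reference directory is listed again so that
# it is rekeyed too, as the procedure above requires:
#   rekey_to_common "$HOME/hb" "$HOME/hb" "$HOME/hb2" "$HOME/hb3"
```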

Sharded Backups

Sharding engages multiple HashBackup processes to operate in parallel on a very large backup, dividing the work between them. The number of parallel processes is specified with the --shards N option. HashBackup creates a main backup directory as usual, but also creates N subdirectories, one for each shard. When an HB command is used with the main backup directory, it actually starts N HB processes to do the work, one for each subdirectory, monitors their progress, and displays their output when each process finishes.

The number of shards cannot be changed after init, so it’s a good idea to run experiments to decide the best number of shards for your environment. It is typically limited by the speed of the I/O subsystem, number of CPUs available, and the amount of memory. Backup’s --sample option and the simulated-backup config option are helpful when the filesystem is so large that full-scale tests are impractical.
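Since the shard count is fixed at init time, it can be worth validating the value in a script before creating the backup directory. A minimal sketch, where init_sharded is a hypothetical wrapper around the init syntax shown at the top of this page:

```shell
# Hypothetical wrapper (not part of HashBackup): create a sharded backup
# directory, rejecting anything that is not a positive integer.
init_sharded() {
    dir=$1
    n=$2
    case $n in
        ''|*[!0-9]*|0*)
            echo "shard count must be a positive integer" >&2
            return 1
            ;;
    esac
    hb init -c "$dir" --shards "$n"
}

# Usage (not run here); 4 is only an example value:
#   init_sharded /backups/big 4
```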

Because shard processes run independently and divide files between themselves, related files may end up in different shards. HashBackup cannot dedup between shards, so sharding may lower the amount of dedup that occurs. However, pathnames are locked to a specific shard, so incremental backups and dedup between multiple versions of the same pathname work very well.
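To see why locking a pathname to a shard keeps incremental backups effective, consider any assignment that depends only on the pathname. The sketch below uses a checksum of the path modulo the shard count. This is an illustration of the idea only; it is not HashBackup's actual assignment method, which is not described here.

```shell
# Illustration only: NOT HashBackup's actual algorithm.
# Maps a pathname to a shard number in the range 0..nshards-1 using a
# CRC of the pathname, so the same path always maps to the same shard.
shard_for_path() {
    path=$1
    nshards=$2
    crc=$(printf '%s' "$path" | cksum | awk '{ print $1 }')
    echo $(( crc % nshards ))
}

# The same pathname gives the same answer on every run, so each new
# version of a file is compared against its earlier versions.
shard_for_path /home/jim/report.txt 4
shard_for_path /home/jim/report.txt 4
```

Two related files with different pathnames may still map to different shards, which is where dedup between them is lost.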