Initializes a backup area:

$ hb init -c backupdir [-k key] [-p ask/env] [--shards n]

The init command creates and initializes the backup directory. If the directory already exists, it must be empty or init will complain, to avoid overwriting a directory by mistake.

During initialization, an inex.conf file is created, listing files that should be excluded from the backup. You may want to review and modify this file before your first backup.

The -k option specifies your own encryption key. Normally init generates a random key, and -k is not recommended. But in some environments you may want a common key for several backups, a key that is easier to remember, or a blank key. These are all less secure than a random key, but they also make it less likely that the key will be lost. A key with special characters requires quotes, for example: -k 'my special key'. Spaces are always ignored in keys, so the key abc def is equivalent to abcdef. To set a blank key, use -k '' (two single quotes). The key can be changed later with the rekey command.

The -p option adds a passphrase to protect your key. -p ask means every HB command, including init, will ask for the passphrase from the keyboard. -p env means every HB command, including init, will read the passphrase from a shell or environment variable named HBPASS. This environment variable must be set before running the init command.

Since HashBackup uses a database, fsyncs to avoid database corruption, and locks to prevent unexpected concurrent access, it is recommended to use a local directory with -c. These features sometimes do not work reliably with USB devices and network storage. If you are tight on disk space, set up a destination with dest.conf and use the cache-size-limit config option to conserve space in the local backup directory.

For example, a common configuration is to use an SSD for the local backup directory to give fast database access, but there may not be room on it for all of your backup data. If you still want a copy of the backup on a local disk, which is a good idea for many reasons, a dir destination can be used in dest.conf with the destination directory on a large spinning disk or network file server. The database remains on fast media and the large backup files containing your backup data are stored on slower media.

IMPORTANT: if you do back up directly to remote storage (the -c backup directory is on remote storage), the encryption key is also stored there. If you don't own and control the remote storage, for example because you are backing up directly to Google Drive with -c, it is important to use a passphrase to protect your backup.

Choosing a Passphrase

All of HashBackup's security comes from your key. This is why hb init creates a random key by default: it is next to impossible for someone to guess a long random key. Here are some suggestions for creating a strong passphrase to further protect your key:
Creating Backups With a Common Key

You may want several backup directories with the same key. Here is the procedure:

Create a backup directory with hb init. If you already have a backup directory, skip this step.

$ hb init -c hb

Display the key file:

$ cat hb/key.conf

For all of the backups you want to have the same key, including the one you just displayed, use the hb rekey command. The backup can contain data or be empty. You can copy and paste the key value using Ctrl-C (Command-C on Mac) and Ctrl-V (Command-V on Mac). A warning is displayed about -k being less secure, but it does not really apply in this case because the key value was picked at random by HashBackup rather than chosen by you.

$ hb rekey -c hb -k '3b93 2b0b bf19 3f3d 925a 0f71 d5cf 3034 6b0b a14b e9f3 540e 6f1c 86aa 1687 3ae7'

Now what does the key file look like?

$ cat hb/key.conf

Repeat the rekey operation on every backup you want to have identical keys. After rekey, all of the key.conf files will have the same Keyfrom and Key fields.

In the future, create new backup directories with hb init and then use hb rekey with -k to make the key identical to your other backups.

If you prefer, you can choose your own common key to use with rekey, but as the warning says, this will be less secure than letting HashBackup create a random key.

You cannot copy key files to different backup directories, and you cannot use an editor to change Keyfrom from "random" to "user" in the key.conf file.

SHARDED BACKUPS

Sharding engages multiple HashBackup processes to operate in parallel on a very large backup, dividing the work between them. The number of parallel processes is specified with the --shards N option. HashBackup creates a main backup directory as usual, but also creates N subdirectories, one for each shard. When an HB command is used with the main backup directory, it actually starts N HB processes to do the work, one for each subdirectory, monitors their progress, and displays their output as each process finishes.
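As a minimal sketch of the sharded setup described above, the snippet below initializes a backup directory with four shards. The directory name /backups/bigfs, the shard count, and the init_sharded helper are placeholders chosen for illustration. The wrapper defaults to a dry run that only prints the command, so it can be tried without touching a real backup.

```shell
# Dry run by default: prints the hb command instead of executing it.
# Set HB=hb in the environment to run the command for real.
HB="${HB:-echo hb}"

# init_sharded DIR N: initialize backup directory DIR with N shards.
# (Hypothetical helper; it only wraps the documented "hb init" syntax.)
init_sharded() {
    $HB init -c "$1" --shards "$2"
}

init_sharded /backups/bigfs 4
```

Because the shard count is chosen at init time, it may be worth trying the dry run with a few values of N before committing to one.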
The number of shards cannot be changed, so it's a good idea to run experiments to decide the best number of shards for your environment. It is typically limited by the speed of the I/O subsystem, number of CPUs available, and the amount of memory. Backup's --sample option and the simulated-backup config option are helpful when the filesystem is so large that full-scale tests are impractical.

Because shard processes run independently and divide files between themselves, related files may end up in different shards. HashBackup cannot dedup between shards today, so sharding may lower the amount of dedup that occurs. However, pathnames are "fixed" to a given shard, so incremental backups and dedup between multiple versions of the same pathname work very well.
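The idea that a pathname stays "fixed" to one shard can be illustrated with a deterministic mapping. The sketch below is not HashBackup's actual assignment method, which is not described here. It assumes a hypothetical scheme that checksums the pathname with cksum and takes the result modulo the shard count, only to show why the same path always lands in the same shard while related files may not.

```shell
# Hypothetical illustration only: NOT how HashBackup assigns files to shards.
# shard_for PATH N: print a shard number in 0..N-1 derived from PATH's CRC.
shard_for() {
    crc=$(printf '%s' "$1" | cksum | cut -d ' ' -f 1)
    echo $(( crc % $2 ))
}

# The same pathname always maps to the same shard number...
shard_for /home/alice/report.doc 4
shard_for /home/alice/report.doc 4
# ...while a related file in the same directory may map to a different one.
shard_for /home/alice/report-v2.doc 4
```

Under a scheme like this, every version of a given pathname is seen by the same process, which is consistent with dedup working well across versions of one file but not across files that land in different shards.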