Backblaze B2
Advantages:
- Free to upload backup data
- Low cost storage: 0.6 cents/GB/month ($0.006/GB/mo or $6/TB/mo)
- No minimum storage time
- No minimum file size
- First 10GB of storage is free every month
- Free retrieval allowance: 3x monthly average stored data
- Low cost to retrieve data: 1 cent/GB ($0.01/GB) after free allowance
- Supports selective downloads to lower retrieval costs
- Backblaze will ship your data on a disk, free if you return the drive
- Each file is spread across 20 servers, any 17 can reconstruct the original
- Data is actively "scrubbed" by Backblaze while on B2
- SHA1 hash verification on file transfers
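The pricing figures above can be combined into a rough monthly estimate. The Python sketch below is only an illustration: the function name and the example sizes are hypothetical, it assumes the free retrieval allowance is 3x the average data stored during the month, and it ignores any charges not listed above.

```python
# Rough monthly B2 cost estimate built only from the figures listed above.
# estimate_monthly_cost and its arguments are illustrative names,
# not part of HB or B2.

STORAGE_PER_GB = 0.006      # $/GB/month
FREE_STORAGE_GB = 10        # first 10GB stored is free
DOWNLOAD_PER_GB = 0.01      # $/GB after the free allowance
FREE_DOWNLOAD_FACTOR = 3    # free retrieval: 3x monthly average stored data


def estimate_monthly_cost(avg_stored_gb, downloaded_gb):
    """Return (storage_cost, download_cost) in dollars for one month."""
    billable_storage = max(0, avg_stored_gb - FREE_STORAGE_GB)
    free_download = FREE_DOWNLOAD_FACTOR * avg_stored_gb
    billable_download = max(0, downloaded_gb - free_download)
    return (billable_storage * STORAGE_PER_GB,
            billable_download * DOWNLOAD_PER_GB)


if __name__ == "__main__":
    storage, download = estimate_monthly_cost(avg_stored_gb=1000,
                                              downloaded_gb=3500)
    print(f"storage ${storage:.2f}, download ${download:.2f}")
```

In this model, a backup that stays under 10GB and is rarely restored costs nothing to store or retrieve.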
File Transfer Verification
Uploads: HB generates a SHA1 hash for every file uploaded to B2. The B2 service verifies that the SHA1 sent by HB matches the SHA1 B2 generates for the data it received. This verifies that the file was received correctly by B2. If this check fails, B2 signals an error and HB retries the upload.
Downloads: B2 sends the file’s SHA1 with the downloaded file. HB computes the SHA1 of the file it receives and compares it to the SHA1 B2 sent to make sure there were no transmission errors. HB also compares this to the SHA1 sent when the file was uploaded. This verifies that the file received by HB is exactly what it uploaded. If any tests fail, an error is signaled and HB retries the download.
HB sometimes uses partial file downloads to save download costs. The B2 file SHA1 cannot be verified with partial file downloads, but HB always verifies the SHA1 of every backup block before using it, and verifies the SHA1 of every file restored.
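The whole-file download check described above amounts to hashing the received bytes and comparing the result with the recorded SHA1 values. A minimal Python sketch using the standard hashlib module follows; the function names are hypothetical and this is not HB's actual code.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA1 of data as a lowercase hex string."""
    return hashlib.sha1(data).hexdigest()


def verify_download(data: bytes, sha1_from_b2: str, sha1_at_upload: str) -> bool:
    """True only if the received bytes match both recorded hashes.

    sha1_from_b2 is the hash sent with the download; sha1_at_upload is the
    hash recorded when the file was uploaded. A False result would be the
    signal to retry the download.
    """
    actual = sha1_hex(data)
    return actual == sha1_from_b2.lower() == sha1_at_upload.lower()


if __name__ == "__main__":
    payload = b"example archive bytes"
    recorded = sha1_hex(payload)
    print(verify_download(payload, recorded, recorded))         # intact copy
    print(verify_download(payload + b"x", recorded, recorded))  # corrupted copy
```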
Versioning
B2 versions files by default. That means if file1 is created in the
bucket, then deleted, it is not really deleted. If the same file is
uploaded again, there will be 2 copies. This behavior is not useful
for HashBackup, so the first time a bucket is accessed, HB will set a
lifecycle rule that means "only keep 1 version". If there is already
a lifecycle rule, HB does not change it. HB also does an immediate
delete of the extra versions B2 creates to save storage costs.
5GB File Size Limit
B2 has a file upload size limit of 5GB unless special upload
procedures are used. HB does not use B2’s "large file upload"
procedures, so the arc-size-limit config keyword should be set to
4GB or less.
B2 Storage Optimization
When retain and rm are used to remove files from the backup, they create "holes" in the arc files. HashBackup can do a pack operation to remove these holes and minimize backup storage space and costs. Packing is controlled by several config options.
With Backblaze B2, download costs are only slightly higher than
storage costs, so packing of remote arc files can be more aggressive.
By default, remote archives with at least pack-percent-free percent
free space are downloaded and repacked periodically (every 7 days)
when rm and retain are run, for up to 950MB of downloaded data. For
more aggressive packing that is still cost efficient, these config
settings are recommended for B2:
- pack-download-limit <N>GB, where N is some reasonable value for your site. Even with a high value for N, this is cost efficient for B2 because download costs are not much higher than storage costs.
- pack-percent-free 60
For even more aggressive remote packing, use a smaller value for pack-age-days. 0 means to pack every time rm and retain are run.
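To see why aggressive packing can pay off on B2, compare the one-time download cost of repacking an archive with the monthly storage it saves. The Python sketch below uses the prices quoted earlier on this page and assumes, for simplicity, that the whole archive is downloaded and that no free retrieval allowance applies (uploads are free). The function name is hypothetical.

```python
STORAGE_PER_GB_MONTH = 0.006   # $/GB/month
DOWNLOAD_PER_GB = 0.01         # $/GB


def breakeven_months(percent_free):
    """Months of storage savings needed to pay back one repack download.

    Downloading 1GB of archive costs DOWNLOAD_PER_GB once; removing the
    free space saves STORAGE_PER_GB_MONTH * (percent_free / 100) per month.
    """
    if not 0 < percent_free <= 100:
        raise ValueError("percent_free must be in (0, 100]")
    monthly_saving = STORAGE_PER_GB_MONTH * percent_free / 100
    return DOWNLOAD_PER_GB / monthly_saving


if __name__ == "__main__":
    for pct in (30, 60, 90):
        print(f"{pct}% free: pays back in {breakeven_months(pct):.1f} months")
```

Under these assumptions, repacking an archive that is 60% free pays for its download in under three months, and any free retrieval allowance shortens that further.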
B2’s dest.conf Keywords
type (required)
type b2
accountid
Your B2 Account Id. This keyword has priority over environment variables (more below about environment variables). Either accountid or keyid is required, and only one should be used.
appkey
When the accountid keyword is used, appkey should be your master application key. When keyid is used, appkey should be the corresponding application key. This keyword has priority over environment variables.
Environment variables are checked if accountid and/or appkey are not specified in dest.conf. The environment variable names are:
B2_ACCOUNT_ID
B2_APP_KEY
These environment variables can be set and exported in your login script. For example, in .profile in your home directory:
export B2_ACCOUNT_ID=123456789012
export B2_APP_KEY=1234567890123456789012345678901234567890
If you add these to your .profile (or .bash_profile), you should protect the file so only you can access it with: chmod 600 .profile
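The precedence between dest.conf keywords and environment variables can be pictured with a short Python sketch. This is only an illustration of the rule stated above, not HB's implementation: the function name and the dict representation of dest.conf are hypothetical, and the keyid case is not covered.

```python
import os


def resolve_credentials(conf, env=os.environ):
    """Pick account id and app key, letting dest.conf win over the environment.

    conf is a dict of dest.conf keywords (a hypothetical representation).
    Returns (account_id, app_key); either may be None if set nowhere.
    """
    account_id = conf.get("accountid") or env.get("B2_ACCOUNT_ID")
    app_key = conf.get("appkey") or env.get("B2_APP_KEY")
    return account_id, app_key


if __name__ == "__main__":
    fake_env = {"B2_ACCOUNT_ID": "env-account", "B2_APP_KEY": "env-key"}
    # accountid comes from dest.conf, appkey falls back to the environment
    print(resolve_credentials({"accountid": "conf-account"}, fake_env))
```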
bucket (required)
Each B2 destination requires a bucket keyword. B2 buckets have both a bucket name and a bucket id. HB tries the value as a bucket name first, then as a bucket id. HB will create a private B2 bucket if the bucket name you use doesn’t exist. Each B2 account can have many buckets. Bucket names are globally unique, so names like "backup" are probably taken and will need something added to make them unique.
Bucket name restrictions:
- 6-50 characters
- starts and ends with a letter or digit
- contains only letters, digits, and dashes
B2 allows mixed case in bucket names but behaves as if all letters are the same case, so AbcDef and abcdef are the same bucket.
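The restrictions above are easy to check before putting a name in dest.conf. The following Python sketch encodes only the three rules listed here plus the case-insensitive comparison; the function names are hypothetical, and B2 may enforce additional rules that are not covered.

```python
import re

# 6-50 characters, letters/digits/dashes only,
# first and last characters are a letter or digit.
_BUCKET_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]{4,48}[A-Za-z0-9]")


def is_valid_bucket_name(name: str) -> bool:
    """Check a name against the bucket name restrictions listed above."""
    return _BUCKET_RE.fullmatch(name) is not None


def same_bucket(a: str, b: str) -> bool:
    """B2 treats names that differ only in letter case as the same bucket."""
    return a.lower() == b.lower()


if __name__ == "__main__":
    for name in ("hbbackup", "abc", "-backup1", "my_bucket"):
        print(name, is_valid_bucket_name(name))
    print(same_bucket("AbcDef", "abcdef"))
```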
dir
This keyword allows many backups to be stored in the same bucket by prepending the value to the backup filename. Without the dir keyword, a backup will create arc.0.0 in the top level of the bucket. With the dir keyword and value abc/xyz, the first backup will create abc/xyz/arc.0.0 in the bucket. Leading slashes are stripped because B2 does not allow leading slashes.
If you have an existing B2 backup and want to start using dir, you will have to move the already-stored backup files by hand. Then add the dir keyword to dest.conf.
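The effect of the dir keyword on remote file names can be sketched in a few lines of Python. The function name is hypothetical, and stripping a trailing slash is an assumption made here to avoid a doubled slash; only the leading-slash behavior is described above.

```python
def remote_name(filename, dir_value=None):
    """Build the name a backup file would get inside the bucket."""
    if not dir_value:
        return filename                  # no dir keyword: top level of bucket
    prefix = dir_value.lstrip("/")       # B2 does not allow leading slashes
    prefix = prefix.rstrip("/")          # assumption: avoid a doubled slash
    return f"{prefix}/{filename}" if prefix else filename


if __name__ == "__main__":
    print(remote_name("arc.0.0"))
    print(remote_name("arc.0.0", "abc/xyz"))
    print(remote_name("arc.0.0", "/abc/xyz"))
```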
keyid
Selects a specific B2 application key by its key id. Each B2 application key has a key id and a key value. These are specified in dest.conf with the keyid and appkey keywords.
rate
Limits upload bandwidth per worker. See Destination Setup for details.
workers
Backblaze B2 has somewhat higher latency than Amazon S3 or Google Storage. To compensate, you may want to increase the number of workers (default is 4) for higher performance. Add workers 2-4 at a time until there is no performance improvement. More than 20 workers probably doesn’t make sense. See Destination Setup for details.
debug
If set to 1 or higher, all traffic will be logged to
<destname>.<timestamp>.log in the backup directory. Confidential
data including authentication tokens and headers are replaced with xxx
in debug logs. All debug values generate the same log data, though
this may change in the future. Setting workers to 1 makes debugging
easier, and using retries 0 can make failures happen quicker.
Example dest.conf for B2
destname b2
type b2
accountid 0123456789ab
appkey 0123456789abcdef0123456789abcdef0123456789
bucket hbbackup
dir myhost1