Amazon S3: https://aws.amazon.com/s3
Google Storage: https://cloud.google.com/storage
Storj Decentralized Storage: https://www.storj.io
HashBackup supports Amazon’s S3 object storage service for offsite backup storage, as well as S3-compatible services such as Google Storage, Wasabi, and Storj. This is the reference page for Amazon’s S3 service. Compatible services may have a separate page explaining special considerations; see below for example compatible configurations.
Selective download is supported on S3 and compatibles, allowing HB to download only the parts of files that are needed for an operation, saving time, bandwidth, and download costs.
Amazon Lifecycle Policies & Glacier Transitioning
Amazon S3 has lifecycle policies that allow transitioning S3 files to Glacier and automatically deleting files after a certain time. These should not be used with HashBackup: HB cannot access files transitioned to Glacier, and file retention and deletion are managed by HB itself. As an alternative to Glacier transitioning, use S3’s Infrequent Access storage class to reduce expenses (see the class keyword below).
File Transfer Verification
Uploads: HB generates a hash, the file is uploaded along with the hash, then S3 generates a hash of the file it receives and verifies that it matches the hash HB sent. HB may use multipart upload, where all workers cooperate to send a single large file. Upload hash verification occurs for regular and multipart uploads.
Downloads: if a file was sent without multipart upload, HB verifies that the MD5 hash of a downloaded file is the same MD5 that was sent. Multipart uploads are not verified on download (but keep reading).
HB often requests partial file downloads to save download costs. The S3 file hash cannot be verified with partial file downloads, but HB always verifies the SHA1 of every backup block before using it, and verifies the SHA1 of every file restored.
It is important that your system clock is set accurately for S3 because a timestamp is sent with every request as part of the S3 protocol. If your system clock is too far off, any S3-like destination will return 403 Forbidden errors.
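One way to gauge skew is to compare the Date header of any S3 response against the local clock. The sketch below is illustrative only: the header value is made up and clock_skew_seconds is not an HB function.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def clock_skew_seconds(server_date_header, local_now):
    # Parse the RFC 1123 Date header an S3 server returns and
    # compare it with the local clock.
    server_time = parsedate_to_datetime(server_date_header)
    return abs((local_now - server_time).total_seconds())

# Made-up example: local clock is 10 minutes ahead of the server.
local = datetime(2023, 1, 1, 12, 10, 0, tzinfo=timezone.utc)
skew = clock_skew_seconds("Sun, 01 Jan 2023 12:00:00 GMT", local)
print(skew)            # 600.0 seconds
print(skew > 15 * 60)  # False: inside AWS's 15-minute window
```

AWS rejects requests whose timestamp differs from server time by more than 15 minutes, which is why a badly skewed clock produces 403 errors.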
Free Tier Egress
Amazon S3 offers a 100GB per month free egress (download) allowance that HashBackup can put to good use: verify your backup data with incremental selftest (-inc -v4) and keep your backup data compacted with the pack-download-limit config option. Adjust the limits on these two features to stay under the 100GB monthly download limit.
type
s3 for Amazon S3
gs for Google Cloud Storage’s S3 interface
s3 for other S3-compatible services; requires the host keyword
host
This optional keyword is used with S3-compatibles to specify the region and host name, sometimes called the endpoint. It is not used for Amazon’s S3 service but is required for compatibles. For example: host s3.wasabisys.com
port
This optional keyword is used to specify the port the S3 service is using. The default is port 80 for regular http connections or port 443 for secure connections.
secure
This optional true / false keyword enables SSL; using it without a value also enables SSL. The default is false because even over regular http, the S3 protocol is resistant to attacks:
each S3 request is signed with your secret key and a timestamp
all user data sent by HB is encrypted
the only real attack vector is a replay attack
replay attacks can only happen 5-15 minutes after the original request
HB rarely reuses filenames & filenames are part of the signature
If your bucket name contains a dot (not recommended), SSL will not work because of an AWS restriction on bucket names. You may be able to use subdomain false to temporarily get around this limitation, but AWS is deprecating URL-based bucket addressing.
Many S3-compatibles only work with SSL enabled, so the secure keyword may be required.
subdomain
This optional true / false keyword is useful for S3-compatibles that do not support bucket addressing as part of the host name. Using subdomain false makes self-hosted Minio installations easier to manage because adding new buckets does not require DNS changes. The default is true.
|AWS is deprecating path-based addressing (subdomain false).|
accesskey, secretkey
Your access and secret keys can be specified with the accesskey and secretkey keywords. These can be used with any S3-compatible destination and take priority over environment variables.
SECURITY NOTE: your access key is not a secret, does not have to be protected, and is sent in the headers with every request. It is like a username or account id. But, YOUR SECRET KEY SHOULD BE PROTECTED AND NEVER DISCLOSED!
Your secret key is not sent with your request. It is the password to your S3 account and is used to sign requests. Files containing your secret key should be protected.
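Conceptually, signing computes an HMAC over the request using the secret key, so only the signature travels over the wire. Below is a simplified sketch in the style of AWS’s older v2 signatures; the real string-to-sign construction is more elaborate, and the key shown is fake.

```python
import base64
import hashlib
import hmac

SECRET_KEY = b"mysecretkey"  # fake value; never hard-code a real secret

# Simplified string-to-sign: method, resource, and request timestamp.
string_to_sign = b"GET\n/mybucket/arc.0.0\nSun, 01 Jan 2023 12:00:00 GMT"

# The signature (not the secret) is sent in the request headers along
# with the access key; the server recomputes the HMAC with its own copy
# of the secret and compares.
signature = base64.b64encode(
    hmac.new(SECRET_KEY, string_to_sign, hashlib.sha1).digest()
).decode()
print(len(signature))  # 28: base64 of a 20-byte SHA1 digest
```

Because the timestamp is part of the signed data, a captured request cannot be replayed indefinitely, which is the basis for the replay-window point above.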
If the accesskey and secretkey keywords are not in dest.conf, environment variables are checked. These environment variables have different names for each provider, allowing you to have both Amazon and Google Storage accounts configured with environment variables. The environment variables are set and exported in your login script, for example, in .bashrc in your home directory:
export AWS_ACCESS_KEY_ID=myverylongaccesskey
export AWS_SECRET_ACCESS_KEY=myverylongsecretkey
|For Google Storage, you must generate Developer Keys. See: https://developers.google.com/storage/docs/migrating#migration-simple|
bucket
S3 destinations require a bucket name, specified with the bucket keyword. The bucket will be created if it doesn’t exist. If the location keyword is not used, the bucket is created in the default region, us-east-1.
Bucket names are globally unique, so names like "backup" are probably taken. Add a company name, host name, random number, or random text as a prefix or suffix, perhaps with a dash, to make your bucket name unique.
For S3-compatible services you may need to create the bucket before using HashBackup, especially to customize bucket settings like storage class.
Bucket names must be 3-63 characters, must start and end with a letter or digit, and can contain only letters (case insensitive), digits, and dashes. More bucket name rules at: http://docs.aws.amazon.com/AmazonS3/latest/dev/BucketRestrictions.html
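As an illustration (not part of HB), the naming rules above can be checked with a simple pattern, and a random suffix is one way to make a name unique:

```python
import re
import secrets

# 3-63 chars; letters, digits, dashes; must start and end with a letter or digit
BUCKET_RE = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9-]{1,61}[a-zA-Z0-9]$")

def valid_bucket(name):
    return bool(BUCKET_RE.match(name))

print(valid_bucket("mycompany-backup"))  # True
print(valid_bucket("ab"))                # False: shorter than 3 chars
print(valid_bucket("-backup"))           # False: starts with a dash

# One way to make a likely-unique bucket name
print("mycompany-backup-" + secrets.token_hex(3))
```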
class
S3 and compatible services often have multiple storage classes, with different classes having different cost structures, features, and/or restrictions. Some services such as Google Storage set the storage class at the bucket level, while others such as Amazon S3 set it at the object (file) level. For services with bucket-level storage classes, use their website to set the storage class. The class keyword is used for Amazon S3 to set the storage class of uploaded files. The value can be:
standard sets the standard S3 storage class (the default if class is not specified)
ia sets the Infrequent Access storage class
anything else is passed directly to S3 after uppercasing
HashBackup may store individual files in the standard class if it will be cheaper. For example, small files are cheaper to store in standard storage because the other storage classes have a 128K minimum billable file size. Files that might be deleted soon, such as hb.db.N files, are stored in standard storage to avoid the early delete penalty, though a file’s lifetime is usually hard to predict.
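To see why small files go to standard storage, compare billable sizes. This sketch uses a placeholder price ratio (IA at half the standard per-byte rate), not actual AWS pricing:

```python
MIN_IA_BILLABLE = 128 * 1024  # IA bills each object as at least 128K

def ia_billable(size):
    return max(size, MIN_IA_BILLABLE)

# A 10K file is billed as if it were 128K in IA
print(ia_billable(10 * 1024))  # 131072

# With IA at half the standard per-byte rate (placeholder, not real
# pricing), standard is still cheaper for a 10K file because of the
# 128K minimum billable size.
size = 10 * 1024
standard_cost = size * 1.0
ia_cost = ia_billable(size) * 0.5
print(standard_cost < ia_cost)  # True
```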
dir
This keyword allows many backups to be stored in the same bucket by prepending the keyword value to the backup filename. Without the dir keyword, backup will create arc.0.0 in the top level of the bucket. With the dir keyword and value abc/xyz, backup will create abc/xyz/arc.0.0 in the bucket.
If you have an existing S3 backup and want to start using
location
Specifies the Amazon region where a bucket is created or located. If omitted, US (us-east-1) is used. Possible values are:
US = same as us-east-1
EU = same as eu-west-1
any other valid S3 region
Region names are on Amazon’s S3 site: http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region
Buckets live in a specific region. It’s important that a bucket’s correct region be specified or all requests are sent to us-east-1 and then redirected to the proper region. Sometimes these redirects fail and cause connection reset errors.
partsize
This keyword specifies a fixed part size for multipart S3 uploads. The default is 0, meaning HB chooses a reasonable part size from 5MB (the smallest allowed) to 5GB (the largest allowed), based on the file size. When the partsize keyword is used, HB uses this part size to determine the number of parts needed, then "levels" the part size across all parts. For example, if uploading a 990MB file with a partsize of 100M, HB will use 10 parts of 99M each. The size can be specified as an integer number of bytes or with a suffix, like 100M or 100MB. The suffix is interpreted as a power of 2, so M means MiB, i.e., 100M = 100 * 1024 * 1024.
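The leveling above can be sketched as follows (leveled_parts is illustrative, not HB’s actual algorithm):

```python
import math

MB = 1024 * 1024

def leveled_parts(file_size, partsize):
    # Number of parts needed at the requested part size...
    nparts = math.ceil(file_size / partsize)
    # ...then spread the bytes evenly across that many parts.
    level = math.ceil(file_size / nparts)
    return nparts, level

# 990MB file with partsize 100M -> 10 parts of 99M each
nparts, level = leveled_parts(990 * MB, 100 * MB)
print(nparts, level // MB)  # 10 99
```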
multipart
This true/false keyword controls whether HB uses multipart uploads and downloads. The default is true, except for Google Storage, where the default is false because their S3-compatible API does not support multipart uploads.
When multipart upload is enabled, large arc files > 5GB are supported.
If multipart upload is disabled or not available, the S3 file size
limit is 5GB, so the config option
arc-size-limit should not be set
larger than 4GB.
debug
Controls the debugging level. When set to 1 or higher, extra debugging messages are either displayed or sent to a log file in the backup directory.
timeout (not supported)
The timeout for S3 connections is 5 minutes and cannot be changed.
Specifies the maximum upload bandwidth per worker. See Destination Setup for details.
Example S3 dest.conf
destname myS3
type s3
location US
accesskey myaccesskey
secretkey mysecretkey
bucket myaccesskey-hashbackup
dir myhost1
class ia
Amazon Infrequent Access has a 30-day delete penalty on all
objects, meaning that if you upload 1GB and then delete it the next
day, you still pay for 30 days of storage. To minimize download
costs, the default for config option
Example Minio dest.conf
destname minio
type s3
host play.minio.com
port 9000
multipart false
subdomain false
accesskey xxx
secretkey xxx
bucket xxx
location US
Example Storj dest.conf - 150GB free!
destname storj
type s3
host gateway.us1.storjshare.io
partsize 64m
secure
accesskey xxx
secretkey xxx
bucket xxx
Example Tebi dest.conf - 50GB free!
destname tebi
type s3
host s3.tebi.io
accesskey xxx
secretkey xxx
bucket xxx
|Tebi’s free allowance is 50GB for no redundancy, 25GB for a redundant copy, 12GB for 3 copies.|
Example Wasabi dest.conf
destname wasabi
type s3
host s3.wasabisys.com
accesskey xxx
secretkey xxx
bucket xxx
Wasabi has a 90-day delete penalty on all objects, meaning
that if you upload 1GB and then delete it the next day, you still pay
for 3 months of storage. Unlike Amazon S3, Wasabi does not have a
standard storage class without delete penalties for files such as