
Amazon S3

Amazon S3: https://aws.amazon.com/s3

Google Storage: https://cloud.google.com/storage

DreamHost DreamObjects: https://www.dreamhost.com/cloud/storage


HashBackup supports Amazon's S3 object storage service for offsite backup storage, as well as S3-compatible services such as Google Storage, DreamObjects, and SoftLayer.  This is the reference page for Amazon's S3 service.  Compatible services may have a separate page explaining special considerations.


Selective Download


Selective download is supported on S3 and compatibles, and allows HB to download only the parts of files that are needed for an operation, saving time, bandwidth, and download costs.


Lifecycle Policies & Glacier Transitioning


Amazon S3 has lifecycle policies that allow transitioning S3 files to Glacier and automatically deleting files after a certain time.  These should not be used with HashBackup: HB cannot access files transitioned to Glacier, and HB manages file retention and deletion itself.  As an alternative to Glacier transitioning, use S3's Infrequent Access storage class to reduce expenses (see the class keyword below).


File Transfer Verification


Uploads: HB generates a hash, the file is uploaded along with the hash, then S3 generates a hash of the file it receives and verifies that it matches the hash HB sent.  HB may use multipart upload, where all workers cooperate to send a single large file.  Upload hash verification occurs for regular and multipart uploads.


Downloads: if a file was sent without multipart upload, HB verifies that the MD5 hash of a downloaded file is the same MD5 that was sent.  Multipart uploads are not verified on download (but keep reading).


HB often requests partial file downloads to save download costs.  The S3 file hash cannot be verified with partial file downloads, but HB always verifies the SHA1 of every backup block before using it, and verifies the SHA1 of every file restored.


S3's dest.conf Keywords


type (required)


s3 for Amazon S3

gs for Google Cloud Storage

do for DreamHost DreamObjects

s3 for other S3-compatible services (use the host and port keywords)
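
For example, the first two lines of a dest.conf entry for a Google Cloud Storage destination might look like this (the destname value is arbitrary):

destname mygs
type gs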


host


This optional keyword is used with S3-compatibles to specify the region and host name.  It is not used for Amazon's S3 service - only for compatibles.


port


This optional keyword is used with S3-compatibles to specify the port where the S3 service is running.  The default is port 80.  It is not used for Amazon's S3 service - only for compatibles.
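
For example, an S3-compatible service answering on a non-standard port might be configured with lines like these (the host and port values are placeholders for your own service):

type s3
host s3.example.com
port 9000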


subdomain


This optional true / false keyword is useful for S3-compatibles that do not support bucket addressing as part of the host name.  Using subdomain false makes self-hosted Minio installations easier to manage because adding new buckets does not require DNS changes.  The default is true.


accesskey and secretkey


Your access and secret keys can be specified in the dest.conf file with the accesskey and secretkey keywords.  These can be used with any S3-compatible destination and take priority over environment variables (below).


SECURITY NOTE: your access key is not a secret, does not have to be protected, and is sent in the headers with every request.  It is like a username or account id.  But,


   YOUR SECRET KEY SHOULD BE PROTECTED AND NEVER DISCLOSED


Your secret key is not sent with your request. It is the password to your S3 account and is used to sign requests.  Files containing your secret key should be protected.


If the accesskey and/or secretkey keywords are not in dest.conf, environment variables are checked.  These environment variables have different names for each provider, allowing you to have both Amazon and Google Storage accounts configured.  The environment variable names are:


Amazon S3:               AWS_ACCESS_KEY_ID    AWS_SECRET_ACCESS_KEY

Google Storage:          GS_ACCESS_KEY_ID     GS_SECRET_ACCESS_KEY

DreamHost DreamObjects:  DO_ACCESS_KEY_ID     DO_SECRET_ACCESS_KEY


Set and export the environment variables in your login script.  For example, in .bashrc in your home directory:


   export AWS_ACCESS_KEY_ID=myverylongaccesskey 

   export AWS_SECRET_ACCESS_KEY=myverylongsecretkey


NOTE: for Google Storage, you must generate Developer Keys.  See: https://developers.google.com/storage/docs/migrating#migration-simple
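
For example, Google Storage keys are exported the same way (the key values are placeholders):

   export GS_ACCESS_KEY_ID=mygoogleaccesskey
   export GS_SECRET_ACCESS_KEY=mygooglesecretkey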


bucket (required)


An S3 destination requires a bucket name.  The bucket will be created if it doesn't exist.  If the location keyword is not used, the bucket is created in the default region, us-east-1.


Bucket names are globally unique, so names like "backup" are probably taken.  Add a company name, host name, random number, or random text as a prefix or suffix to make your bucket name unique.
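
For example, a dest.conf bucket line with a hypothetical unique name:

bucket acme-www3-hashbackup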


For S3-compatible services you may need to create the bucket before using HashBackup, especially to customize bucket settings like storage class.


Bucket names must be 3-63 characters, must start and end with a letter or digit, and can contain only letters (case insensitive), digits, and dashes.  More bucket name rules at:


http://docs.aws.amazon.com/AmazonS3/latest/dev/BucketRestrictions.html


class


S3 and compatible services often have multiple storage classes, with different classes having different cost structures, features, and/or restrictions.  Some services such as Google Storage set the storage class at the bucket level, while others such as Amazon S3 set the storage class at the object (file) level.  For services with bucket-level storage classes, use their website to set the storage class.  The class keyword is used for Amazon S3 to set the storage class of uploaded files.  There are two possible values:


- standard sets the standard S3 storage class (default if class isn't used)

- ia sets the Infrequent Access storage class

- Reduced Redundancy Storage is not supported by HashBackup


HashBackup may store individual files in the standard class if it will be cheaper than IA storage.  For example, small files are cheaper to store in standard storage because IA has a 128K minimum file size.  Files that might be deleted soon are stored in standard storage to avoid the early delete penalty, though a file's lifetime is usually hard to predict.


dir


This keyword allows many backups to be stored in the same bucket by prepending the keyword value to the backup filename.  Without the dir keyword, backup will create arc.0.0 in the top level of the bucket.  With the dir keyword and value abc/xyz, backup will create abc/xyz/arc.0.0 in the bucket.


IMPORTANT: if you have an existing S3 backup and want to start using dir, you will have to use an S3 utility to move the backup files already stored.  The easiest way is to use S3 copy requests to create the new objects, then delete the old objects.  Then add the dir keyword.
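
For example, one possible approach uses the AWS CLI to move everything in the bucket under the new prefix; the bucket name and the dir value abc/xyz below are placeholders, and the --exclude option keeps objects already under the new prefix from being processed again:

   aws s3 mv s3://mybucket/ s3://mybucket/abc/xyz/ --recursive --exclude "abc/*"

Any S3 utility that can copy and then delete objects will work; after the files are moved, add dir abc/xyz to dest.conf.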


location (recommended)


Specifies the Amazon region where a bucket is created or located.  If omitted, US is used (us-east-1).  Possible values are:


US = same as us-east-1

EU = same as eu-west-1

any other valid S3 region


Region names are on Amazon's S3 site: http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region


Buckets live in a specific region.  It's important to specify a bucket's correct region; otherwise, all requests are sent to us-east-1 and then redirected to the proper region.  These redirects sometimes fail and cause connection reset errors.


multipart


This true/false keyword controls whether HB uses multipart uploads.  For Amazon S3 and Dreamhost, the default is true. For Google Storage, the default is false because their S3-compatible API does not support multipart uploads.


debug


Controls debugging level.  When set to 1 or higher, extra debugging messages are either displayed or sent to a log file <destname>.log in the backup directory.


timeout (not supported)


The timeout for S3 connections is 5 minutes and cannot be changed.


rate


Specifies the maximum upload bandwidth per worker.  See Destinations for details.



Example S3 dest.conf:


destname myS3

type s3

location US

accesskey myaccesskey

secretkey mysecretkey

bucket myaccesskey-hashbackup

dir myhost1

class ia


Example SoftLayer dest.conf:


destname softlayers3

type s3

host s3-api.us-geo.objectstorage.service.networklayer.com

accesskey xxx

secretkey xxx

bucket xxx

location US


Example Minio dest.conf:


destname minio

type s3

host play.minio.com

port 9000

multipart false

subdomain false

accesskey xxx

secretkey xxx

bucket xxx

location US
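

A Google Cloud Storage destination follows the same pattern but does not need the host, port, or multipart keywords (multipart already defaults to false for Google Storage); the key and bucket values below are placeholders.

Example Google Storage dest.conf:

destname myGS
type gs
accesskey xxx
secretkey xxx
bucket xxx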
