#1844 Mar 19, 2017 - expires Jul 15, 2017


* IMPORTANT NOTICE: removing obsolete euca support
* expiration date bumped to July 15, 2017
* database upgrade to dbrev 24
* up to 3x faster to create hb.db.N
* hb.db.N files removed from local backup directory
* hb.db.N space savings on remotes
* Google Storage: note on storage classes
* rekey and export are 40% faster, and some fixes
* recover: faster, Glacier code removed, other changes
* backup: 5-6% faster on small files
* backup: add Mem statistic to show peak RAM usage
* backup: fix exception traceback
* backup: fix slow backup of sparse "hole only" files
* better handling of cache-size-limit 0
* S3: update Amazon S3 region list, adding 4 regions
* S3: update boto to latest version
* S3: delete incomplete multipart uploads
* S3: IBM Softlayer S3-compatible support (documentation)
* get: add .hberror to partially restored files
* dest verify/sync: always transmit dest.db


- IMPORTANT NOTICE: Eucalyptus Systems had one of the first
  S3-compatible object stores, "Walrus", and HashBackup supported an
  S3 destination type of "euca".  This was necessary because when
  initially launched, Walrus used a weird pathname to access buckets
  and HB didn't support host and port keywords for s3 destinations.

  Now, a host and port keyword can be added to the s3 destination
  type, making the euca destination type obsolete.  The euca
  destination type will be removed very soon.  If that's a problem,
  please send an email.

- this release does an automatic database upgrade when any HB command
  is used.  Some procedures have changed and the upgrade prevents
  access by earlier HB releases.

- every HashBackup command that modifies the database has to create an
  hb.db.N file to send changes to the remotes.  This is up to 3x
  faster in this version.

- hb.db.N files are no longer kept in the local backup directory,
  saving local disk space.  Running this release will remove local
  hb.db.N files.

- hb.db.N files use less remote storage space and HB tries to avoid
  the "delete penalty" for these files on Amazon Infrequent Access.
  This will shorten recover time and lower storage and download costs.

  HB only manages the storage class when the class keyword is used in
  dest.conf.  If the class keyword is not used, all objects are stored
  in the bucket's default storage class.  This will not be optimal on
  Google Nearline / Coldline because of the delete penalty.

  The first backup with this release will send one or more hb.db.N
  files and then will delete all of the old ones; don't panic - this
  is expected.  If you notice any changes that are not beneficial,
  please send an email with details (ls -l backupdir/hb.db* and hb
  dest ls).

- the Google Storage service supports 4 storage classes:

     multi-regional  regional  nearline  coldline

  Unfortunately, HashBackup isn't able to dynamically set the storage
  class with Google Storage like it does with Amazon S3, maybe because
  HB is using the S3 interface.  Instead, the storage class is
  determined by the default storage class on the bucket.  The downside
  is that Nearline and Coldline have 30/90 day "delete penalties" that
  HashBackup can't avoid like it often can for Amazon S3 by managing
  the storage class on individual files.

  The delete penalty can be fairly severe with Google Coldline
  storage.  With Coldline, you get a 65% discount from regional
  prices, or 30% discount from Nearline, but all files are charged for
  at least 90 days of storage.  If a file is added one day and deleted
  the next, you're charged for 90 days: 90x the cost of the one day used.
  Therefore Coldline is not recommended for use with HashBackup unless
  you retain all files for a minimum of 90 days.  It can be very
  difficult to compare pricing with delete penalties, and the only way
  to know for sure is to run parallel backups to separate buckets for
  a month or so to see which is more cost effective for your backup
  data's access patterns.
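
  To make the delete penalty arithmetic concrete, here is a rough
  sketch.  The prices are hypothetical placeholders (the second is
  simply 65% below the first), a month is treated as 30 days, and the
  function names are made up for this example; check current Google
  pricing before drawing conclusions.

```python
def billed_days(days_stored, min_days):
    """Days of storage charged when a class has a minimum storage duration."""
    return max(days_stored, min_days)


def storage_cost(size_gb, price_per_gb_month, days_stored, min_days=0):
    """Approximate storage charge, treating a month as 30 days."""
    days = billed_days(days_stored, min_days)
    return size_gb * price_per_gb_month * days / 30.0


# Hypothetical prices, for illustration only.
REGIONAL_PRICE = 0.020   # $/GB/month, no minimum duration
COLDLINE_PRICE = 0.007   # $/GB/month, 65% lower, 90-day minimum

# 100GB stored for one day, then deleted.
regional = storage_cost(100, REGIONAL_PRICE, days_stored=1)
coldline = storage_cost(100, COLDLINE_PRICE, days_stored=1, min_days=90)
print("regional: $%.4f  coldline: $%.4f" % (regional, coldline))
```

  In this sketch the short-lived data costs far more on the cheaper
  class, which is why retention time matters more than the per-GB
  price.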

- rekey (and export) is 40% faster to improve scalability

- rekey aborts if any destinations halt, to prevent a failed
  destination having files associated with the old key.

- rekey was not restoring the old key file on errors.

- recover: following a recover, the next backup would send a large db
  update that was unnecessary.

- recover: reconstructing the database is faster: around 35% faster
  for the HashBackup build server backup.

- recover: the Glacier download pacing code - half of the code in
  recover - was removed since Glacier is no longer supported.

- recover: add retry on dest.db download errors 

- recover: the -c option is now required.  If the backup directory
  doesn't exist, recover offers to create it.

- an empty key file caused a traceback but now displays an error

- recover warned about overwriting an existing database only when the
  -a option was used.  But recover always overwrites the database, so
  this warning should always be issued.

- recover: display a progress message while downloading arc files

- backup is 5-6% faster when the backup is primarily small files

- backup: a new Mem: statistic shows peak memory used.  This includes
  the two largest RAM uses: the dedup table and database cache.

- backup: if a sparse file hole was greater than 2GB, HB could fail
  with a traceback: Exception in shaq_loop.  Thanks Mark!

- backup: if a sparse file was only a hole with no data, it was backed
  up normally and could take a long time instead of just a few
  seconds.  Thanks David!

- the config option cache-size-limit controls the amount of arc data
  kept in the local backup directory.  Setting it to zero means you
  don't want any arc data stored locally, but that can cause backup
  performance problems.  The cache size is now raised to at least 2 *
  arc-size-limit (2 arc files) while HashBackup is running, then
  trimmed lower if necessary when it's finished.  This makes a cache
  limit of zero work much better.
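
  The rule above can be sketched as follows.  This is only an
  illustration, not HashBackup's code: the function name is made up,
  and treating a negative limit as "no limit" is an assumption.

```python
MB = 1000 * 1000


def effective_cache_limit(cache_size_limit, arc_size_limit):
    """Cache limit used while running: never below two arc files.

    Assumption: a negative cache_size_limit means "unlimited" and is
    returned unchanged.
    """
    if cache_size_limit < 0:
        return cache_size_limit
    return max(cache_size_limit, 2 * arc_size_limit)


print(effective_cache_limit(0, 100 * MB))
```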

- Amazon S3: the region list was updated, adding us-east-2, eu-west-2,
  ap-south-1, and ap-northeast-2

- S3: the boto library used by HashBackup to access S3 was updated to
  the latest version

- S3: for large files, HB uses multipart uploads: instead of uploading
  a 100MB file with 1 worker, it might be uploaded by 4 workers in
  25MB sections.  But if a multipart upload is interrupted, the
  partial uploads hang around indefinitely on S3 unless you have a
  "lifecycle policy" for them, and you are billed every month for
  storage costs.  These files don't show up in the online S3
  Management Console.  Now HB deletes any incomplete uploads every
  time it starts.  There may be quite a few of these the first time.
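
  For anyone wanting to do a similar cleanup outside of HB, here is a
  minimal sketch.  It is not HashBackup's code (HB uses boto); it
  assumes a boto3-style S3 client object and the function name is
  made up.  Note that it aborts every incomplete upload in the
  bucket, including any that another process is still working on.

```python
def abort_incomplete_uploads(client, bucket):
    """Abort every in-progress multipart upload in a bucket.

    `client` is assumed to behave like a boto3 S3 client.  Returns
    the number of uploads aborted.
    """
    aborted = 0
    kwargs = {"Bucket": bucket}
    while True:
        page = client.list_multipart_uploads(**kwargs)
        for upload in page.get("Uploads", []):
            client.abort_multipart_upload(
                Bucket=bucket,
                Key=upload["Key"],
                UploadId=upload["UploadId"],
            )
            aborted += 1
        if not page.get("IsTruncated"):
            return aborted
        # More results remain: continue from where this page ended.
        kwargs["KeyMarker"] = page["NextKeyMarker"]
        kwargs["UploadIdMarker"] = page["NextUploadIdMarker"]
```

  With boto3 installed, this could be called as
  abort_incomplete_uploads(boto3.client("s3"), "mybucket"), where the
  bucket name is a placeholder.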

- S3: the IBM / SoftLayer Standard Cross Region storage service has an
  S3-compatible API that works with HashBackup.  This is just a
  documentation change, not a code change.  Thanks Jonathan for
  testing and sharing your dest.conf info!

- get: if a file is only partially restored (disk full for example),
  the .hberror extension is added to the filename to make it obvious
  that the file is incomplete
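
  The idea can be sketched like this.  It is a simplified
  illustration rather than the actual get code, and the function
  names are made up.

```python
import os


def mark_incomplete(path, suffix=".hberror"):
    """Rename a partially restored file so it is obviously incomplete."""
    marked = path + suffix
    os.replace(path, marked)
    return marked


def restore_file(path, chunks):
    """Write chunks to path; on an OS-level error, tag the partial file.

    Returns the final pathname, which ends in .hberror on failure.
    """
    try:
        with open(path, "wb") as f:
            for chunk in chunks:
                f.write(chunk)
    except OSError:
        # Disk full or similar: keep what was written, but flag it.
        return mark_incomplete(path)
    return path
```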

- dest verify/sync: if dest.db was missing on a remote, it was not
  transferred.  Now it is always transmitted on a dest sync or verify.

#1781 Feb 15, 2017 - expires Apr 15, 2017


* dest.conf bug fix (keyword requires a value)
* rclone workarounds
* dest verify with rclone is 10-15x faster
* install new script!


- dest.conf: if a required keyword did not have a value, a traceback
  would occur (TypeError: not all arguments converted during string
  formatting) rather than the correct error message, "keyword requires
  a value: (keyword)".  Reported via email by HB traceback reporter.

- rclone has a few issues that require workarounds in HB and have been
  filed on GitHub.

- hb dest verify does one ls command now instead of one per remote
  file and is 10-15x faster.  The output for rclone is more detailed
  now, with separate errors for "file not found" and "file size is
  wrong".  Checksums are not supported on shell destinations.
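
  A rough sketch of verifying against a single listing, with the two
  separate errors mentioned above.  This is not the HB code: the
  function name and the dict-based listing format are assumptions
  made for the example.

```python
def verify_against_listing(expected, listing):
    """Compare expected {name: size} with one remote listing {name: size}.

    Returns a list of (name, problem) tuples; an empty list means
    every expected file was found with the right size.
    """
    problems = []
    for name, size in sorted(expected.items()):
        if name not in listing:
            problems.append((name, "file not found"))
        elif listing[name] != size:
            problems.append((name, "file size is wrong"))
    return problems


expected = {"arc.0.0": 100, "arc.0.1": 200, "arc.0.2": 300}
listing = {"arc.0.0": 100, "arc.0.1": 150}
for name, problem in verify_against_listing(expected, listing):
    print("%s: %s" % (name, problem))
```

  One listing call replaces one remote command per file, which is
  where a speedup like this would come from.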

- the updated script must be installed manually by copying
  it from doc/dest.conf.examples to where the run command in dest.conf
  needs it to be.

#1779 Feb 9, 2017 - expires Apr 15, 2017


* rclone destinations support "hb dest verify" command


- the hb "dest verify" command verifies the presence and size of files
  on a destination without downloading the file.  This is now
  supported by the rclone shell destination.  For this to work, the HB
  program has to be upgraded with hb upgrade *and* the
  script has to be updated manually.  The script is in
  doc/dest.conf.examples/ in the tar file on the HB website.

#1778 Feb 8, 2017 - expires Apr 15, 2017


* get: restore planning is 30% faster
* get: restores symlink modification time
* get: display progress while creating plan
* get: fix hang with multiple download workers
* get: fix hang after restore errors


- get: creating a restore plan, used when cache-size-limit is set, is
  30% faster

- get: reset symlink modification time to the value stored in the
  backup rather than the time it was restored.  Some older OSs cannot
  set symlink modification times; in that case, symlinks will have the
  time they were restored.  Thanks Roy!
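
  In Python 3 terms, setting the time on the link itself rather than
  its target might look like the sketch below.  This is an
  illustration only, not HB's code, and the function name is made up.
  Note that it sets the access time to the same value.

```python
import os


def restore_symlink_mtime(path, mtime):
    """Set a symlink's own mtime (and atime); return False if unsupported."""
    if os.utime not in os.supports_follow_symlinks:
        # This platform can't set times on the link itself.
        return False
    try:
        os.utime(path, (mtime, mtime), follow_symlinks=False)
    except NotImplementedError:
        return False
    return True
```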

- get: displays progress while creating a restore plan when
  cache-size-limit is set.  Thanks to Roy for the suggestion.

- get: when cache-size-limit is set and the restore size is greater
  than cache-size-limit, a race condition could cause the restore to
  hang if multiple download workers were active.  Thanks to Roy for
  exporting his 3TB backup for testing!

- get: if errors occurred while restoring files, it could cause a hang.
  Now it works, even with 50% injected random errors on a 3TB restore.

#1762 Jan 30, 2017 - expires Apr 15, 2017


* rclone: updated script
* get: re-enable multiple download threads


- the script wasn't working with the mount command.  It is located
  in doc/dest.conf.examples of the tar file on the HB website and has
  to be updated manually.

- get: previously, concurrent downloads were disabled to avoid
  splitting bandwidth resources across multiple files.  This works
  well for low-latency storage services, but is not so great for
  high-latency services where it takes a while to get a download
  started.  For now, concurrent downloads are re-enabled.  To do this
  right, HB needs to adjust downloads dynamically.

#1761 Jan 23, 2017 - expires Apr 15, 2017


* backup: ssh destination initialization fix
* rclone: updated script
* backup: compensate for getcwd() bug on Illumos in LX zone
* backup: fix a very rare race condition causing traceback


- backup: with some ssh servers, ssh destinations had trouble
  initializing, displaying 3 errors about DESTID and then disabling
  the destination.

- rclone: HashBackup can use rclone to communicate with storage
  services not directly supported by HB.  The script to
  enable this was updated to use the rclone copyto command.  This
  makes downloads more efficient because copyto doesn't do remote
  directory listings.  The script was also changed to do unconditional
  transfers because rclone sometimes thinks a remote file doesn't
  need updating when it actually does (same size file on a remote that
  doesn't support checksums, like Dropbox).  Because of this change,
  the --verify option had to be eliminated to avoid transfer loops
  related to "eventual consistency" on many remote storage services
  (files are not necessarily immediately available after an upload).

  The new script is in doc/dest.conf.examples and must be
  installed manually.  Rclone will be built-in to HashBackup in the
  next release to avoid having to do a manual script update: it will
  get updated with hb upgrade just like the rest of HB.

- backup: HashBackup will run in an LX zone on Illumos, a descendant
  of Solaris.  LX zones emulate Linux under Illumos.  When the dedup
  table is resized during a backup, getcwd() is called.  There is a
  bug in LX zones causing getcwd() to fail with "No such file or
  directory" and backup can't finish.  As a workaround, the getcwd()
  call was removed.

- backup: in very rare circumstances that happened to be triggered by
  the previous Illumos bug, a race condition combined with a nested
  exception could cause a traceback:

    Exception in thread shaq_loop: unsupported operand type(s) for +:
        'int' and 'NoneType'
    Exception in thread shaq_loop: an integer is required

  It took a week to figure out why this was happening... Ugh!

#1751 Dec 30, 2016 - expires Apr 15, 2017


* backup: sometimes created oversize arc files
* backup: multi-thread performance improved 10-15%
* b2: increase internal retries
* rare selftest bug fix


- backup: multi-threaded backup of a series of medium-sized files,
  especially if already compressed, was sometimes ignoring the signal
  to start a new arc file.  This could create arc files much larger
  than arc-size-limit, like 200MB-8GB instead of 100MB.  This bug
  started in August.  Now arc sizes should be much more controlled.

- backup: multi-threaded performance has improved 10-15% in some
  cases

- b2: the Backblaze B2 driver has a small internal retry loop in
  addition to the outer retry controlled by the retry dest.conf
  keyword.  The internal retry loop now tries 7 times instead of 3.
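
  A small bounded retry loop of this kind might look like the sketch
  below.  It is not the B2 driver's code: the function name, the
  choice of OSError as the retryable error, and the delay parameter
  are all assumptions for the example.

```python
import time


def with_retries(operation, attempts=7, delay=0.0):
    """Call operation() up to `attempts` times; re-raise the last error."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except OSError:
            if attempt == attempts:
                raise
            time.sleep(delay)
```

  An outer retry, like the one controlled by the retry keyword, would
  wrap calls to a function like this one.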

- selftest: in very rare cases, selftest could display an error:
     Error: block xxx arcdel yyy blen zzz
  This was a bug in selftest -- the backup is fine.  Thanks Evan!

#1747 Dec 23, 2016 - expires Apr 15, 2017


* B2: fix connection reset by peer, again
* B2: add Content-Length header on B2


- B2: fix "connection reset by peer" errors, for real this time.  The
  file size limit on B2 is 5,000,000,000 bytes, not 5GiB.  This error
  only occurs on large initial backups, like 6TB.
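
  The difference between the two limits is easy to miss, so here is
  the arithmetic.  The constant and function names are made up for
  this example.

```python
B2_MAX_FILE_BYTES = 5000000000   # decimal limit: 5,000,000,000 bytes
FIVE_GIB = 5 * 2**30             # binary 5GiB: 5,368,709,120 bytes


def fits_in_one_b2_file(size):
    """True if a file of `size` bytes is within the decimal limit."""
    return size <= B2_MAX_FILE_BYTES


print("5GiB exceeds the limit by %d bytes" % (FIVE_GIB - B2_MAX_FILE_BYTES))
```

  A file sized for a 5GiB limit can be about 369MB too large.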

- B2: the Content-Length header is documented as required, so now
  HashBackup sends it (even though it seems to work fine without it).

#1742 Nov 30, 2016 - expires Apr 15, 2017


* bump expiration date for backup command to April 15, 2017
* cacerts.crt: fix problem with B2 on BSD


- b2: beginning with #1715 around Nov 7th, HashBackup could not
  connect to B2 from BSD systems because of this error:

    [SSLError] [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)

  In the backup directory, cacerts.crt is the root certificate file
  HashBackup uses to verify SSL connections.  On Nov 7th it was
  updated, and this update broke SSL connections to B2,
  but only on BSD; Linux and MacOS worked fine.  Apparently a limit on
  the number of certificates was exceeded.  Thanks Mark for reporting
  the problem.

#1738 Nov 27, 2016 - expires Jan 15, 2017


* backup: 10-15% faster (disable compression verification)
* backup: fix traceback when all backup data was removed
* rm: don't keep backup file when compressing database
* backup: --maxtime accuracy improved
* backup: --maxtime sometimes didn't back up anything


- backup: when the new compression code was launched in August, all
  data was verified to decompress correctly before being written to
  the backup.  It has been nearly 4 months without an error traceback,
  so verification has been turned off for a 10-15% performance gain.

- backup: if all backup data was removed with rm, a traceback could
  occur on the next backup (from internal testing):
    TypeError: unsupported operand type(s) for +: 'NoneType' and 'int'

- rm: when a lot of data is removed from a backup, rm compresses the
  database.  This was rewritten a couple of months ago and as a
  precaution, a backup of the original database was saved as
  hb.db.orig.  This file can be fairly large and no problems have been
  reported, so in this release, the backup file is kept during
  compression and then deleted.

- backup: --maxtime sets a time limit for the backup.  It's not very
  accurate for technical reasons and tends to run over the limit.  In
  this release the overrun is less than in previous releases.

- backup: when inex.conf was added to every backup a few months ago,
  it broke the --maxtime restart feature in some cases.  One backup
  would work, the next would be empty, the next would work (but wasn't
  a true restart), etc.

#1735 Nov 22, 2016 - expires Jan 15, 2017


* backup: don't create huge arc files with -p0
* backup: suppress some block size change messages


- backup: with -p0, backup would sometimes create huge arc files.  One
  arc file in a customer's backup was 21GB, even though arc-size-limit
  was 100MB.

  The backup is fine, but if you have this situation it is recommended
  that you either remove backups in reverse order starting with the
  last (most recent) backup and going back to the first backup where
  these huge arc files occur; or start a new backup and keep this
  backup until you no longer need it for retention purposes.

  If necessary, you can continue to use an existing backup with huge
  arc files, but it will be inefficient because HashBackup will not
  want to pack the huge arc files until 50% of the data is removed by
  retain.  Apologies for this one, and thanks Lourens for finding it!

- backup: if a backup block size change is normal, don't display the
  warning message

#1733 Nov 20, 2016 - expires Jan 15, 2017


* ls -a bug: displays a -> b -> c -> d for symlinks
* dest verify: bug fix for rsync destinations
* backup: --maxtime bug fix
* backup: detect block size change


- ls: with the -a option, a symlink or LV snapshot backup with
  multiple versions displayed an additional -> symbol on each version.
  Thanks Emanuele!

- dest verify: some rsync servers caused an error "unexpected rsync
  output" during a dest verify command.  Thanks Soren!

- backup: if --maxtime was specified and the time ran out, the backup
  stopped (correct) but lots of pending uploads continued to be
  transmitted (incorrect).  Now backup completes uploads that are in
  progress and stops much quicker.  Thanks Robert!

- backup: if the backup block size changes from the previous backup, a
  full backup warning message is displayed for affected files.

#1725 Nov 12, 2016 - expires Jan 15, 2017


* dest verify: more conservative about failures
* dest verify: accept WebDAV status 204
* WebDAV: more secure authentication over http
* selftest bug: sometimes did not detect incorrect file references


- if dest verify encountered any kind of error while trying to verify
  a file, it marked the file as "not transmitted", forcing it to be
  sent again.  This isn't correct behavior if a destination was only
  temporarily unavailable but did have the files.  Now verify will
  only mark files for retransmission when it gets a response from the
  remote indicating files are not there or are the wrong size.
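
  The more conservative rule can be sketched like this.  It is an
  illustration, not the dest verify code: stat_remote is a
  hypothetical callable that returns the remote file size, raises
  FileNotFoundError when the remote says the file is absent, and
  raises anything else on a temporary failure.

```python
def classify(stat_remote, name, expected_size):
    """Return 'ok', 'missing', 'wrong-size', or 'unknown' for one file."""
    try:
        size = stat_remote(name)
    except FileNotFoundError:
        return "missing"        # definite answer from the remote
    except Exception:
        return "unknown"        # remote unavailable: no conclusion
    return "ok" if size == expected_size else "wrong-size"


def needs_retransmit(status):
    """Only a definite negative answer forces the file to be re-sent."""
    return status in ("missing", "wrong-size")
```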

- WebDAV: some WebDAV servers (4shared) return a status of 204 when
  checking if a file exists.  From internal testing.

- WebDAV: when the secure keyword is omitted, try digest
  authentication before regular authentication.

#1715 Nov 6, 2016 - expires Jan 15, 2017


* selftest --fix: ask to remove missing arc files
* dest verify: better verification on rsync destinations


- selftest: if an arc file goes missing, for example, if it is deleted
  by accident, selftest printed an error that the file did not exist
  locally nor on any destination.  Now, if --fix is used and selftest
  is running interactively, it will ask whether to remove the missing
  arc file from the backup.  This will remove all blocks in the arc
  file, then all files referencing those blocks.

- dest verify can now verify the file size on rsync destinations even
  when cache-size-limit is set and an arc file isn't present locally.
  Previously it could only verify that the remote file existed.

#1709 Nov 3, 2016 - expires Jan 15, 2017


* database upgrade to dbrev 23
* backup, rm, retain: generating database updates is 45% faster
* b2: better error on file not found
* b2: better handling for Dir changes in dest.conf
* ssh: better error handling when fetching DESTID
* ssh: cleanup .stdin and .stdout temp files
* dest verify: don't verify files on inactive destinations
* dest verify: trouble with rsync and cache-size-limit
* HB could halt on errors while removing temp files


- this release will do an automatic database upgrade to dbrev 23 when
  any HB command is used.  Once the database has been upgraded,
  previous versions of HB cannot be used with it.  This database
  upgrade has two purposes:
  1) delete zero-length arc files created by a bug in dest verify for
     rsync destinations with cache-size-limit set (see below)

  2) prevent old releases from doing a recover since they would not
     understand the new remote database format (see next change)

- after every command that modifies the backup, HB has to send a
  database update to all remotes.  When there are lots of changes,
  creating this database update is up to 45% faster.

- b2: if a file could not be found when downloading, the filename
  wasn't included in the error message.

- b2: if the Dir keyword is changed on a destination, the next access
  causes a warning about ID mismatches.  Running dest setid fixes the
  error.  But if there were already backups in the old Dir and later
  backups are made to the new Dir, the backup files are split across 2
  directories on B2.  This is very confusing.  Now, if you try to
  download files in this state, HB will display a warning that the
  file is stored in the wrong directory.  It will still retrieve the
  file in case it's the only copy.  Thanks Robert!

  Related to this, dest verify would verify all files, even if stored
  in an unexpected directory.  Now it will complain about files that
  are in the wrong directory and upload them to the right directory if
  there is another copy available.  Thanks Robert!

- ssh: the DESTID file uniquely identifies remote storage areas.  If
  an ssh protocol error occurred while fetching the DESTID file, HB
  would complain that the destination ID did not match - a misleading
  error message.  Now the ssh protocol error message is displayed
  instead.  Thanks Robert!

- ssh: temporary files ending in .stdin and .stdout are created by the
  ssh destination.  Usually they are deleted, but if something goes
  wrong, they hang around.  Now they have a .tmp suffix so HB will
  delete them on the next command rather than letting them accumulate.

- dest verify was trying to verify files stored on inactive
  destinations, causing a traceback.  Internal testing.

- dest verify on rsync destinations with cache-size-limit set >= 0
  would create zero-length arc files in the main backup directory for
  any arc files that were only remote, then give a "no such file or
  directory" error.  It didn't hurt anything, but also didn't verify
  the remote file and left empty arc files in the backup directory.
  Thanks Marcin!

- if HB had a problem cleaning up temporary files, it could abort with
  a traceback:
   OSError: [Errno 1] Operation not permitted: 'backupdir/hb-xxx.tmp'
  Now it displays an error message and continues.
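
  The "report and continue" behavior might be sketched as follows.
  This is not HB's code; the function name and message format are
  made up for the example.

```python
import glob
import os
import sys


def remove_temp_files(backupdir):
    """Delete *.tmp files; report failures instead of aborting.

    Returns the list of paths that could not be removed.
    """
    failed = []
    for path in glob.glob(os.path.join(backupdir, "*.tmp")):
        try:
            os.remove(path)
        except OSError as err:
            print("Error removing %s: %s" % (path, err), file=sys.stderr)
            failed.append(path)
    return failed
```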

#1698 Oct 23, 2016 - expires Jan 15, 2017


* new subcommand "verify" for hb dest
* recover: fix errors downloading segmented files
* fix: sometimes not deleting .tmp files
* b2: fix "almost dashes" in bucket names


- HashBackup assumes that once it stores a file on a remote
  destination, the file stays there until HashBackup deletes it.  This
  is usually true, but it can happen that files are manually deleted
  on a destination, and HashBackup gets no notice of this.

  Selftest -v4 verifies backup data by downloading it from all
  destinations, decrypting, decompressing, and verifying the hash of
  every block, and has an incremental option for huge backups that
  cannot be verified in a single run.  It's a very thorough check, but
  also very time-consuming.

  In this release, a new hb dest subcommand, "verify", will do a much
  quicker test of remote backup data without downloading it:

  $ hb dest -c backupdir verify

  This "best-effort" validation of remote files includes some or all
  of these checks:

  * check that all files actually exist on every destination
  * check that file sizes are correct
  * verify the checksum for destinations that support it

  Some remote storage like S3, Google Storage, and B2 are able to do
  all of these checks, while other destination types may only perform
  one or two checks.  All destinations support the new verify command
  except the "shell" destination (for user scripting).

  If a file fails to verify on a destination and there are copies
  elsewhere, it is marked missing on that destination and will be
  re-uploaded during the verify command.

- the dest.conf keyword "maxsize" is used to limit the size of files
  uploaded to remotes.  If a file to upload is larger than maxsize, HB
  splits it into parts before uploading them.  This is different than
  multipart or "large file" uploads - features supported by some
  storage services.

  Running recover to retrieve files that had been split would
  sometimes cause bizarre errors like checksum errors (S3/Google
  Storage), hash mismatches (B2), file size mismatches, arc version
  errors, and probably others.  The remote backup data was fine.

  This recover bug has been fixed.  The only time it would be seen in
  practice is for a large backup that created an hb.db.0 file larger
  than 5GB, meaning hb.db is probably bigger than 10GB.  This would be
  a backup of 30M+ files.  The error would only occur in the recover
  command.  From internal testing.

- sometimes HB didn't remove old .tmp files when it started

- b2: when the Backblaze B2 website displays bucket names it displays
  funky "dash-like" characters. Visually they look fine, but if cut
  and pasted into a dest.conf file, the dash is not a real dash and
  HashBackup complained that the bucket name is invalid.  Now HB
  changes the funky dash into a real dash.  A bug was filed with
  Backblaze.  From email traceback.
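
  A minimal sketch of this kind of cleanup.  The exact characters
  the B2 website used are not stated here, so the set of dash-like
  characters below is an assumption, as is the function name.

```python
# Assumed set: Unicode hyphens and dashes U+2010 to U+2015, plus the
# minus sign U+2212.
DASH_LIKE = "\u2010\u2011\u2012\u2013\u2014\u2015\u2212"
_TO_HYPHEN = {ord(ch): "-" for ch in DASH_LIKE}


def normalize_dashes(name):
    """Replace dash look-alikes with a plain ASCII hyphen."""
    return name.translate(_TO_HYPHEN)


print(normalize_dashes("my\u2011backup\u2011bucket"))
```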

#1676 Oct 15, 2016 - expires Jan 15, 2017

- s3: fix traceback "No module named _strptime" caused by removing
  Glacier code

#1675 Oct 12, 2016 - expires Jan 15, 2017


* IMPORTANT SECURITY CHANGE: ls permission checks removed
* IMPORTANT SECURITY CHANGE: get permission checks removed
* ls: -d replaces --alldirs, -1 (one) replaces --onerev
* ls: only 1 pathname pattern on command line
* ls: pathname matching changed: no automatic wildcards
* ls: up to 27% faster for patterns matching many files
* ls: pathnames patterns beginning with / are faster
* ls: -d (was --alldirs) always sets -a
* ls: find personality disorder
* selftest: faster and uses less memory
* selftest: -v4 transferring same arc file
* selftest: --inc could sometimes check all archives
* Glacier removed from HashBackup


- IMPORTANT SECURITY CHANGE: ls tried to emulate Unix permission
  checking before listing files in the backup.  There were edge cases
  where it was either too permissive or too restrictive, it didn't
  handle ACLs at all, and it's possible that permission checking
  varies slightly from one platform to another.  Rather than having a
  false sense of security, ALL of the permission checking in ls has
  been removed.  Now, anyone with read access to your HashBackup
  backup directory can list the metadata (permissions, filenames, file
  sizes, ...) for all files in the backup.  To secure your backup,
  secure your HashBackup directory with Unix permissions or ACLs.

- IMPORTANT SECURITY CHANGE: get tried to emulate Unix permissions,
  like ls, and that has also been removed in this version (see above).
  Now, anyone with read access to your HashBackup backup directory can
  restore any file in the backup, even if they do not have read
  permission to the file in the live filesystem.  To secure your
  backup, secure your HashBackup directory with Unix permissions or
  ACLs.

- ls: the -d and -1 (one) options are replacing the --alldirs and
  --onerev options.  This could break scripts, but it's unlikely the
  hb ls command is used with a script and seems a low risk.

- ls: ls accepted more than 1 pathname on the command line.  ls always
  requires that wildcards be quoted, to prevent Unix from expanding
  them before ls executes, and it's extremely confusing if an unquoted
  wildcard is used with ls.  Now, ls only accepts 1 pathname on the
  command line - probably the most common usage - and if an unquoted
  wildcard is used and the Unix shell expands it, ls can display an
  error message about unrecognized arguments.

- ls: ls added a * wildcard to the beginning and end of the match
  pattern on the command line.  This caused ls to match any path in
  the backup containing the match pattern.  For example, the pattern
  def would match /def, /abcdef, /defghi, and /abcdefghi.  This made
  it impossible to match a specific filename.  Now, ls does not add
  wildcards to the pattern, so pattern def matches /def, /abc/def, or
  /abc/def/ghi, but not /abc/defxxx.  If you want the old behavior,
  use the pattern '*def*' instead.  Remember that all wildcard
  patterns must be quoted with ls to prevent the Unix shell from
  expanding them; this hasn't changed.
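
  The difference between the old and new matching can be approximated
  with Python's fnmatch module.  This is only a sketch that
  reproduces the examples above; the real ls matching is not shown
  here, and treating a slash-free pattern as a match against one
  whole path component is an assumption.

```python
from fnmatch import fnmatchcase


def old_match(pattern, path):
    """Old behavior: an implicit * on both ends of the pattern."""
    return fnmatchcase(path, "*" + pattern + "*")


def new_match(pattern, path):
    """New behavior (assumed): a slash-free pattern must match one
    whole path component."""
    return any(fnmatchcase(part, pattern)
               for part in path.split("/") if part)


for path in ("/def", "/abc/def", "/abc/def/ghi", "/abc/defxxx"):
    print(path, old_match("def", path), new_match("def", path))
```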

- ls: ls is up to 27% faster if the pathname pattern matches a lot of
  files in the backup.

- ls: with a very long pathname (more than 8 pathname components), ls
  would sometimes do a sequential search.  Now if the pathname begins
  with /, ls is much less likely to do a sequential search.

- ls: -d (was --alldirs) means to show all versions of directory
  entries.  This requires -a.  An error was displayed if -a wasn't
  used, but now -a is implied.

- ls: ls behaved more like the Unix find command, in that it listed
  all files underneath directories.  This is useful when you're
  looking for a file but don't know where it is in the filesystem, but
  it was impossible to list only the first-level files of a directory.
  Now ls behaves more like the Unix ls command and only lists 1 level
  of a directory by default.  If you want to see all files under a
  directory, add /* to the pathname pattern and be sure to quote it!

- selftest: 10% faster and peak memory use reduced by half

- selftest: with -v4, selftest could sometimes pack an arc file and
  send it to destinations on every run.

- selftest: --inc could sometimes check all archives.  From internal
  testing.

- For the past 10 months there have been notices about withdrawing
  Glacier support, with a transition plan to other storage services
  such as Backblaze B2, Google Nearline, and Amazon S3 Infrequent
  Access, all priced similarly to Glacier but with simple retrieval
  policies and costs.  Glacier is removed in this release.

#1637 Sep 20, 2016 - expires Jan 15, 2017


* bumped expiration date to Jan 15th 2017
* Glacier no longer supported
* scaling improvements for small and large backups
* rm: compressing database is 33% faster
* get: show filesystem free space on write error
* get: abort restore and set exit code on write error vs hanging
* get: fix error setting flags on dangling symlinks (Linux)
* B2: update domain name to avoid redirects
* selftest: bug fix with --inc on empty backup
* S3: display correct error message when host/port are wrong


- as mentioned since January 2016, Amazon Glacier is no longer
  supported in HashBackup.  The code is still there but will be
  removed soon.  If you still have HashBackup data stored in Glacier,
  save this release so you can access your data in the future.

- one of HashBackup's caches was a fixed size that was reasonable for
  backups of several million files.  For larger backups, HB relied on
  OS filesystem buffering.  Usually that's fine, but sometimes it's
  not if the system is under memory pressure.  Now this cache scales
  with the size of the backup: smaller backups use less memory than
  before and very large backups use more.

  Example: a customer with a backup of 12M files totalling 7TB had
  never run retain in 6 years of daily backups.  The hb.db file was
  around 9GB.  On a 16GB system, the previous version of HB retain
  took 131 minutes to remove 7M files from the backup.  The new
  version takes 94 minutes - a 28% improvement - and uses about 1GB of
  memory. 30 minutes of the time in both cases is to compress the
  database because so many files were removed (this is now reduced to
  20 minutes thanks to the next change).  The next retain removed no
  files, took 5 minutes, and used 287MB of memory because the database
  had shrunk from 9GB to 4GB.

- rm: if enough data is removed from a backup, HashBackup will
  compress the hb.db file.  This is now 33% faster.

- get: if a write error occurred while restoring files, the error was:
  "Tried to write X bytes, could only write Y bytes".  Now it also
  displays how much free space the filesystem has since a full disk is
  usually the real problem.

- get: after displaying a disk write error message, the get command
  would hang.  Now it aborts the restore and sets the exit code.

- get: on OSX and BSD, symbolic links can have flags.  When restored,
  flags were being set on the target rather than the symlink.  Related
  to this, if the target did not exist (a "dangling" symlink), a
  restore error occurred: NameError: global name 'FS_SECRM_FL' is not
  defined. The restore continued but flags were not set for the
  symlink.  From internal testing.

- B2: changed the domain name for B2 access to avoid redirects

- selftest: if a backup had no data stored, for example, only
  /dev/null or empty files were backed up or rm / was executed,
  selftest -v4 --inc failed with a traceback.  Reported by email.

- S3: when the host and/or port keywords are used but the host is not
  running an S3 service on that port, HB would fail with a traceback:

    Traceback (most recent call last):
      File "/", line 100, in <module>
      File "/", line 2651, in main
      File "/", line 331, in init
      File "/", line 253, in initdest
      File "/", line 186, in startdest
      File "/", line 251, in init1
    AttributeError: 'gaierror' object has no attribute 'status'
    AttributeError: 'error' object has no attribute 'status'

  Now it fails with the correct error:
    [gaierror] [Errno 8] nodename nor servname provided, or not known
  Reported by email traceback.

#1619 Aug 8, 2016 - expires Oct 15, 2016


* database upgrade to dbrev 22
* backup: trim executables in backup directory
* rm & retain combine small arc files
* backup: small is beautiful
* backup: -Z option compression levels
* backup: number of CPU cores used
* backup: compression verification enabled
* backup: display compression efficiency
* backup and restore benchmarks
* selftest 15% faster
* ls is 9x faster
* backup: check disk free space
* B2: 5GB maxsize default fixes connection resets on large backups
* minor fixes


- to support new features, this release will do an automatic database
  upgrade to dbrev 22 when any HB command is used.  Once the database
  has been upgraded, previous versions of HB cannot be used.  Since
  this release has quite a few changes, you may want to try the new
  release with a small backup first before using it with large
  production backups.

- backup: the HashBackup executable is copied to the backup directory as
  before, but now only the latest 3 versions are kept.  Older versions
  are automatically removed during the next backup.
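
  A minimal sketch of the "keep the latest 3" rule.  It assumes the
  copies are named hb#NNNN, as in the "Copied HB program to" message
  shown elsewhere in these notes; the function name and the
  sort-by-build-number rule are illustrative assumptions, not
  HashBackup's actual code:

```python
import re

# Assumed naming scheme: executables are stored as "hb#<build number>".
EXE_PATTERN = re.compile(r"^hb#(\d+)$")


def executables_to_remove(filenames, keep=3):
    """Return executable names that fall outside the newest `keep` builds."""
    builds = []
    for name in filenames:
        match = EXE_PATTERN.match(name)
        if match:
            builds.append((int(match.group(1)), name))
    builds.sort()  # oldest build first
    surplus = max(len(builds) - keep, 0)
    return [name for _, name in builds[:surplus]]
```

  Files that do not match the assumed pattern (arc files, hb.db) are
  never selected for removal.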

- over time, a "forever incremental" backup strategy can lead to small
  arc (backup) files as older data blocks are replaced by data from
  newer backups and old arc files are packed to remove empty space.

  In this release, rm & retain combine small arc files into larger arc
  files.  This can drastically reduce the number of arc files in the
  backup directory, especially for backups with many versions.  For
  example, HashBackup's 7-year-old build server backup had 1841
  versions and around 1890 arc files.  Combining eliminated 2/3rds of
  the arc files, leaving 690, and also gave a 4-5% restore performance
  increase when restoring a very large directory.

  For now the parameters on this feature are fixed.  To combine arcs:
  - arc files must be local, OR
  - pack-remote-archives must be true for remote arc files
  - arc files must be at least pack-age-days old
  - some small arc files cannot be combined for technical reasons
  - the rm and retain commands both do arc combining

- most of the changes in this release are compression enhancements.

  Deduplication is a great tool for reducing backup sizes, especially
  for certain kinds of files like VM images, log and other
  "append-mostly" files, SQL database dumps, general "office" docs,
  and files moving around in the filesystem.  But compression plays a
  big part to save backup space.

  The goal of this release is to get better compression when possible,
  without losing performance.  In most cases, this release achieves
  better compression AND better performance.

- backup: -Z0 to -Z9 are hints to HashBackup of how much compression
  to apply.  For this release:

    -Z0         = no compression (usually slower)
    -Z1, -Z2    = fastest compression
    -Z3 to -Z9  = better compression

  In this release, -Z6 is the default and -Z7-Z9 are identical to -Z6,
  so the only reason to use -Z is to pick a lower level for slightly
  faster backups with less compression.  Of course the new version is
  compatible with
  and can restore all existing backups.  Most backups will get a speed
  boost and/or compress better without any command line changes.

- Previously, with no -p option, HB would use:

     one core on a single-core system (just the main thread)
     two cores on a dual-core system
     two cores for -Z7 or less (required -p to use more)
     all cores for -Z8 or -Z9

  Now, with no -p option, HB will use:

     one core on a single-core system (just the main thread)
     up to 4 cores on a multi-core system (requires -p to use more)

  It probably does not make sense to use -p4 or higher with the new
  compression technologies because they are much faster and can
  usually only keep 3 cores busy.  You can use the new %CPU statistic
  to see if more cores will increase performance.  For example, if
  %CPU is over 210% with -p2, try -p3.  If %CPU is 220% with -p3 (less
  than 300%), it likely means that more cores (-p4) will not
  increase performance because HB is not keeping 3 cores busy.
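
  The new default can be written as a one-line rule.  This is only a
  sketch of the behavior described above; the function name is
  hypothetical:

```python
def default_cores(system_cores):
    """Cores used when no -p option is given, per the rules above."""
    if system_cores <= 1:
        return 1                 # just the main thread
    return min(system_cores, 4)  # using more than 4 requires an explicit -p
```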

- some of the compression technologies in HashBackup are rather new.
  To ensure that all data can be restored, compression verification is
  enabled.  During backups, HB will expand compressed data and verify
  it against the original before saving it to your backup.  This is
  only for the newer compression technologies (lz4, zstd, brotli,
  lzma).  It's a temporary safety measure that will slow down backups
  a bit for now, but will be removed in the future.

- to help evaluate performance, backup displays a new statistic:
  Efficiency.  It is "MB reduced per CPU second", so higher numbers
  mean better efficiency.  This makes it very easy to compare backup
  options by showing how much extra CPU time is being spent to reduce
  backup data size.

  The efficiency rating measures overall HashBackup efficiency, so it
  can also be used to compare compression, block sizes and dedup
  options.  It's best to change one option at a time when comparing
  different efficiencies, to understand how that option affects
  efficiency.
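
  As a rough illustration of the statistic, assuming "MB reduced"
  means bytes read minus bytes stored, in units of 1,000,000 bytes
  (both are assumptions; the exact definition used by backup is not
  given here):

```python
def efficiency(bytes_checked, bytes_stored, cpu_seconds):
    """MB reduced per CPU second (higher is better)."""
    mb_reduced = (bytes_checked - bytes_stored) / 1_000_000
    return mb_reduced / cpu_seconds
```

  With these assumptions, a backup that reads 2 GB, stores 900 MB, and
  uses 50 CPU seconds has an efficiency of 22.  If a different option
  stores the same 900 MB but needs 100 CPU seconds, efficiency drops
  to 11.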

- no discussion of compression would be complete without benchmarks.
  Here are a few, comparing the old and new version of HashBackup with
  different kinds of backups.  All of these tests were run on a 2010
  Macbook Pro (Intel Core2Duo, 2.5GHz) with a solid state drive.

  Test #1: backup /Applications (212K files) with dedup, default -Z option

    Results: backup is 20% faster, uses 25% less CPU, and is 7% smaller

             old HB:  6m 8s real time
                      6m 55s CPU time
                      2.257568 GB backup size

             new HB:  4m 56s real time (20% faster)
                      5m 11s CPU time  (25% less)
                      2.089324 GB backup size  (7% smaller)

  Test #2: backup 2GB Ubuntu VM image with dedup -B4k (block size)

    Results: backups are 38-86% faster, use less CPU, and are 9-12% smaller
            (% savings of new version vs old version shown in parentheses)

                        Time        CPU           Size
    old HB, -Z1          73s        126s         955 MB
    old HB, default -Z   78s        135s         943 MB
    old HB, -Z9         339s        665s         976 MB

    new HB, -Z1          45s (38%)   70s (44%)   1.0 GB (-5%)
    new HB, default -Z   47s (40%)   76s (44%)   858 MB (9%)
    new HB vs old -Z9    47s (86%)   76s (88%)   858 MB (12%)

  Test #3: backup 2GB Ubuntu VM image with fastest options: dedup, -B1M, -Z1

    Results: backups are twice as fast and up to 3% smaller

                        Time        CPU           Size
    old HB, -B1M -Z1     49s        88s          935 MB
    new HB, -B1M -Z1     23s        36s          1.1 GB
    new HB, -B1M no -Z   26s        45s          903 MB

  Restore tests for the backups above:

  Test #1: restore /Applications (212K files), default -Z:

       old:  3m 29s real time
       new:  3m 11s real time (13% faster to restore)

  Test #2: restore 2GB VM image, default -Z:

       old:  1m 5s real time
       new:  0m 54s real time (17% faster to restore)

  Test #3: restore 2GB VM image, fastest options: -B1M -Z1

       old:  0m 33s real time
       new:  0m 26s real time (21% faster to restore)
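
  The percentages in parentheses in the Test #2 backup table can be
  reproduced with the usual "savings relative to the old value"
  formula.  The helper below is an illustration only; its name is made
  up for this example:

```python
def percent_saved(old, new):
    """Savings of `new` relative to `old`, rounded to a whole percent."""
    return round(100 * (old - new) / old)


# (old seconds, new seconds) from the Time column of the Test #2 table
TEST2_TIMES = {
    "-Z1": (73, 45),
    "default -Z": (78, 47),
    "new default vs old -Z9": (339, 47),
}

TEST2_TIME_SAVINGS = {
    label: percent_saved(old, new) for label, (old, new) in TEST2_TIMES.items()
}
```

  This gives 38, 40, and 86 for the three rows.  A negative result,
  such as the -5% in the Size column for -Z1, means the new value is
  larger than the old one.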

- selftest: the basic selftest check is 15% faster on a backup with
  13M blocks and 110M block references - lots of blocks and lots of
  dedup.  Making selftest more efficient is not about speed; it's an
  ongoing process to scale HashBackup to handle ever-larger backups.

- ls: the basic ls command to list all pathnames in a backup is 9x
  faster.

- backup: to avoid filling the backup directory disk, backup will
  abort if there is not enough space for 2 arc files of size
  arc-size-limit (hb config option).
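
  A sketch of this kind of check using Python's standard
  shutil.disk_usage().  The function and parameter names are
  illustrative, and how backup actually measures free space is not
  described here:

```python
import shutil


def enough_space_for_backup(backup_dir, arc_size_limit, arc_files=2):
    """True if backup_dir's filesystem can hold `arc_files` full-size arc files."""
    free_bytes = shutil.disk_usage(backup_dir).free
    return free_bytes >= arc_files * arc_size_limit
```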

- B2 has a 5GB upload limit and connection resets occur when exceeded.
  Large backups can generate an hb.db.0 file that is bigger than 5GB.
  To prevent connection reset errors, the default for maxsize (a
  dest.conf keyword) is now 5GB.  This forces HashBackup to segment
  very large files.
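
  To show what segmenting means in practice, here is a sketch that
  splits a file of a given size into pieces no larger than maxsize.
  It assumes 1GB = 10^9 bytes and says nothing about how HashBackup
  names or reassembles the pieces:

```python
def segment_ranges(file_size, maxsize):
    """Split file_size bytes into (offset, length) pieces of at most maxsize."""
    if maxsize <= 0:
        raise ValueError("maxsize must be positive")
    return [
        (offset, min(maxsize, file_size - offset))
        for offset in range(0, file_size, maxsize)
    ]
```

  Under these assumptions, a 12GB hb.db.0 file with maxsize 5GB would
  be sent as three pieces of 5GB, 5GB, and 2GB.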

- selftest: selftest displays a progress indicator when it is run
  interactively.  In certain circumstances with -v3 and -v4, the
  percentage would go over 100%.  From internal testing.

- stats: because of a previous bug in rm, the stats command could
  display a negative number of files.  This is fixed.  Thanks Soren!

- rm: if a problem occurred while removing a data block, rm noted the
  error and continued but it could cause a selftest error later.  From
  internal testing.

- backup: the inex.conf file is now included (encrypted) in every
  backup.  Previously, after a recover command, the inex.conf file was
  missing.  The next backup would create a new default inex, but any
  site changes would be lost.  Recover does not automatically restore
  inex.conf, but displays a get command to do it if it's appropriate.

- get: if cache-size-limit was set >= 0, one CPU could get stuck in a
  loop.  It didn't affect the restore, but it did run slower.  This
  would have been especially noticeable on 1 or 2 core systems.  From
  internal testing.

- get: another case of the race condition (multi-thread bug) fixed in
  #1481 is fixed in this release.  If an error occurred
  during the restore, the get command might do any of these:
  - give a Bad file descriptor error
  - give a size mismatch error if restoring multiple files
  - hang
  This rare bug was reported by HB's email traceback on July 18th,
  though the bug has been present since February.  Thanks HB!

- selftest: after a full -v3/4 selftest, incremental selftest (--inc)
  always did a full selftest, downloading every arc file.  From
  internal testing.

#1538 Jul 2, 2016 - expires Oct 15, 2016


* get: restore planner traceback on sparse files
* rm/retain: caused incorrect stats output
* rm/retain: error message fix


- get: if cache-size-limit is set, the restore planner could fail with
  an error when a sparse file was being restored:
    TypeError: unsupported operand type(s) for >>: 'NoneType' and 'int'

- rm/stats: when rm removed a file, it was not adjusting a counter
  correctly, causing the stats command to sometimes display a negative
  number of files in the backup.  This fixes the cause, but the stats
  display won't be fixed until the database upgrade of the compression
  release.

- rm: if -v 2 was used with retain, it caused a traceback.  Retain -v
  doesn't accept an integer after it, so the 2 was interpreted as a
  pathname, and pathnames for retain have to start with slash.  So the
  error should have been "All pathnames must start with /: 2".  But
  because of a bug in the error handler, it caused a traceback.

#1532 Jun 22, 2016 - expires Oct 15, 2016


* bump backup expiration date to October 15th
* rm/retain: prevent unused pathname selftest warning


- the compression enhancements are going well, but since it is near
  the end of a quarter, they're being released after July 15th instead
  of before.  To try them out, upgrade again after July 15th.

- rm/retain could sometimes cause an unused pathname selftest warning.
  If you are getting unused pathname warnings from selftest, run
  selftest --fix to correct them.

#1528 Jun 3, 2016 - expires Jul 15, 2016


* new backup option --no-ino


- backup: some filesystems don't have stable inode numbers: FUSE
  (sshfs, UnRaid), Samba/CIFS, and probably others.  HashBackup checks
  for inode number changes during backup, but because these
  filesystems have random inode numbers, it can cause unpredictable
  full backups.

  The new option --no-ino can be used with these filesystems to bypass
  the inode number check.  A new warning is displayed periodically
  when this option might be necessary:

    Warning: many unstable inode numbers; use --no-ino to avoid a full backup

  A negative side-effect of using --no-ino is that hard links cannot
  be detected because the definition of a hard link is two files with
  the same inode number.
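
  To see why hard link detection depends on stable inode numbers,
  here is a sketch of the usual approach: group pathnames by their
  (device, inode) pair.  This illustrates the general technique, not
  HashBackup's code; if the filesystem reports a different inode
  number on every stat, the grouping no longer works:

```python
import os
from collections import defaultdict


def hard_link_groups(paths):
    """Return lists of pathnames that refer to the same file (hard links)."""
    groups = defaultdict(list)
    for path in paths:
        st = os.lstat(path)
        if st.st_nlink > 1:  # only files with more than one name
            groups[(st.st_dev, st.st_ino)].append(path)
    return [sorted(names) for names in groups.values() if len(names) > 1]
```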

#1525 May 25, 2016 - expires Jul 15, 2016


* new command "rename"
* backup compresses wav files
* S3 improved multipart error handling


- a new command, "rename", can be used to change the pathnames of
  files and directories in a backup.  This can be useful when
  filesystem paths change for some reason, and you want pathnames in
  an existing backup to match so that the renamed files aren't all
  backed up again.  If dedup is activated, this is not a big concern.
  But sometimes a renamed path contains so much data that you want to
  avoid reading it all again just to find out it hasn't changed.
  See hb help rename or the rename web page for more details.

- backup: wav files are no longer in the list of uncompressible file
  extensions since they can sometimes be compressed 10-20%, though
  other times only a few percent, depending on the file content.

- S3: when multipart upload is enabled (it is by default), HashBackup
  will do retries on each part, then if that fails, will do retries on
  the whole file.  This makes multipart error recovery more efficient
  and fixes a problem where an S3 timeout or broken pipe (S3 closed
  the connection) would cause backup to abort.  If configured for the
  default 3 attempts, HB will now try 9 times before giving up.
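
  The "9 times" figure follows from nesting two retry loops.  A
  minimal sketch, assuming a caller-supplied send_part function that
  raises OSError on a timeout or broken pipe (all names here are
  hypothetical, and a real uploader would likely avoid re-sending
  parts that already succeeded):

```python
def _send_with_retries(send_part, part, attempts):
    """Try one part up to `attempts` times; True on success."""
    for _ in range(attempts):
        try:
            send_part(part)
            return True
        except OSError:  # includes TimeoutError and BrokenPipeError
            continue
    return False


def upload_multipart(parts, send_part, attempts=3):
    """Retry each part, then retry the whole file, `attempts` times each."""
    for _ in range(attempts):  # whole-file retries
        if all(_send_with_retries(send_part, p, attempts) for p in parts):
            return True
    return False
```

  With attempts=3, a part that always fails is tried 3 x 3 = 9 times
  before the upload gives up.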

#1521 May 17, 2016 - expires Jul 15, 2016


* database upgrade to dbrev 20
* creation date is saved and restored on Mac and BSD
* new config option hfs-compress for Mac/OSX
* get: bug fix restoring hidden files on Mac/OSX
* Backup Bouncer test results posted on web site


- this release will do an automatic database upgrade to dbrev 20 on
  any HB command, to support saving and restoring creation date.

- on BSD and OSX, files and directories have a date created timestamp
  that HB didn't save or restore.  Now it does.  On Linux, some
  filesystems have a creation timestamp but there is no standard OS
  interface to get or set it, so it is still not supported by HB.
  Note: "ctime" is not date created, but is the last time the inode
  was changed (changed time, not creation time).
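
  The difference is visible from Python's os.stat().  On OSX and BSD
  the result has an st_birthtime attribute (date created); on Linux
  that attribute is typically absent, which this sketch reports as
  None.  The function name and dictionary keys are illustrative:

```python
import os


def file_times(path):
    """Return the inode-change time and, where available, the creation time."""
    st = os.stat(path)
    return {
        "changed": st.st_ctime,                        # last inode change
        "created": getattr(st, "st_birthtime", None),  # date created, if any
    }
```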

- Mac OSX uses the HFS+ filesystem.  HFS has the ability to store
  files compressed.  For example, /Applications/Address Book contains
  many compressed files.  HashBackup has always restored compressed
  files as uncompressed.  It works fine, but previously-compressed
  files would use more disk space.  Now, a new True/False config
  option "hfs-compress" can be set True to re-compress these files on
  restore.  hfs-compress defaults to False because unfortunately,
  restoring compressed files takes 10x longer.  The only standard
  software supplied by Apple to compress a file is the external ditto
  program, and running that for each compressed file is slow.  If you
  don't care about the slowdown and want compressed files to stay
  compressed on restore, set hfs-compress to True with hb config.

- get: on some filesystems, files and directories can be marked
  invisible.  HB was saving this but not restoring it.  Now it does.
  Found by Backup Bouncer.  Thanks Roy!

- Backup Bouncer is a backup test program that checks a lot of things
  on a Mac HFS+ filesystem.  With the two changes above, HashBackup
  passes the Backup Bouncer test.  The one caveat is that BB sets the
  nodump bit on files, so HB doesn't save them (that's what the nodump
  bit means), but BB expects the backup program to save them anyway.
  A full Backup Bouncer report is posted on the website.

#1514 May 11, 2016 - expires Jul 15, 2016


* new dest.conf keyword "onfail"
* Backblaze B2: Dir / bug
* Backblaze B2: changes for self-certify
* Backblaze B2: sanitize debug log
* Backblaze B2: add time & thread number to logs


- a new dest.conf keyword, onfail, adds flexible handling of
  destination failures, requested by several customers.

  If the onfail keyword is not used on a destination and it fails,
  HB will continue the backup as it does today.  What is different is
  that the destination failure will count as a backup error and the
  exit code will be non-zero.  HB should have done this all along.

  If "onfail stop" is used, HB will stop immediately if the
  destination fails.

  With "onfail ignore", HB will behave exactly as it does today: the
  backup will continue, no backup error is reported, and the exit
  status is not affected.  This will be necessary for example if you
  back up to 2 USB drives in rotation.  If you have cache-size-limit
  set >= 0 and the cache fills up because a destination has failed,
  backup will have to stop, just like today, even with onfail ignore.
  To be exactly compatible with previous releases, add onfail ignore.
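
  The three behaviors can be summarized as a small decision table.
  This is a sketch of the rules described above, not HB code; it
  assumes that a failure with "onfail stop" also counts as a backup
  error, which the text above does not state explicitly:

```python
def destination_failure_policy(onfail=None):
    """Return (continue_backup, counts_as_error) for a failed destination."""
    if onfail is None:
        return (True, True)    # keep going, but exit code will be non-zero
    if onfail == "stop":
        return (False, True)   # stop immediately (error status assumed)
    if onfail == "ignore":
        return (True, False)   # behave like previous releases
    raise ValueError("unknown onfail value: %r" % (onfail,))
```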

- Backblaze B2: using Dir / in dest.conf causes an error:
    http status 400 (File names must not start with '/') listing file: /DESTID
  Thanks Niels!

- Backblaze B2: add Backblaze-recommended error handling for status
  403 (account limit), 429 (too many requests), Retry-After, broken
  pipe, and timeouts, so HashBackup can be self-certified.

- Backblaze B2: when the debug keyword is used, a B2 traffic log is
  written to the backup directory.  Previously the log contained B2
  credentials used to sign in to B2.  Now credentials are replaced by
  'xxx' so that the log can be shared for troubleshooting purposes.
  Only Authorization and authorizationToken are replaced; accountId,
  fileName, fileId, bucketId, and uploadUrl are not replaced since
  they are not usable without authorization credentials and would make
  debugging more difficult.
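
  A sketch of this kind of redaction.  The log line formats below (an
  HTTP-style "Authorization: value" header and a JSON-style
  "authorizationToken" field) are assumptions made for illustration;
  the actual debug log layout is not shown in these notes:

```python
import re

# JSON-style field: "authorizationToken": "secret"
_JSON_FIELD = re.compile(r'("(?:Authorization|authorizationToken)"\s*:\s*)"[^"]*"')
# Header-style line: Authorization: secret (value runs to end of line)
_HEADER = re.compile(r'\b((?:Authorization|authorizationToken)\s*:\s*).+')


def sanitize_log_line(line):
    """Replace credential values with xxx; leave all other fields alone."""
    line = _JSON_FIELD.sub(r'\1"xxx"', line)
    return _HEADER.sub(r'\1xxx', line)
```

  Fields such as accountId and bucketId pass through unchanged, which
  matches the intent of keeping the log useful for debugging.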

- Backblaze B2: log lines begin with a time and thread number to help
  trace multiple workers' activity.  It's still recommended to set
  workers to 1 for debugging.

#1504 Apr 21, 2016 - expires Jul 15, 2016


* init bug fix


- init: hb init -c /xyz/backupdir would cause a traceback if the
  directory /xyz did not exist.  Now it says parent directory doesn't
  exist and exits cleanly.  From traceback email.

#1502 Apr 18, 2016 - expires Jul 15, 2016


* backup: sparse file bug fix


- backup: backing up large sparse files did not always work correctly.
  It worked with -p0 (single-threaded) but not multi-threaded.
  Sometimes a warning message about a size mismatch would occur during
  backup.  A selftest -v5 displayed a hash mismatch error.

  This release will do a database upgrade to mark all sparse files as
  partial backups.  The next backup will re-save the sparse files.
  After a good copy is saved, the next retain will remove all of the
  partial files.

#1499 Apr 12, 2016 - expires Jul 15, 2016


* add --verify option to
* clear command bug fix


- in #1498, an ls command was added after every rclone transfer to
  verify the send worked.  However, some storage systems like Amazon
  Cloud Drive may have a slight delay after an upload before the file
  is available.  When this happens, ls returns an error, HB retries
  the send, the file is usually there and is not sent again.  But a
  bogus error was displayed.

  Now, the ls after send is only done if --verify is used on the run
  command line in dest.conf.  This is highly recommended if you have
  cache-size-limit set: if something goes wrong, perhaps because of a
  bug in HB, rclone, or the storage system, and HB believes the file
  was transferred when it was not, HB will delete the local copy.
  Since there is no remote copy either, it breaks the backup.  A
  selftest -v4 --fix will correct this, but it will also remove files
  from the backup.

- apologies to the rclone developer: the 256 exit code was confusion
  about how Python worked, not an rclone problem.  Sorry!
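
  One plausible reading of the confusion (an assumption, since the
  details are not given here): on Unix, Python's os.system() returns
  the raw wait status, in which the exit code is stored in the high
  byte, so a program exiting with code 1 shows up as 256.  A sketch of
  the decoding:

```python
import os


def decode_wait_status(status):
    """Split a traditional Unix wait status into (exit_code, signal)."""
    exit_code = (status >> 8) & 0xFF  # high byte: exit code
    signal = status & 0x7F            # low 7 bits: terminating signal
    return exit_code, signal


def run_shell(command):
    """Run a shell command and return its real exit code (Unix only)."""
    return decode_wait_status(os.system(command))[0]
```

  Python 3.9 and later also provide os.waitstatus_to_exitcode() for
  the same purpose.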

- clear: if the clear command was interrupted, files on remotes could
  sometimes not be deleted by running clear again, because HB thought
  the files were already deleted.

#1498 Apr 11, 2016 - expires Jul 15, 2016


* and dest.conf.rclone


- the rclone build was missing the script and example dest.conf.rclone

- rclone sometimes returns exit code 256 for error conditions, but
  exit codes have to be 0-255.  This version of checks and
  adjusts the exit code, and also does an ls command after every send
  to verify that the files are actually on the remote.

#1497 Apr 8, 2016 - expires Jul 15, 2016


* hb get progress stats bug


- get: for -v0/1, progress stats should not be displayed because
  filenames are not printed

#1496 Apr 6, 2016 - expires Jul 15, 2016


* Add shell destination and dest.conf.rclone


- Rclone is a free program that syncs directories to several cloud
  services:

    Google Drive
    Amazon S3
    Openstack Swift / Rackspace cloud files / Memset Memstore
    Google Cloud Storage
    Amazon Cloud Drive
    Microsoft One Drive
    Backblaze B2
    Yandex Disk

  HB has native support for many of these services, and that should be
  used when possible.  With this update, Rclone can be used to access
  services that HB doesn't support natively, such as Amazon Cloud
  Drive.  See doc/dest.conf.examples/dest.conf.rclone and
  for more details.  Thanks Ziv for the suggestion of using Rclone!

  IMPORTANT: Only Hubic and Backblaze were tested to develop the
  script.  ACD has been tested by a user, but you are "on your own"
  when using HB with Rclone.  Questions are fine but support is
  limited for now, so this method is not recommended for production
  or critical backups.

  This release doesn't have any HB program changes; it just includes
  a new dest.conf example and the script.

#1493 Apr 1, 2016 - expires Jul 15, 2016


* Unix support has been removed & HB now runs only on Windows
* selftest could hang when cache-size-limit is set >= 0


- developing HashBackup on 5 different Unix platforms has become too
  much of a challenge, so going forward, HB will be released on these
  12 Windows platforms instead: Windows XP, Windows XP Pro, Windows
  Vista, Windows 7, Windows 8, Windows 8.1, Windows 10, Windows Server
  2008, Windows Server 2008 R2, Windows Server 2012, Windows Server
  2012 R2, and Windows Server 2016.  This will greatly simplify HB
  development.  The change takes effect today, April Fool's Day.

- selftest wasn't following cache protocols correctly, so if
  cache-size-limit was set and arc files needed to be downloaded,
  selftest could hang because it thought the cache was full

#1492 Mar 29, 2016 - expires Jul 15, 2016


* dir destination displays detailed info when fetching files if
  dest.conf debug keyword is non-zero to debug a customer site issue

#1491 Mar 24, 2016 - expires Jul 15, 2016


* a debug print was accidentally left in and is now removed

#1490 Mar 23, 2016 - expires Jul 15, 2016


* backup: handle read errors better
* backup: handle file size changes better
* init: add warning for -k env and -k ask


- backup: read errors are unusual these days, but they do happen.
  Backup didn't handle them well: it printed an I/O exception
  traceback and hung.  Now it shows an I/O error message with the
  pathname, stops backing up the file, marks it as a partial backup,
  and continues the backup.  From emailed exception report.

- backup: if a file changes size during backup, backup printed a
  warning message.  Now it re-checks the file to get the current size
  and if it matches what was saved, no warning is printed.  If it
  doesn't match what was saved, the file will be saved again on the
  next backup.  From internal testing.

- init: display a warning if -k ask or -k env are used, since these
  are for the -p option, not -k.  From internal testing.

#1487 Mar 13, 2016 - expires Jul 15, 2016


* #1483: bump backup expiration date for stable upgrade
* #1484: --maxtime fix and enhancement
* #1485: selftest --inc was sometimes incomplete
* #1487: off by 1 in selftest --inc


- backup: when --maxtime is used, backup creates a restart checkpoint
  if the time limit is exceeded.  If the next backup did not use
  --maxtime, it could still use the checkpoint data (a bug), and would
  set the checkpoint.  Now, only backups with --maxtime look at or
  set the checkpoint.  This allows running a nightly cron backup
  with --maxtime, and running other ad-hoc backups without --maxtime
  will not reset the checkpoint.  From internal testing.

- backup: related to --maxtime, if a backup aborted, even if --maxtime
  was never used, the next backup might not save anything. It was
  incorrectly restarting a checkpoint that wasn't set up properly.
  Thanks to Daniele at EURAC.

- selftest: with the --inc option (incremental selftest), depending on
  the options used and the size of arc files, selftest might not always
  cycle through all of the archives before cycling back to the first.
  This bug was introduced in #1454.  From internal testing.

- selftest: because of round-off error, selftest --inc would sometimes
  require an extra run to test all archives.  From internal testing.

#1481 Feb 20, 2016 - expires Apr 15, 2016


* minor fixes


- backup: sparse file mapping is 30% faster on Linux

- backup: since #1473, a fifo backup could get an Illegal seek error

- backup: when a small fifo or raw device was backed up, the Saved:
  number displayed at the end could be negative

- get: fixed a bug in the new restore code that caused this:
    File size mismatch, should be 288, is 0: <pathname>
    Exception in thread writeq_loop: [Errno 9] Bad file descriptor
    File "/", line 85, in start_thread
    File "/", line 168, in writeq_loop
  Thanks to Juha Pyy for reporting this.

#1479 Feb 14, 2016 - expires Apr 15, 2016


* get: performance improved
* recrypt and sha256 commands removed
* upgrade: failed if a transaction was pending
* backup: removed error messages on sparse files
* backup: sparse file bug with huge files (Linux)


- get: in #1473, the get command for all restores was about 2x slower
  because of the new sparse file handling.  Now it's back to its
  previous performance levels for most situations, and restoring files
  with a small block size (VM images) is about 30% faster than ever.
  Restores using bzip2 compression (-Z8 or -Z9) are still slower than
  before #1473.  Compression improvements are coming...

- the recrypt and sha256 commands have been removed.  HB no longer
  uses the SHA256 hash.  The recrypt command was seldom used and while
  running, left the backup in a precarious "half old key, half new
  key" state.  The rekey command, used to change the backup key, was
  not removed.

- if the previous HB command aborted and left an open transaction,
  upgrade would fail with a traceback:
    Exception: Can't upgrade database with a transaction active
  Thanks to Frank Riley.

- backup: if an OS doesn't support hole-skipping, error messages are
  no longer displayed.  The file is still backed up normally, just
  without skipping the holes.  On a restore, the holes of a sparse
  file are always created.  The Sparse: line at the end of the backup
  tells how many hole bytes were skipped (none if no Sparse: line).

- backup: some Linux filesystems return a partial sparse map under
  some circumstances, causing a bad sparse file backup.  There are
  now checks to detect this, but if you made critical sparse file
  backups with #1473, it's probably best to redo them.  Seems to
  happen mostly with very large (>4GB) files.  From internal testing.

#1473 Feb 10, 2016 - expires Apr 15, 2016


* database upgrade to dbrev 18
* backup: fast hole-skipping for sparse files
* get/selftest: could stall when cache-size-limit set
* export: clears file and block hashes
* backup: fix invalid argument error on VMware vmfs filesystems
* other minor changes


- this rev will do an automatic database upgrade to dbrev 18 when any
  HB command is used.  The database isn't actually modified, other
  than being stamped dbrev 18 to prevent older versions of HB from
  accessing the database.  There are new data structures created in
  this release for sparse files that older versions wouldn't
  understand.

- backup: backup can skip "holes" (unallocated disk space) in sparse
  files rather than backing them up.  This is OS version dependent, as
  well as filesystem dependent, so it will work when it can and
  fall back to a regular backup when it can't.  Sparse files are mostly
  used for "thin provisioned" VM disk image files.

  Restoring large sparse files can be a bit slow and tedious because
  it may require many lseek() calls to re-create the original holes.
  Sparse files saved with this release will restore faster than with
  older versions of HB if the holes are large.

  Here's an example of backing up and restoring a 10GB sparse file:

    $ echo abc|dd of=sparsefile bs=1M seek=10000
    0+1 records in
    0+1 records out
    4 bytes (4 B) copied, 0.000209 seconds, 19.1 kB/s

    $ hb backup -c hb -D1g -B4k sparsefile
    HashBackup build #1463 Copyright 2009-2016 HashBackup, LLC
    Backup directory: /home/jim/hb
    Copied HB program to /home/jim/hb/hb#1463
    This is backup version: 0
    Dedup enabled, 0% of current, 0% of max

    Time: 1.0s
    Checked: 5 paths, 10485760004 bytes, 10 GB
    Saved: 5 paths, 4 bytes, 4 B
    Excluded: 0
    Sparse: 10485760000, 10 GB
    Dupbytes: 0
    Space: 64 B, 139 KB total
    No errors

    $ mv sparsefile sparsefile.bak

    $ time hb get -c hb `pwd`/sparsefile
    HashBackup build #1463 Copyright 2009-2016 HashBackup, LLC
    Backup directory: /home/jim/hb
    Most recent backup version: 0
    Restoring most recent version

    Restoring sparsefile to /home/jim
    Restored /home/jim/sparsefile to /home/jim/sparsefile
    No errors

    real    0m0.805s
    user    0m0.150s
    sys     0m0.100s

    $ ls -ls sparsefile*
    20 -rw-rw-r-- 1 jim jim 10485760004 Feb  8 18:03 sparsefile
    20 -rw-rw-r-- 1 jim jim 10485760004 Feb  8 18:03 sparsefile.bak

  Any -B block size (or none) can be used with sparse files.
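
  As an illustration of how holes can be detected, here is a minimal
  Python sketch.  It is not HashBackup's code: the data_extents name is
  hypothetical, and it assumes a platform where os.SEEK_DATA and
  os.SEEK_HOLE are available.  When hole detection isn't supported, it
  reports the whole file as data, which mirrors the fall-back to a
  regular backup described above.

```python
import errno
import os


def data_extents(path):
    """Return a list of (start, end) byte ranges that contain data.

    Holes (unallocated ranges) are skipped.  If the platform or the
    filesystem cannot report holes, the whole file is returned as one
    extent, which corresponds to a regular backup.
    """
    size = os.path.getsize(path)
    seek_data = getattr(os, "SEEK_DATA", None)
    seek_hole = getattr(os, "SEEK_HOLE", None)
    if seek_data is None or seek_hole is None:
        return [(0, size)] if size else []

    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while offset < size:
            try:
                start = os.lseek(fd, offset, seek_data)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break           # only a hole remains before EOF
                return [(0, size)]  # hole detection unsupported here
            end = os.lseek(fd, start, seek_hole)
            extents.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return extents
```

  Only the returned ranges would need to be read and saved; the gaps
  between them are the holes that a restore re-creates with lseek().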

- get/selftest: since #1454, if cache-size-limit was >= 0, get and
  selftest could stall in some situations.  Thanks to Max Norton at
  Aria Networks for reporting and helping with this.

- export: sets all block and file hashes in the exported database to
  spaces.  The purpose of export is to debug hard-to-reproduce HB
  problems.  These hashes aren't useful for debugging and the goal of
  export is to remove as much sensitive information as possible from
  the exported data while still being able to use it for bug hunting.

- backup: on Linux, VMWare vmfs filesystems could give an invalid
  argument error when trying to open files.  Thanks to Ron Joffe.

- init: display an error message if unable to set owner-only
  permissions for the backup directory rather than halting with a
  traceback.

#1460 Feb 7, 2016 - expires Apr 15, 2016

- B2: if the copy-executable config option was set to True, HB would
  try to copy the program to the B2 storage service.  But because of
  the # in the filename, it would cause a broken pipe error.

- compare: for hard links, compare -f (verify file hashes) might show
  a file's data had changed when it really had not.  From internal
  testing.

- compare: always ignore link and size changes for directories since
  that doesn't mean much.  On Linux, if you create 10K files in a
  directory then delete them all, the directory will have a large size
  even though it is still empty. From internal testing.

#1456 Jan 24, 2016 - expires Apr 15, 2016

- backup: if linux-backup-attrs is set to True with hb config (the
  default is False), an error could occur:
    NameError: global name 'pathname' is not defined

#1454 Jan 21, 2016 - expires Apr 15, 2016


* IMPORTANT: HB Glacier support ending mid-2016
* database upgrade (again!) to dbrev 17
* backup: incremental backups improved 5-10%
* backup: raw and VM image backups 10% faster
* S3: support Infrequent Access storage class
* S3: fix sporadic broken pipe errors
* selftest --inc fix with small backups
* backup: file hash changed to SHA1
* get: bug in raw and VM restores when cache-size-limit set
* Backblaze B2 improvements
* get: didn't always respect cache limits on restore
* misc minor changes


- backup: sometime mid-2016, HashBackup will no longer support Glacier
  as a destination.  There are several good alternatives:

  * Amazon S3 Infrequent Access (1.2 cents/GB/mo)
  * Google Nearline Storage (1 cent/GB/mo)
  * Backblaze B2 (.5 cents/GB/mo)

  An email has been sent to any HashBackup customers that have written
  in, explaining why Glacier support is ending and how to do the
  migration from Glacier to another service.  This information is also
  available in the doc/dest.conf.examples/dest.conf.glac file from the
  Download section of the HashBackup site (expand the tar file).

- NOTE: this rev will do an automatic database upgrade to dbrev 17
  when any HB command is used. This upgrade modifies a database index
  and doesn't take too long.

  Because there have been several recent DB changes, HB may do
  consecutive database upgrades if your version is before #1405.
  Don't worry, that was designed in from the beginning and isn't
  anything new.  It looks like this (versions command):

    Current database rev: 14
    Upgrading database to rev: 16
    Copying /testhb/hb.db to /testhb/hb.db.orig before upgrade
    Copying /testhb/dest.db to /testhb/dest.db.orig before upgrade
      Upgrade to rev 15...
    Alter database
    Rename hb.NNNN programs to hb#NNNN in backup directory
    Remove dedup table; backup will rebuild it
      Upgrade to rev 16...
    Alter database
    Database upgraded to rev 16
    Showing recent versions
      0 501(jim) 2016-01-01 13:49:58 - 2016-01-01 13:49:58 #1335 

  In the unlikely event of a problem with the upgrade, the original
  database is restored.  That looks like this:

    Unable to upgrade your database to rev 16
    Restored database
    Restored dest.db
    See traceback below or in stderr redirected file
    Traceback (most recent call last):
      File "", line 208, in <module>
      File "", line 144, in main
      File "", line 173, in opendb
      File "", line 434, in upgradedb
    Exception: some error message

- backup: incremental backups spend most of the time scanning the
  filesystem for modified files.  This scan is now 5-10% more
  efficient.  Because the scan is mostly IO-bound and very "seeky",
  this could show up as either lower CPU usage or faster backups,
  especially on large backups with a lot of history and directories.

- backup: real and simulated raw device and VM image (.vmdk) backups
  are about 10% faster

- a new S3 dest.conf keyword, "class", can be set to either standard
  or ia.  When set to ia, backup files will be stored in Amazon S3's
  Infrequent Access storage class *IF* it will be cheaper than the
  standard S3 storage class.  For small files and files that are
  expected to be stored less than 13 days, standard storage turns out
  to be cheaper than IA because:

  -- IA charges for 128K if files are smaller than 128K
  -- IA charges for 30 days if files are deleted before 30 days

  Amazon also charges 1 cent/GB extra to download from IA.

  Amazon S3 Infrequent Access (1.2 cents/GB/mo), Google Storage
  Nearline (1 cent/GB/mo), and Backblaze B2 (.5 cents/GB/mo) are all
  good options for migrating off Amazon Glacier (.7 cents/GB/mo),
  since HashBackup support for Glacier will be ending in mid-2016.

  The class keyword only works with type s3 destinations.  Other
  S3-compatible destinations like gs (Google Storage) set the storage
  class on the bucket rather than on individual files, so use their
  website to set the bucket storage class.
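
  The cost reasoning above can be sketched in Python.  This is only an
  illustration, not how HB makes the decision: the function names are
  hypothetical, the IA price is the 1.25 cents/GB/mo quoted elsewhere
  in these notes, and the standard price of 3 cents/GB/mo is an
  assumption chosen because it is consistent with the 13-day
  break-even mentioned above.

```python
GB = 1024 ** 3
IA_MIN_BYTES = 128 * 1024            # IA bills small files as 128K
IA_MIN_DAYS = 30                     # IA bills early deletes as 30 days
IA_CENTS_PER_GB_MONTH = 1.25
STANDARD_CENTS_PER_GB_MONTH = 3.0    # assumed, not stated in these notes


def storage_cost_cents(size_bytes, days, storage_class):
    """Storage cost in cents for one file kept for the given days."""
    if storage_class == "ia":
        size_bytes = max(size_bytes, IA_MIN_BYTES)
        days = max(days, IA_MIN_DAYS)
        rate = IA_CENTS_PER_GB_MONTH
    else:
        rate = STANDARD_CENTS_PER_GB_MONTH
    return size_bytes / GB * rate * days / 30.0


def cheaper_class(size_bytes, days):
    """Return 'ia' only when it is strictly cheaper than standard."""
    ia = storage_cost_cents(size_bytes, days, "ia")
    std = storage_cost_cents(size_bytes, days, "standard")
    return "ia" if ia < std else "standard"
```

  With these assumed prices, a 1GB file kept for 12 days is cheaper in
  standard storage, the same file kept for 13 days is cheaper in IA,
  and a 4K file is cheaper in standard storage no matter how long it
  is kept.  Download charges are not included.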

- S3: Ben Emmons reported a sporadic problem with broken pipe errors
  on S3 for buckets in any region other than us-east-1 (aka US).  Ben
  did some research to explain the cause: HB was sending all requests
  through the US standard region, which is not recommended.  Now the
  location keyword is used to communicate directly with the correct
  region.  If location is anything other than US or us-east-1, it must
  match the region the bucket was created in.  Thanks Ben!

- selftest: with --inc (incremental selftest), small backups were
  tested too often.  Selftest always wanted to test at least 1 arc
  file every run, so if the backup had only 1 arc file, that file
  would be tested every time, even if the goal was 1d/30d ("selftest
  runs every day, verify files every 30 days").  Now selftest will test
  the one file (or small backup) once every 30 days.

- backup: during a backup, HB splits a file into blocks, hashes each
  block with the SHA1 cryptographic hash, and uses these hashes to
  find duplicate data.  The block hash is verified during restores to
  ensure that each block's data has remained the same through all of
  its travels through HB itself and to and from remote destinations.

  As an extra safeguard, HB also stores one hash for each file backed
  up.  This ensures that after a restore, the correct blocks were
  restored, in the correct order, and provides extra reassurance that
  no hash collisions occurred during dedup.  (Hash collisions are
  nearly impossible with SHA1, but they still get a lot of attention.)

  Very early on, the SHA256 hash was chosen for the whole file hash.
  In hindsight, this was a poor decision, because it is the slowest of
  all the SHA family of hashes - even slower than SHA512 - and maxes
  out at around 100MB/s on common computers.

  Going forward, the SHA1 hash will be used for the whole file hash.
  SHA1 is about 3x faster than SHA256 and still provides the extra
  layer of error checking to ensure that restored files are identical
  to when they were saved.  All HB commands have been modified to
  handle both the old and new file hashes.
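
  A minimal Python sketch of the two-level hashing described above.
  It is an illustration only: the fixed 4K block size, the in-memory
  dict used as a block store, and the backup/restore function names
  are all assumptions, not HB's actual data structures.

```python
import hashlib

BLOCK_SIZE = 4096  # illustrative fixed block size


def backup(data, store):
    """Split data into blocks, dedup them by SHA1, return a manifest."""
    block_hashes = []
    for i in range(0, len(data), BLOCK_SIZE):
        block = data[i:i + BLOCK_SIZE]
        digest = hashlib.sha1(block).hexdigest()
        store.setdefault(digest, block)   # duplicate blocks stored once
        block_hashes.append(digest)
    return {"blocks": block_hashes,
            "file_hash": hashlib.sha1(data).hexdigest()}


def restore(manifest, store):
    """Rebuild the file, verifying each block hash and the file hash."""
    out = bytearray()
    for digest in manifest["blocks"]:
        block = store[digest]
        if hashlib.sha1(block).hexdigest() != digest:
            raise ValueError("block hash mismatch")
        out += block
    if hashlib.sha1(bytes(out)).hexdigest() != manifest["file_hash"]:
        raise ValueError("file hash mismatch")
    return bytes(out)
```

  The block hashes catch damage to an individual block, while the
  whole-file hash catches the case where every block is intact but the
  wrong blocks, or the right blocks in the wrong order, were restored.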

- get: if cache-size-limit is set >= 0, it means some backup data is
  not stored locally.  So get (and selftest -v5) create restore plans
  to figure out the best way to retrieve arc files to use the least
  amount of disk space and not download any arc file more than once.
  But if a raw device was restored, the plan would say "1 item, 0
  bytes", and then during the restore, would fetch the arc files one
  by one as needed, and not delete any until after the restore
  finished.  In other words, there was no plan.  This has been fixed.
  A typical plan will look like this:

    Planning cache...done
      Archives: 4
      Blocks: 46
      Download size: 44 MB
      Peak cache size: 11 MB
      Disk free space: 75 GB, 30%
      Items: 1
      Data bytes: 52 MB

  The important part is Peak cache size, since it tells how much disk
  space will be needed in the backup directory to do the restore.

- Backblaze B2 improvements:
  * HB did not handle bucket names with upper case letters
  * HB would sometimes get http 401 status errors because of a B2 bug
  * If HB cannot create a bucket, it displays the reason why, from B2
  * In general, for most errors HB displays a better message from B2
  * If a bucket ID is mistakenly used in dest.conf, it works now
  * Documentation has been improved a bit
  Thanks Thorsten for pointing these out!

- get: when cache-size-limit is >= 0, get was not respecting the cache
  size limit.  A restore of a 500GB VM image said it would download
  230GB and need 14GB of space in the backup directory.  But if the
  data could be loaded from the destination faster than the restore,
  the backup directory would go over the 14GB limit and cause a disk
  full error.

- backup: in certain circumstances, a file could be skipped instead of
  backed up, but the next backup would catch it

- get: a null pathname on the command line ('') caused a traceback

- mount: if the FUSE library is not installed, mount raised an
  EnvironmentError, which is not so user friendly.  Now, mount
  displays 5-10 lines of information about what FUSE is, where it's
  located, and tips on how it can be installed for the mount command

- selftest: added a new test for -v2 and above

#1405 Dec 20, 2015 - expires Apr 15, 2016


* expiration date bumped to April 15th
* database upgrade to dbrev 15
* backup: dedup uses less memory & is faster
* backup: fast restarts for --maxtime
* backup: simulated backups 25% faster
* rm: new pack config options
* using pack-percent-free to control download costs
* backup: default arc-size-limit is 100MB vs 1GB
* selftest: add size limit to --inc
* add -v2 to compare command
* HB reports exceptions via email
* get: raw device restore bug fixes
* misc minor changes


- NOTE: this rev will do an automatic database upgrade to dbrev 15
  when any HB command is used. This upgrade:
  - modifies a database index
  - renames backup directory programs hb.NNNN to hb#NNNN
  - removes "hb" (very old program version) from backup directory
  - deletes the dedup table; next backup will re-create it

- backup: uses half as much RAM to dedup the same number of blocks.
  The first backup with this version will rebuild the dedup table, so
  may take longer.

- backup: single-core variable block dedup is 5% faster (-D -p0, no -B).
  Initial multi-core disk image backups (.vmdk, raw, etc) are 12% faster.

- backup: --maxtime and --maxwait were added in #862 to control the
  backup time for huge backups.  See the #862 changelog for details.
  The initial backup for a multi-TB filesystem with millions of files
  could require several days.  Using --maxtime 6h lets HB backup for 6
  hours every night until all data is saved.  Then incrementals tend
  to be much faster and have no trouble finishing within the backup

  The enhancement in this version is that restarts after the time
  limit are much faster, allowing HB to completely skip huge portions
  of the filesystem already backed up.  This only occurs when
  --maxtime is used, although if you want this shortcut restart
  behavior all the time, use --maxtime 1y.

- backup: simulated backups are up to 25% faster: 155 seconds now vs
  210 seconds with #1371 on a 4.5GB VM image (.vmdk files)

- rm: two new config options control archive packing:

  pack-age-days: specifies the minimum archive age in days.

  Many storage services are adding delete penalties for files removed
  before N days.  For Google Nearline and Amazon Infrequent Access
  storage, it is 30 days.  HB needs to be aware of this because
  otherwise, it could pack an archive several times in a month if
  enough data was removed, which would cost more than leaving the data
  alone.  The new HB default for this is 30 days.  To disable this
  option and preserve existing behavior, set it to 0.

  pack-bytes-free: specifies the minimum # of free bytes.

  This option prevents repeated packing of small files.  For example,
  if an arc file is 50K and 25K is deleted, it would be packed if
  pack-percent-free is 50 (the default).  But, it's not worth the
  trouble for such a small savings. The default for this option is
  1MB.  To preserve existing behavior, set it to 4K, the minimum.
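
  How the three packing options could combine is sketched below in
  Python.  The should_pack function is hypothetical and the exact
  comparisons HB uses are an assumption; only the option names and
  their defaults come from these notes.

```python
def should_pack(arc_bytes, free_bytes, age_days,
                pack_percent_free=50,
                pack_age_days=30,
                pack_bytes_free=1024 * 1024):
    """Decide whether an archive is worth packing.

    arc_bytes  -- current size of the archive
    free_bytes -- bytes in the archive that belong to deleted data
    age_days   -- age of the archive in days
    """
    if arc_bytes <= 0:
        return False
    if age_days < pack_age_days:
        return False    # too young: packing risks an early-delete fee
    if free_bytes < pack_bytes_free:
        return False    # too little space to reclaim
    # Integer form of: free_bytes / arc_bytes >= pack_percent_free / 100
    return free_bytes * 100 >= pack_percent_free * arc_bytes
```

  With the defaults, the 50K archive with 25K deleted from the example
  above is left alone.  Passing pack_bytes_free=4096 and
  pack_age_days=0 restores the earlier behavior, where the same
  archive would be packed.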

- IMPORTANT NOTE for the pack-percent-free config option: most storage
  services are charging high download rates compared to storage rates.
  For example, it costs 8x more to download a file from S3 Infrequent
  Access (10 cents/GB) than to store the file for a month (1.25
  cents/GB).  Stated another way, it costs the same to store a file
  for 8 months or download it just once.  If pack-remote-archives is
  set to True (the default is False), and cache-size-limit is >= 0
  (not all archives are stored locally), consider bumping
  pack-percent-free much higher than 50 to limit packing downloads.

  If cache-size-limit is -1, meaning a local copy is kept of all
  archives files, packing does not require a download so this config
  option is not as important for cost control.
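
  To see why a higher pack-percent-free limits costs, here is a small
  Python calculation using the S3 Infrequent Access prices quoted
  above.  It is a rough model with a hypothetical function name, and
  it assumes the whole archive is downloaded to pack it, uploads are
  free, and an archive is packed as soon as it reaches
  pack-percent-free.

```python
DOWNLOAD_CENTS_PER_GB = 10.0
STORAGE_CENTS_PER_GB_MONTH = 1.25


def months_to_recoup(pack_percent_free):
    """Months of storage savings needed to pay for one packing download."""
    freed_fraction = pack_percent_free / 100.0
    monthly_saving = STORAGE_CENTS_PER_GB_MONTH * freed_fraction
    return DOWNLOAD_CENTS_PER_GB / monthly_saving
```

  Under these assumptions, packing at 50% free takes 16 months of
  storage savings to pay for the download, while packing at 90% free
  takes about 9 months.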

- backup: the default arc-size-limit for new backups is now 100MB
  instead of 1GB.  If you want to change the arc size for existing
  backups, use: hb config -c backupdir arc-size-limit 100mb.  This
  will increase the number of arc files used, but there are very
  large sites running in this configuration with 40-50K arc files,
  without problems.  Smaller archives make it more likely that HB can
  manage remote storage with delete commands instead of downloading,
  packing, and uploading.

- selftest: the --inc freq/goal option is used to do incremental
  selftests, where a portion of the backup is checked every day.  The
  freq and goal specify the percentage of the backup space to be
  checked (freq/goal).

  Many storage services have a free allowance for downloads, for
  example, Backblaze B2 allows 1GB/day, and charges after that.  To
  ensure incremental selftest doesn't go over the free allowance, a
  new limit option can be added. For example, --inc 1d/30d,500m means:

  * selftest is run every day by cron (or manually, etc)
  * check the whole backup every 30 days
  * the percentage is therefore 1/30, or 3.3% each run
  * but, don't check more than 500MB of archive space in one run

  Selftest may still go over the limit if a single arc file is bigger
  than the limit.  When the limit is triggered, selftest will display
  a message.  In this case, your goal cannot be met, so a selftest of
  the complete backup will take longer than your specified goal.
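
  The per-run arithmetic can be sketched in Python.  The inc_budget
  function is hypothetical and takes already-parsed values; treating
  500m as 500 x 1024 x 1024 bytes is an assumption.

```python
def inc_budget(freq_days, goal_days, backup_bytes, limit_bytes=None):
    """Bytes of archive space to check in one incremental selftest run.

    For --inc 1d/30d,500m: freq_days=1, goal_days=30 and
    limit_bytes=500 * 1024 ** 2.
    """
    budget = backup_bytes * freq_days // goal_days
    if limit_bytes is not None:
        budget = min(budget, limit_bytes)
    return budget
```

  For a 30GB backup, 1d/30d asks for 1GB per run, so a 500m limit cuts
  each run to 500MB and the full pass takes longer than 30 days.  For
  a 3GB backup, the 1/30 share is already under the limit, so the
  limit has no effect.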

- the compare command compares a backup with a live filesystem and
  indicates new, changed, and deleted files.  For changed files, the
  compare command shows which attributes changed.  The new -v2 option
  shows the backup and filesystem values for each changed attribute.

- HB reports all unhandled exceptions (tracebacks) via email.  This
  email includes the HB version number, command line, traceback, and a
  short system description (Linux, Mac OSX, or BSD)

- get: raw device restore fixed

- get: show progress for large files only if displaying output

- selftest: before, would go one arc file over the limit with --inc
  instead of staying under the limit.  For GB-sized arc files, it
  makes a difference.

- backup: a simulated backup could end with a traceback:
    AttributeError: Arc instance has no attribute 'iobuf'

- selftest: an incorrect error was displayed:
    Error: for logid 1786855, hlogid 1786854 is invalid: /bin/ln [r836]
  This was a bug in selftest, not a problem with the database.

- backup: OSX system files were sometimes incorrectly tagged sparse

- ls: add a note for sparse and raw (device) files with the -l option

- backup: gave a warning about slow backups & restores for -Z5 and
  higher, but should have only been for -Z8 and higher.

#1371 Nov 28, 2015 - expires Jan 15, 2016


* rate keyword is supported on Backblaze B2 and WebDAV
* new WebDAV keyword: subdir
* better WebDAV documentation


- the rate keyword in dest.conf is now supported for Backblaze B2 and
  WebDAV destinations.  See doc/dest.conf.examples/README for more
  details about the rate keyword.  Basically it is an upload rate
  limit in bytes per second and allows suffixes like 512k to mean
  512K (512 x 1024 = 524288) bytes per second.
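
  A short Python sketch of how such a rate value could be parsed and
  applied.  Both function names are hypothetical, the m and g suffixes
  are assumptions (only k is shown above), and the delay calculation
  is one simple way to hold an average rate, not necessarily what HB
  does.

```python
def parse_rate(text):
    """Convert a rate such as '512k' to bytes per second."""
    suffixes = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}
    text = text.strip().lower()
    if text[-1:] in suffixes:
        return int(text[:-1]) * suffixes[text[-1]]
    return int(text)


def throttle_delay(bytes_sent, elapsed_seconds, rate):
    """Seconds to sleep so the average upload rate stays at or below rate."""
    earliest = bytes_sent / float(rate)
    return max(0.0, earliest - elapsed_seconds)
```

  Here parse_rate("512k") gives 524288, matching the arithmetic above.
  An uploader would call throttle_delay() after each chunk and sleep
  for the returned number of seconds.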

- "subdir" is a new keyword added to WebDAV destinations.  This allows
  storing multiple backups in the same WebDAV area.  It was possible
  to do this before if the subdirectories were created before using
  HB.  Using the subdir keyword, HB will create these directories.

- the example WebDAV file, doc/dest.conf.examples/dest.conf.dav, has
  more explanations about how to use HB with WebDAV.  WebDAV servers
  are often configured differently and can be picky about their setup.

#1365 Nov 26, 2015 - expires Jan 15, 2016


* backup bug fix: hang


- backup: a bug was introduced in the Nov 22 release, #1363, that
  caused backup to hang after creating only a few arc files.  This bug
  was not related to a particular destination type.

#1364 Nov 25, 2015 - expires Jan 15, 2016


* Glacier bug fix


- in #1363, a change was made that caused Glacier destinations to fail
  with this traceback:

  dest glac: [AttributeError] 'dict' object has no attribute 'type'

  Traceback (most recent call last):
    File "/", line 104, in <module>
    File "/", line 2047, in main
    File "/", line 302, in init
    File "/", line 248, in initdest
    File "/", line 177, in startdest
    File "/", line 229, in __init__
    File "/", line 123, in __init__
    File "/", line 90, in baseinit

#1363 Nov 22, 2015 - expires Jan 15, 2016


* add support for Backblaze B2
* enable SSL certificate verification for secure WebDAV
* ls and get permissions check bypassed for key.conf owner
* bug fixes


- Backblaze B2 is supported.  B2 is still in the invite-only beta
  stage, so please observe Backblaze beta guidelines.  See
  doc/dest.conf.examples/dest.conf.b2 for B2 dest.conf keywords.

- if the "secure" keyword is used with a WebDAV destination, the SSL
  certificate received from the server is verified

- the HB ls and get commands will not check file permissions stored in
  the backup if the user running HB is the owner of key.conf.

  For example, if root does nightly backups and a user is given
  sufficient OS permissions to access the backup files, hb get checks
  permissions saved with backed-up files to see if the user has read
  access to the files being restored.  If not, hb get issues a "No
  read permission" error and will not do the restore.

  But in a disaster recovery situation with shared hosting, userids may
  change, the userid used for backup and/or owning the HB backup files
  may not be the same as the userid doing the restores, and the person
  running HB may not have any control over userids in a shared hosting
  situation.  Now, if the userid running hb is the owner of key.conf,
  get and ls will proceed.
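
  The ownership rule can be sketched in Python as follows.  Both
  function names are hypothetical and this is not HB's code; it
  assumes a Unix system and that key.conf sits in the backup
  directory.

```python
import os


def owns_key_conf(backup_dir):
    """True if the user running this process owns backup_dir/key.conf."""
    key_path = os.path.join(backup_dir, "key.conf")
    return os.stat(key_path).st_uid == os.geteuid()


def may_restore(backup_dir, has_read_permission):
    """The key.conf owner bypasses the saved-permission check."""
    return owns_key_conf(backup_dir) or has_read_permission
```

  Here has_read_permission stands for the result of the normal check
  against permissions saved in the backup.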

- dest clear: if there were files flagged for removal, dest clear
  could complain about them being the only copy and refuse to delete
  them.

#1340 Stable - Oct 22, 2015 - expires Jan 15, 2016


* backup skips empty directories on the command line
* bug fixes


- backup: backup now skips empty directories listed on the command
  line.  This is useful when backing up a mounted file system, eg NFS,
  that isn't currently mounted.  Previously backup would mark all
  files as deleted if the file system wasn't mounted.

- retain: -m used by itself caused a traceback.  Now, a message is
  displayed that the -x option is required if -s and -t are omitted.

- rm: if the highest backup version is removed with -rN, it could
  create a confusing situation for files that were previously backed
  up, deleted in rev N, then backed up again, for example:

    $ hb ls -c hb -a
    HashBackup build #1335 Copyright 2009-2015 HashBackup, LLC
    Backup directory: xxx
    Most recent backup version: 1
    Showing all versions
      0 /  (parent, partial)
      0 /Users  (parent, partial)
      0 /Users/jon
      0 /Users/jon/x  (deleted in version 1)
      1 /Users/jon/x

  Now when the highest version is removed, any files marked deleted in
  that version will be undeleted, as if the deleted backup never
  happened.

- selftest: two new tests for the confusing rm situation above

#1335 Stable - Oct 18, 2015 - expires Jan 15, 2016


* release schedule change
* expiration date bumped to Jan 15th
* bug fixes


- to reflect its more stable status and to avoid a release update
  during December holidays, HashBackup's release schedule is changing.
  The new expiration schedule for the backup feature is:

     January 15th -- April 15th -- July 15th -- October 15th

- a user reported this traceback.  The cause is now fixed.  If you
  have this problem with a backup, delete the hb.sig file.

    Traceback (most recent call last):
      File "/", line 104, in <module>
      File "/", line 2191, in main
      File "/", line 371, in put
      File "/", line 833, in genincdb
    ZeroDivisionError: integer division or modulo by zero

- in unusual circumstances, HB may create a partial backup of a file,
  for example, when there is an I/O error backing it up, or when
  selftest has to truncate a file because it detects an error.  These
  partial files were not handled correctly by retain, sometimes
  causing good files to be removed while the partial file was
  retained.  Now, retain will delete partially backed up files if
  there is a later, complete backup of the file.

#1332 - Aug 11, 2015 - beta expires Dec 15, 2015


* expiration date bumped to Dec 15th

NOTE: Release #1330 has proven to be very stable over the summer.

The next release of HashBackup will have many changes.  Rather than
releasing these changes now, forcing everyone to accept all of them
before #1330 expires, the next major release is being held back until
after Sep 15th.

If you prefer to have a very stable version, update to #1332 before
Sep 15th.  The only change is a bump in the expiration date.  Update
after Sep 15th to get the latest features of the new release.

#1330 - May 19, 2015 - beta expires Sep 15, 2015


* expiration date bumped to Sep 15th
* get: fix progress percentage
* eof when answering yes/no questions
* cache-size-limit honored when syncing new destination
* fix unhelpful dest.conf error message
* imap destination supports timeouts


- get: during restore of a compressed file, the progress display did
  not go to 100%

- if an EOF occurs when hb asks a yes/no question, hb now aborts
  instead of asking the question again.  Repeated asking was a problem
  when no keyboard is available, eg, hb running in the background.
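
  A minimal Python sketch of this behavior.  The ask_yes_no function
  and its read parameter are hypothetical, and the exact prompt and
  message text are assumptions.

```python
import sys


def ask_yes_no(question, read=input):
    """Ask until the answer is yes or no; abort if input hits EOF."""
    while True:
        try:
            answer = read(question + " (yes/no): ")
        except EOFError:
            # No keyboard (for example, running in the background):
            # stop instead of asking the question again forever.
            sys.exit("EOF while waiting for an answer, aborting")
        answer = answer.strip().lower()
        if answer in ("y", "yes"):
            return True
        if answer in ("n", "no"):
            return False
```

  An unrecognized answer still repeats the question; only EOF ends the
  loop early.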

- cache-size-limit (config keyword) is honored when syncing files to a
  new destination, to avoid filling the local backup directory.

- fixed unhelpful error message if an integer is expected for a
  keyword in dest.conf and something else is used

- imap destinations previously did not support the timeout keyword,
  and could hang for long periods of time.  Now, the default timeout
  is 5 minutes and can be changed with the timeout keyword in
  dest.conf.  If a timeout occurs, hb will use its normal retry loop
  to recover.

#1321 - Apr 20, 2015 - beta expires Jun 15, 2015


* selftest: logid traceback fix
* backup/selftest: hash datatype fix for #1316
* selftest: partial selftest fix


- selftest: a recent change verifies that hashes have the correct
  datatype in the hb database, but also could cause this error:

    Traceback (most recent call last):
      File "/", line 178, in <module>
      File "/", line 1166, in main
    UnboundLocalError: local variable 'logid' referenced before assignment

- backup: in #1316, variable-block dedup backups (-D without -B)
  created the wrong datatype for hashes, causing the previous error.
  This is now corrected, but you should run selftest -v2 --fix to
  check your backup.  It may display errors like this:

    Checking blocks I
    Error: block 256, hash is type str
    Corrected hash type
    Checked  256 blocks I

  Selftest -v2 (without --fix) may also display errors like this:

    Error: dedup blockid 3 hash mismatch (not critical)

  Running selftest -v2 --fix corrects these.

- selftest: a recent change enabled selftest to pack arc files, but a
  partial selftest could cause this traceback:

    Traceback (most recent call last):
      File "/", line 178, in <module>
      File "/", line 1198, in main
    ValueError: need more than 2 values to unpack

#1316 - Apr 12, 2015 - beta expires Jun 15, 2015


* April is HB test month
* selftest: -v4 --fix is now safe to use
* selftest: -v4 packs arc files
* selftest: suppress arc size 0 error
* selftest: don't require a 2nd -v2 pass
* backup: fix simulated backup error on empty backup
* get: fix hash mismatches on sparse files
* increase database timeout from 15 to 300 seconds
* get: fix "Unhandled exception" random error
* backup: large dedup table resize could fail


- for all of April, the focus will be on testing, test scripts, and
  bug fixes.  After April, two days per week will be devoted to
  testing to ensure HB's quality remains high.

- selftest: -v4 --fix could cause arc size errors if it was
  interrupted.  This is fixed and -v4 --fix is okay to use now.  If
  anyone had this problem, selftest --fix will now correct it.

- selftest: with -v4, if archives have to be downloaded to be checked,
  they will also be packed and uploaded if their free space is greater
  than pack-percent-free.  This happens even if pack-remote-archives
  is False, because during selftest, the arc files are already local.
  To completely eliminate packing (probably not a good idea), set
  pack-percent-free to 100.  Then empty arc files are deleted but
  none are packed.

- selftest: if destinations were not in sync, selftest might display
  this message that isn't actually an error:

    Error: arc.637.0 wrong size on atmos destination: should be 134223056, is 0

- selftest: if -v4 --fix encountered a bad block, deleted it, and the
  file containing it was the last file in an archive, selftest -v2
  would need to be run to correct this residual error:

    Error: arc.0.0 is not referenced
    Marked for deletion

  Now this is handled with one selftest.

- backup: with a simulated backup, if no backup data was generated
  because a zero-length file was saved or no files were modified, an
  error could occur after the backup when the empty arc file was

- get: sometimes failed with these errors, often with .dmg or other
  disk image / large files with repeated blocks.  It worked with -p0.

    Block hash mismatch, blockid 1727226: pathname
    File size mismatch, should be 17896386, is 14750658: pathname

- the database lock problem seems to be solved by increasing the
  database timeout.  The problem was random, but occurred more often
  on systems with:

  - a large buffer cache (lots of RAM)
  - disk I/O restrictions (VMs and shared hosts)
  - systems with lots of write activity
  - HB running under nice and/or ionice
  - XFS: it flushes every 30 seconds while ext4 flushes every 5, so
    more "dirty" buffers accumulate in the buffer cache with XFS

  These all contribute to long sync times.  HB databases use
  synchronous I/O for fault tolerance, and database commits require
  flushing the system buffer cache.  On some systems, this flush could
  take longer than 15 seconds - the previous database timeout.

- get: sometimes get would end with an error message even though the
  file restore was successful:

    Unhandled exception in thread started by <bound method dir.loop of <dirdest.dir instance at 0x100f26fc8>>
    Traceback (most recent call last):
      File "/", line 331, in loop

- backup: if the dedup table is >= 2GB, the next resize to 4+GB
  could fail.  On Linux, the error was:
    Error writing dedup table: wrote 2147479552 of 4147483640 bytes
  On OSX, the error was:
    OSError: [Errno 22] Invalid argument

#1303 - Mar 18, 2015 - beta expires Jun 15, 2015


* misc fixes


- stats: number of paths deleted was wrong

- backup: using similar directory names on the backup command line
  could cause errors:

    $ hb backup -c hb mnt/dir-2/db mnt/dir/backups
    HashBackup build #1298 Copyright 2009-2015 HashBackup, LLC
    Backup directory: /Users/jim/hb
    This is backup version: 0
    Dedup is not enabled
    Unable to stat file: No such file or directory: /Users/jim/mnt/dir/
    Unable to backup: No such file or directory: /Users/jim/mnt/dir/backups

#1298 - Mar 16, 2015 - beta expires Jun 15, 2015


* selftest: arc file healing bug
* clear deletes the log directory too


- selftest: arc file healing is where an arc file has a bad block and
  is reconstructed from good blocks found in another copy of the arc
  file.  This release corrects a bug found in internal testing.

#1293 - Mar 14, 2015 - beta expires Jun 15, 2015


* Python update
* S3/Glacier fixes
* default workers changed to 2 (was 3)
* clear preserves config by default, gets --reset option


- this version of HB is built with a later version of Python

- in #1256, a default timeout of 30 seconds was added for Amazon S3,
  because of long hangs at a (Linux) customer site.  But this change
  caused failures on Mac OSX:

    dest s3: error #1 of 3 in send arc.1.0: [error] [Errno 35] Resource
    temporarily unavailable

  The new version of Python fixes this problem.

- S3 and Glacier should work better in this release:
  - the Python upgrade fixes some communication problems
  - retries do not cause such long delays
  - both S3 and Glacier have 30-second I/O timeouts

- Glacier: in #1288, a software library HB uses could cause a
  traceback with large files:

    dest glacier: error #1 of 3 in send arc.1001.5:
    [UnicodeDecodeError] 'ascii' codec can't decode byte 0xd0 in
    position 21: ordinal not in range(128)

- the default number of workers for destinations is 2 instead of 3

- previously, the clear command reset all config keywords.  Now it
  preserves the config settings unless --reset is used.

#1288 - Mar 10, 2015 - beta expires Jun 15, 2015


* retain can operate on specific files & directories
* rm/retain: can pack simulated backups
* selftest: -v2 can be used with simulated backups
* fix hash should not be null selftest errors
* misc fixes


- retain previously operated on the entire backup.  Now, pathnames can
  be used with retain so that it operates on just these files or
  directories.  This is useful when you want to retain fewer copies of
  certain parts of the backup.

  For example, if you backup /home with several users, each user could
  have a different retention policy by running retain several times
  with /home/user1, then /home/user2, etc.

  As another example, you may have a log directory where you only want
  to keep the last 30 days, even though for the rest of the backup,
  you want to keep 1 year of backups.

  In addition to the specific retains, it's a good idea to continue
  running retain on the entire backup, without a pathname, unless you
  are sure that the specific retains will cover all parts of the
  backup.

- rm/retain: a simulated backup (config keyword simulated-backup =
  True) is used to model a backup with live data, to learn by
  experiment the best backup method and HB options to use, without
  requiring a lot of disk space.  One key option is packing archive
  files.  Packing always happens with local archive files when the
  free space exceeds the pack-percent-free config keyword.  It happens
  with remote archives only if pack-remote-archives is True.

  If pack-remote-archives was set on a simulated backup, it caused:
    Error opening archive: Archive does not exist: /hbdev/hb/arc.0.0

  Since it is important to be able to model remote archive packing, to
  see how it affects backup space utilization, this is now supported
  with simulated backups.  When the config keyword
  pack-remote-archives is True (default is False), rm and retain will
  pack the simulated arc files and hb stats will show this.

- selftest: -v2 can now be used with simulated backups.  Any higher
  level will cause an error and stop.

- rm/retain: when hard links were removed, it sometimes caused a
  selftest error "hash should not be null" on other files that were
  hard-linked to the removed file.  This release fixes that bug.  To
  correct existing "hash should not be null" errors in your backup,
  run selftest -v2 --fix until there are no errors.

- selftest: when a hard-linked file is truncated, mark all related
  hard links as partial so they are saved on the next backup

#1284 - Mar 4, 2015 - beta expires Jun 15, 2015


* improve speed of fifo backups on Linux
* Glacier bug fix


- backup: piped backups (named pipes / fifos) are faster on Linux

- Glacier: a recent upgrade to a new version of the Amazon library
  supporting Glacier could cause a traceback:
    IOError: [Errno 2] No such file or directory: 'endpoints.json'

#1275 - Mar 2, 2015 - beta expires Jun 15, 2015


* selftest: correct "hash should not be null" errors
* log errors: display _RUN logs
* selftest: don't print progress if sending output to a file
* backup: backup from named pipes (fifo) on the command line
* Glacier: always tried last for retrieval


- selftest: there is a bug in HB's hard link handling that can cause
  "hash should not be null" errors.  The cause of the problem has not
  yet been fixed, but now selftest -v2 --fix will truncate the file to
  eliminate the errors.  Then backup will save the file again.  Still
  working on finding & fixing the hard link bug.

- log errors: logs ending in _RUN are now included in the error
  summary.  Sometimes if HB can't get started properly, empty _RUN
  logs are created.  Lots of these are a sign that "something bad" is
  happening.

- selftest: in a recent change, selftest prints progress percentages
  in some long sections.  Don't print these if output is being sent
  to a file.

- backup: if a named pipe, aka fifo, is listed on the command line,
  the fifo is opened and all data is saved.  This can be used for
  example with a database dumper, to back up a database dump without
  having to create a huge dump file first.  Here is a simple example:

    mkfifo fi
    cat somefile > fi & hb backup -c backupdir fi

  Instead of using cat, any program can be used, and its output will
  be backed up.  It is not possible to put hb in a simple pipeline
  without a fifo, because it would not have a filename to associate
  with the data saved, so this does not work:
    cat somefile | hb backup -c backupdir      Nope!

  Backing up from named pipes is about half the speed of regular
  files, because pipes usually have small buffers.

  IMPORTANT: it is easy to have multiple processes writing to a fifo
  at the same time by mistake (I did it during testing.)  When that
  happens, the fifo is getting data from two places and the backup is
  a mixture of the two.  Or, said another way, it is trash.
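  The one-writer, one-reader arrangement can be tried without HB.
  This is a minimal Python sketch, assuming a Unix system; the reader
  below is only a stand-in for hb backup, and stream_through_fifo is a
  hypothetical name, not part of HB.

```python
import os
import tempfile
import threading


def stream_through_fifo(payload: bytes) -> bytes:
    """Send payload through a named pipe with exactly one writer and one reader."""
    with tempfile.TemporaryDirectory() as tmp:
        fifo = os.path.join(tmp, "fi")
        os.mkfifo(fifo)  # same role as `mkfifo fi`

        def writer():
            # Stand-in for `cat somefile > fi`.
            # Opening for write blocks until a reader opens the fifo.
            with open(fifo, "wb") as w:
                w.write(payload)

        t = threading.Thread(target=writer)
        t.start()
        # Stand-in for the reading side (hb backup in the real command).
        with open(fifo, "rb") as r:
            data = r.read()  # returns at EOF, when the single writer closes
        t.join()
        return data
```

  With a single writer, the reader gets the bytes back unchanged.  A
  second writer on the same fifo would interleave its data with the
  first, which is the mixing problem described above.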

- Glacier: when downloading a file, HB tried destinations in the order
  they are listed in dest.conf.  Now, Glacier is always a last resort
  even if it is listed first in dest.conf, to avoid 4-hour restore
  delays and potentially high retrieval charges.
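  The new ordering can be pictured as a stable sort that moves Glacier
  to the end.  This is only an illustrative Python sketch: the
  dictionaries and the retrieval_order name are hypothetical and do not
  reflect HB's internal code or the dest.conf syntax.

```python
def retrieval_order(dests):
    """Return destinations in listed order, with Glacier ones moved last."""
    # sorted() is stable, so non-Glacier destinations keep their listed order.
    return sorted(dests, key=lambda d: d["type"] == "glacier")


dests = [
    {"name": "glac", "type": "glacier"},  # listed first
    {"name": "s3", "type": "s3"},
    {"name": "nas", "type": "dir"},
]
print([d["name"] for d in retrieval_order(dests)])  # ['s3', 'nas', 'glac']
```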

#1269 - Feb 27, 2015 - beta expires Jun 15, 2015


* selftest bug fix
* s3 hang fix
* more fun with Database is locked


- selftest: with non-incremental selftest, this traceback occurred:
    Traceback (most recent call last):
      File "/", line 178, in <module>
      File "/", line 1612, in main
    UnboundLocalError: local variable 'maxvernum' referenced before assignment

- s3 destination was causing hb to hang at the end because of a close
  added to help with the database lock problem

- yes, the dreaded (and random-ish) "Database is locked" error is
  still lurking.  Since it is not easily reproduced except at customer
  sites, here is another change that may help.  Or not.

#1266 - Feb 26, 2015 - beta expires Jun 15, 2015


* selftest: new --inc option (incremental selftest)
* default selftest level is -v2
* selftest: doesn't ask "Continue?" for -v5
* bug fixes


- selftest: a new option, --inc, can be used to do more thorough
  testing of a backup over a period of time.  For example, there are
  hb backups with over 30K arc files, none stored locally, so it's not
  reasonable to do a full selftest all at once.  Even with smaller
  backups, there may not be enough time in the backup window to do a
  full selftest.

  The --inc option is: --inc f/g, where f is the frequency selftest is
  typically run, and g is the goal for completing the selftest.  For
  example, --inc 1d/30d means selftest is run every day and a complete
  selftest should occur every 30 days.  Or, --inc 1h/1q means a
  selftest runs every hour, and a complete selftest should occur every
  quarter (90 days). 

  NOTE: --inc does not cause a selftest to be run on a schedule.  Use
  cron or some other scheduling tool to initiate selftest.

  --inc can be used with -v3, -v4, or -v5, and there is a separate
  checkpoint for each level.  This allows incremental local arc file
  tests (-v3), local + remote arc tests (-v4), or file hash
  verification (-v5) to have different schedules.

  The --inc option can be combined with -r to incrementally check a

  selftest --inc reset clears all checkpoints.
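  One way to read --inc f/g is that each run has to cover about f/g of
  a complete selftest.  The Python sketch below works that ratio out
  for the two examples above.  It is an assumption about the
  arithmetic, not a description of how HB divides the work; only the
  h, d, and q units mentioned above are handled.

```python
from fractions import Fraction

# Unit lengths in hours. Only h, d and q appear in the notes (q = 90 days).
UNIT_HOURS = {"h": 1, "d": 24, "q": 24 * 90}


def parse_period(text):
    """Convert a period such as '1d' or '1q' to hours."""
    count, unit = int(text[:-1]), text[-1]
    return count * UNIT_HOURS[unit]


def share_per_run(inc):
    """Fraction of a complete selftest each run must cover for --inc f/g."""
    freq, goal = inc.split("/")
    return Fraction(parse_period(freq), parse_period(goal))


print(share_per_run("1d/30d"))  # 1/30
print(share_per_run("1h/1q"))   # 1/2160
```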

- selftest: previously, the default selftest level could be almost
  anything, depending on whether arc files are stored locally and
  other factors.  Now, the default level is -v2, or -v1 for simulated
  backups.  A note is displayed that higher levels will do more
  thorough checking.  Higher levels of checking may involve
  downloading lots of arc files, so it seems reasonable to request
  that rather than having it be a default action.

  For thorough backup checking, the recommended level for selftest is
  -v4, to check all arc file data.  -v5 can be useful, to verify user
  file sha256 hashes, but it does not test every arc file like -v4.

- selftest: in #1263, -rN -v4 failed with:

    Traceback (most recent call last):
      File "/", line 178, in <module>
      File "/", line 897, in main
    ValueError: need more than 0 values to unpack

#1263 - Feb 24, 2015 - beta expires Jun 15, 2015


* selftest can handle missing files
* bug fixes


- selftest: if an expected file was missing at a destination, selftest
  would fail or hang indefinitely waiting for it to be downloaded.
  Now, the missing file will generate errors for each block, but if
  there are other copies of the arc file at other destinations,
  selftest will upload a copy to the destination where it was missing.

- selftest: related to above, if the local copy of an arc file is
  missing, and it should be present because cache-size-limit is -1,
  selftest will leave a copy in the backup directory if there are
  destinations that have the file.  If an arc file is truly missing
  and not present locally or at any destination, all of its blocks are
  deleted and the files using those blocks are truncated when --fix is
  used.

- selftest: with the recent selftest changes, a race condition could
  cause this error:

    Checking blocks I
    Downloading arc.0.0
    Checking arc.0.0
    Exception in thread _writethread: [Errno 9] Bad file descriptor
      File "/", line 68, in start_thread
      File "/", line 65, in _writethread

- selftest: don't print 'Downloading arc.v.n' when there are no remote
  copies, and show the list of destinations when remote copies are

- selftest: add a note to run selftest -v2 --fix if there were
  errors

#1256 - Feb 23, 2015 - beta expires Jun 15, 2015


* a new command, log, runs another HB command with logging
  This command is experimental; feedback is welcome.
* selftest: levels -v3 & -v4 check remote arc files
* selftest: new -r (version) option
* selftest: progress display
* selftest: option to test specific arc files or pathnames
* selftest: healing arc files using multiple destinations
* debug keyword for Amazon S3, default S3 timeout
* bug fixes


- a new command, log, makes it easier to run HB commands with a log
  file.  Any HB command can be run with logging.  The usage is:

    # hb log backup -c backupdir -D1g -B1m /Users/jim /etc

  Example in a cron file:

    @hourly hb log backup -c xyz -D1g /; hb log retain -c xyz -s30d12m
    @weekly hb log errors -c xyz

  The hb log errors command displays log files from all commands that
  failed, zips all files to a monthly .zip file, and prints a summary
  of successful and failed commands.  This info is written to stderr
  and the exit code is always 1, so putting hb log errors in a cron
  file will cause the output to be emailed.

  When run from a regular terminal, command output is sent to the
  terminal and to the log file.  When run from a cron job, or anytime
  stdout is redirected, output is only sent to the log file.  Regular
  output (stdout) and error output (stderr) are interleaved, and every
  line is timestamped in the log file:

    2015-02-17 Tue 21:48:16| HashBackup build #1236 Copyright 2009-2015 HashBackup, LLC
    2015-02-17 Tue 21:48:16| Backup directory: /Users/jim/hb
    2015-02-17 Tue 21:48:16| Copied HB program to /Users/jim/hb/hb.1236
    2015-02-17 Tue 21:48:16| This is backup version: 0
    2015-02-17 Tue 21:48:16| Dedup is not enabled
    2015-02-17 Tue 21:48:16| /Users/jim/bcopy
    2015-02-17 Tue 21:48:16| Writing archive 0.0
    2015-02-17 Tue 21:48:16|
    2015-02-17 Tue 21:48:16| Time: 0.5s
    2015-02-17 Tue 21:48:16| Checked: 4 paths, 5542 bytes, 5.5 KB
    2015-02-17 Tue 21:48:16| Saved: 4 paths, 5542 bytes, 5.5 KB
    2015-02-17 Tue 21:48:16| Excluded: 0
    2015-02-17 Tue 21:48:16| Dupbytes: 0
    2015-02-17 Tue 21:48:16| Compression: 91%, 11.2:1
    2015-02-17 Tue 21:48:16| Space: 496 B, 139 KB total
    2015-02-17 Tue 21:48:16| No errors
    2015-02-17 Tue 21:48:16| Exit 0: Success

  Logs are written to the logs directory under the backup directory.
  The log filename is a timestamp and the command name.  While a
  command is running, the log filename will have RUN at the end.  If a
  command is suddenly aborted (kill -9), the RUN suffix will stay
  there.  If the command fails, RUN is changed to FAIL.  If the
  command succeeds, there is no status at the end.
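  A monitoring script could use the status at the end of a log filename
  to sort logs into groups.  This Python sketch assumes the status is
  the very last part of the filename, as described above; the function
  name and the sample filenames in the usage note are hypothetical,
  since the exact filename layout is not shown here.

```python
def log_status(filename):
    """Classify a log file by the status at the end of its name."""
    if filename.endswith("RUN"):
        # Still running, or aborted suddenly (for example kill -9).
        return "running or aborted"
    if filename.endswith("FAIL"):
        return "failed"
    # No status at the end means the command succeeded.
    return "succeeded"
```

  For example, log_status("backup_RUN") gives "running or aborted" and
  log_status("backup") gives "succeeded".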

- selftest has these -v options that have not changed:
    -v0: check database is readable
    -v1: check database integrity
    -v2: check database high-level structure
    -v5: check user file hashes (sha256)
    (No -v option means hb selects level based on several factors)

  The -v3 and -v4 options have changed:
    -v3: check all local archives
    -v4: check all local and all remote archives

  With -v3 or below, no data is downloaded from remote destinations.

  With -v4, arc files are downloaded from ALL destinations now,
  whereas previously, HB downloaded each arc file from only 1
  destination - usually the first destination listed - and then, only
  if necessary.  If you ran with cache-size-limit set to -1, selftest
  did not verify the remote arc files.

  The result is, -v4 is much more thorough than it used to be, and may
  take longer to run if you have multiple destinations.

  Another improvement is that -v4 uses local disk space equal to the
  number of destinations times one arc file size, so it can be run on
  huge backups that previously would have required a lot of local disk
  space.  A cache plan is no longer needed for -v4, but it is still
  used with -v5.

  IMPORTANT NOTE: selftest can't download and check arc files stored
  at Amazon Glacier.  The 4-hour retrieval delay is unmanageable for

- selftest has a new option, -r, to indicate the version you want to
  test.  Selftest always runs all of the usual database checks.  With
  -r, the -v3 and -v4 options only test the arc files in that version.
  For -v5, the -r option restricts full file verification to the user
  files in that version.

  The default is to test all versions, as before.  To test the most
  recent version, use -r-1.

- selftest has a progress display in the block & ref tests, since
  these can be quite long and may appear "stuck"

- selftest can test specific arc files or pathnames by listing them on
  the command line.  If arc files are listed, all copies are tested as
  if -v4 was used.  If pathnames are listed, their sha256 file hash is
  verified as if -v5 was used.  If any pathname is a directory, all
  files underneath the directory are also checked.  For listed
  pathnames, all versions are tested.  A warning is printed if
  specific arc names are combined with -v4, or pathnames are combined
  with -v5; neither of these makes sense: -v4 tests *all* arc files and
  -v5 tests *all* pathnames.

- selftest: with -v4 and multiple destinations, arc files can be
  corrected.  All good blocks are merged into a new arc file that
  replaces any arc files with problems.  selftest does not yet handle
  completely missing remote files and will usually halt or hang.

- s3: if the debug keyword is added to an S3 destination, a log file
  destname.log is created in the backup directory.  Add debug 99 to
  dest.conf.  This will also usually cause HB to fail when exceptions
  occur rather than catching them and doing retries.

- s3: a 30-second timeout has been added to help prevent hangs on

- recover: failed with a traceback:
    Traceback (most recent call last):
      File "/", line 156, in <module>
      File "/", line 191, in main
      File "/", line 143, in init
    AttributeError: 'NoneType' object has no attribute 'get'
  This was related to adding simulated backups.

- in very unusual circumstances, this error could occur because an arc
  file was not closed:
    Error: unable to get block 716302: Error opening archive: Too many
    open files: (pathname)

- HB would randomly halt with a Database is locked problem.  This
  update includes a fix for one situation where this could occur.

#1235 - Feb 16, 2015 - beta expires Jun 15, 2015


* debug 99 keyword on a destination will display a detailed traceback
  if an error occurs during startup
* Glacier: fix typo in yesterday's release for this error:
    dest glac: error #1 of 3 in send arc.0.0: [NameError] global name 'location' is not defined

#1231 - Feb 15, 2015 - beta expires Jun 15, 2015


* config: new simulated-backup keyword
* backup: display file being saved on ctrl-c
* backup: save entire file if size increases
* selftest: --fix enhancements
* backup: permissions, filename case, Apples
* backup: new config keyword, backup-linux-attrs
* rm/retain: pre-2013 arc files are now packed
* rm/retain: display the amount of backup space removed
* rm/retain: respect cache-size-limit when packing arcs
* retain: add a default -x if none is specified
  NOTE: this may remove lots of deleted files from your backup that
        should have been removed earlier, but weren't
* export: include dest.db in export.tar
* pre-2014 database upgrade code removed
* bug fixes


- config: a new keyword, simulated-backup, can be set to True before
  the initial backup.  When set True, no arc files are created by the
  backup command.  This allows modeling backup options such as -B
  (block size) and -D (dedup table size), even for very large
  backups, without using a lot of disk space.  Simulated backups also
  run faster because there is less I/O.  Incremental backups work
  correctly, and the stats command can be used to view statistics,
  space used, etc.

  Summary of differences for simulated-backup:
  - must be set before the initial backup
  - cannot be changed after the initial backup
  - no arc files are created (not a real backup)
  - no files are sent to remote destinations
  - selftest is limited to -v1
  - get & mount will fail with "No archive" errors

- backup: if interrupted with a ctrl-c, backup will display the file
  it was saving.  This can be useful with -v0/-v1 since file names are
  not printed

- backup: if a file gets bigger during the backup, save whatever data
  is there.  Previously, backup saved the file up to its size at the
  start of the backup, plus sometimes 1 extra block.

- selftest: --fix will truncate files that contain a bad block

- backup: (Mac OSX) HB is a case-sensitive program, so /users is not
  the same as /Users.  Most Unix filesystems are also case-sensitive.
  HFS, the Mac OSX filesystem, is usually case-insensitive, so /users
  is the same as /Users.  If HB did nothing, then backing up /users
  and /Users would result in saving the same directory under 2
  different names.  It's not a problem, but it's confusing.

  To solve this, HB tries to figure out the correct "case" of all
  pathnames on the command line; then everything works correctly.  To
  do this, HB needs read access to the parent directory of all
  pathnames on the command line, so to map /users to /Users, HB needs
  read access to / (root).  Usually this is fine, but on some systems,
  HB may not have read access to a parent directory, and backup would
  fail with a traceback.  Now it will display a new error message
  "Cannot verify filename case", and use the filename as-is.
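  The idea of finding the on-disk case by reading the parent directory
  can be sketched in Python.  This is an illustration under stated
  assumptions, not HB's code: match_case is a hypothetical name, it
  only corrects the last path component, and it returns the name as-is
  when the parent directory cannot be read.

```python
import os


def match_case(path):
    """Return path with its last component spelled as it appears on disk."""
    parent, name = os.path.split(path.rstrip(os.sep))
    if not name:
        return path  # "/" has no last component to check
    try:
        # Needs read access to the parent directory.
        entries = os.listdir(parent or ".")
    except OSError:
        return path  # cannot verify the case; use the filename as-is
    if name in entries:
        return path  # already spelled exactly as stored
    for entry in entries:
        if entry.lower() == name.lower():
            return os.path.join(parent, entry)
    return path  # no match at all; leave unchanged
```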

- backup: a new True/False config keyword, backup-linux-attrs,
  controls whether HB backs up Linux file attributes.  Linux file
  attributes are set with the chattr command and displayed with
  lsattr.  They are little used and poorly implemented on Linux,
  requiring an open file descriptor and an ioctl call.  This can cause
  permission problems, especially in shared hosting environments.  Now
  HB will only read and store file attributes on Linux if
  backup-linux-attrs is set True with the hb config command.  The
  default is False.

  NOTE: File attributes are not the same as extended attributes, also
  called xattrs.  Extended attributes are always backed up if present.
  Xattrs are handled by the attr, getfattr, and setfattr commands on
  Linux, as well as the Linux ACL commands.

- rm/retain: previously, HB would not pack version 0 or 1 arc files,
  created before Sep 10, 2012.  Now it will.  To see if your arc files
  need packing, check these lines in hb stats:
               48 GB archive space
               36 GB active archive bytes - 75.08%
  After packing old archives on this backup, hb stats says:
               39 GB archive space
               36 GB active archive bytes - 93.06%
  If cache-size-limit is -1 (the default), copies of arc files are
  kept locally and packing is enabled.  If cache-size-limit is set,
  not all arc files are kept locally; they are only downloaded, packed,
  and uploaded if the config keyword pack-remote-archives is True.

- rm/retain: when arc files are deleted or compressed, rm & retain
  display the amount of backup space saved.

- rm/retain: when cache-size-limit is set and pack-remote-archives is
  True, rm and retain now try to respect the cache size limit while
  compressing archives.  This keeps HB from using too much local disk
  space.  Packing might run slower if it has to wait for packed
  archives to be transmitted, to respect cache-size-limit.

- retain: without the -x option, files that had been deleted from the
  filesystem were staying in the backup forever.  This was fixed for
  -t (see below) but is still a problem for -s.  Now, a default -x is
  always used.  The default -x is the same as -t or the last time
  period of -s.  So for -s30d12m, the default would be -x12m, ie, keep
  history of deleted files for 12 months after they are deleted.
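  The "last time period of -s" rule can be illustrated with a short
  Python sketch.  The function name is hypothetical, and the pattern
  assumes each period is a number followed by a single unit letter, as
  in the -s30d12m example above.

```python
import re


def default_x(retain_s):
    """Return the last time period of a -s value, e.g. '30d12m' -> '12m'."""
    periods = re.findall(r"\d+[A-Za-z]", retain_s)
    if not periods:
        raise ValueError("no time periods found in %r" % retain_s)
    return periods[-1]


print("-x" + default_x("30d12m"))  # -x12m
```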

- export: the dest.db file and a stub dest.conf file with just
  destination names are now included in export.tar to help debug
  problems that involve remote destinations.  dest.db contains HB
  filenames (arc files, etc) and the names of destinations containing
  them.  It does not contain user data, keys, or passwords.

- retain: with -t retention, HB would not remove a deleted file if it
  was the only copy, no matter how old it was.  Now it is removed when
  it has been deleted longer than the -t retention time.  For example:
  - a file is saved January 1st
  - the retention period is -t30: keep 30 days of file history
  - the file is deleted January 31st
  - for 30 days from Jan 31st, the deleted file must be restorable
  - 31 days after it was deleted from the filesystem, it can be
    removed from the backup

- retain: the -x option allows removing history for deleted files
  sooner than it would normally be removed.  For example, with -t30d,
  the history of a deleted file is kept for 30 days after it is
  deleted.  With -t30d -x15d, history of active files is kept for 30
  days, but files that have been deleted (from the filesystem) have
  their history removed 15 days earlier.  That was the design intent. 

  What actually happened (ie, the bug) is that retain with -x15d was
  removing deleted files *saved* more than 15 days ago, so it deleted
  files too soon.  It should have been checking the date the file was
  deleted, not the date it was saved.
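  The difference between the intended check and the bug comes down to
  which date is compared.  The Python sketch below is illustrative
  only: the function names are hypothetical, and the year 2015 and
  "today" date are assumptions chosen to make the dates concrete.

```python
from datetime import date, timedelta


def removable_intended(deleted_on, today, keep_days):
    """Intended rule: compare against the date the file was deleted."""
    return today - deleted_on > timedelta(days=keep_days)


def removable_buggy(saved_on, today, keep_days):
    """Buggy rule: compared against the date the file was saved."""
    return today - saved_on > timedelta(days=keep_days)


saved = date(2015, 1, 1)     # file saved January 1st
deleted = date(2015, 1, 31)  # file deleted January 31st
today = date(2015, 2, 5)     # 5 days after deletion

print(removable_buggy(saved, today, 15))       # True: removed too soon
print(removable_intended(deleted, today, 15))  # False: still kept
```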

- on Linux, when HB asked a yes/no question it sometimes would not
  display the question, yet waited for a response.  Related to a
  recent buffering change for stdout.

- on Glacier, HB was not ignoring "not found" errors when trying to
  remove files, causing these kinds of error messages:

    glac(glac): unable to remove arc.0.0 with archive id (long id)
    Expected 204, got (404, code=ResourceNotFoundException, message=The
    archive ID was not found: (long id)

  This can happen if part of a backup is saved in one region, the
  Location keyword (region) is changed in dest.conf without changing
  destname, then you back up in a different region.

- backup: when a backup has old, inactive destinations that were used
  in the past, HB would sometimes try to remove files from them.  This
  could cause errors like below with active Glacier destinations.  Now
  HB ignores inactive destinations.

    glac(glac): no archive id available to remove this file: arc.0.13

- selftest: don't display errors about files on inactive destinations

#1200 - Feb 3, 2015 - beta expires Jun 15, 2015


* update HB's database software


- #1200 is identical to #1199 except that HB's database software has
  been upgraded to a new release.  Everything should function the
  same, though performance may be slightly different.  In testing, a
  long selftest, which has lots of database operations, took about 7%
  less time than with the previous version.  If performance is much
  worse for any operation, please send an email.

#1199 - Feb 2, 2015 - beta expires Jun 15, 2015


* bump expiration date
* bug fix in rm


- rm: when a hard-linked file is removed, sometimes its data blocks
  were not being deleted.  This could cause a selftest error:

      Error: unknown logid referenced: 5 [r0]
      IndexError: list index out of range

  selftest --fix will correct this, but the actual cause was rm.

#1192 - Jan 16, 2015 - beta expires Mar 15, 2015


* automagically re-creating hb.db
* add hard-link count to ls -l display
* stats enhancements
* bug fixes


- when remote destinations are set up in dest.conf, HB creates
  incremental versions of hb.db (the main HB database) in the local
  backup directory.  These are sent to remote destinations and are
  used by hb recover to re-create hb.db if the local backup directory
  is lost.  In some cases, you may be working directly with a remote
  backup directory.  For example, the remote backup directory might be
  on an external USB drive and you bring it back to the office to do a
  complete restore when a disk dies.

  The "normal" way to do this would be to create a local recovery
  directory, add key.conf and dest.conf, and run recover.  This would
  download (or copy from the backup drive) all of the files from the
  remote backup directory and re-create hb.db.

  Now, you can put a key.conf file in the backup directory itself and
  the next HB command will re-create hb.db from the hb.db.N files.  It
  saves a copy step and allows you to copy the HB program and key.conf
  to a remote destination and run selftest, for example.  But be aware
  that you are operating directly on the backup and any modifications
  will likely corrupt it.

- hb ls -l displays the link count, like regular ls -l, right after
  the file mode.

- stats: the display precision on a few statistics was increased for
  multi-TB backups.  Some new statistics were added to hb stats, for
  example, an estimation of the backup space saved by dedup.

- backup: if a directory /abc/def exists and the path /abc/def-ghi
  (either a file or directory) is excluded in inex.conf, backup would
  usually save /abc/def-ghi anyway.  This bug could occur with many
  punctuation characters other than dash.

- audit: if a program was still running, audit would show Finished: as
  the current time.  Now it displays a blank space.

- stats: if a backup was in progress, stats would sometimes display
  an error

- selftest: it's possible for a file's size to change while it is
  being backed up.  Usually the file is growing, but it can also be
  truncated.  If a file is truncated to zero bytes between the time
  that HB reads the file size and begins the file backup, selftest was
  incorrectly reporting this error:

    Error: logid 39528488 does not reference blocks: (pathname)

#1180 - Jan 11, 2015 - beta expires Mar 15, 2015


* changes to copying HB executable
* an error backing up file system flags is not fatal
* selftest enhancements
* bug fixes


- every version of the HB program used in a backup is now copied to
  the local backup directory, named hb.N, where N is the build number
  (not to be confused with hb.db.N files, which are database-related).
  This local copy is always created now, whereas before it was only
  created if remote destinations were configured.  If the
  copy-executable config keyword is True, each version is also copied
  to remote destinations.  You can delete old copies of hb.N manually,
  and they will also be deleted from remotes.

  IMPORTANT: do not delete hb.db.N files by mistake!

- backup: if file system flags cannot be read for a file or directory,
  backup displays an error message as before, but continues as if the
  flags were zero rather than giving up on the file or directory.
  File system flags are set with the chattr (Linux) and chflags
  (BSD/OSX) commands and are not widely used.  Related to this change,
  backup unnecessarily opened directories for reading on BSD/OSX
  systems, which could cause errors if there was only x access to a
  parent directory.

- selftest: more new tests - yay!

- selftest: if cache-size-limit is set, selftest -v3 or higher is
  used, and you answer No to the Continue? question, the arc cache was
  not cleaned up (lots of extra arc files left there).

- stats: bombed with a traceback if the initial backup was still
  running, but had done at least 1 commit.  Now it doesn't bomb, but
  the statistics are still not quite accurate because stats uses
  completed backups for some of its numbers, but all backups for other

- the first backup with this new version may upload many archives.
  The sync code was not always uploading archives that were already
  present on the destination if the size changed because of packing
  following rm/retain in older versions of HB.  This mainly occurred
  if packing was interrupted or a destination was down during packing.
  One of the new tests in selftest revealed this.

- audit: if auditing is enabled and an HB command is running, audit
  would display the current date & time for the running HB process
  under "Finished".  Now it displays a blank.

#1164 - Jan 4, 2015 - beta expires Mar 15, 2015


* get/selftest: cache planning is 3-5x faster
* --logid option removed from selftest
* selftest --fix improved
* hb sha256 should not require -c


- get/selftest: creating a plan to manage the local arc cache (when
  cache-size-limit is set) is 3-5x faster

- prior to #1090 (June 2014), an unusual race condition during backup
  could cause this selftest error:

    Error: for logid 3241850, pathid 2923104 is invalid.

  This could also cause selftest to abort, with an error message to
  run selftest.  Doh!  Now, selftest --fix corrects these errors by
  manufacturing a pathname based on the prior pathname backed up.  For
  example, if /data/file1 was the file backed up before the missing
  path, the missing path would be called /data/file1#path-N#hberror,
  where N is a unique number.

- the hb command 'sha256' should not require a -c option.

#1159 - Dec 26, 2014 - beta expires Mar 15, 2015


* cache planning bug fix


- get/selftest: if cache-size-limit is set, archives are being packed,
  and rm/retain are used, get and selftest (with -v3 or higher) could
  fail with an error like this:

    Planning cache...
    Traceback (most recent call last):
      File "/", line 172, in <module>
      File "/", line 857, in main
      File "/", line 298, in plan
      File "/", line 169, in iterprefetch
      File "/", line 50, in getvernum
    Exception: Blockid not found: 919621

  This bug was introduced in #1107, when packed archives got new names
  instead of keeping the old name.

#1157 - Dec 22, 2014 - beta expires Mar 15, 2015


* new export command
* selftest improvements
* init/rekey fix for Synology NAS


- a new command, export, creates a tar file of the HB database for
  customer support.  This contains file metadata such as filenames,
  info listed by ls -l, and HB metadata, but no user file contents.

  The advantage of export is:
  - it rekeys the database -- your backup key is not disclosed
  - a passphrase can be set, so if your export is intercepted,
    even with the key, the passphrase is still required
  - it clears backup keys stored in the database
  - it clears destination info if stored in the database
  - export's tar file is much smaller than tar alone can create

- selftest: more tests added

- hb init sets the key file, key.conf, to be read-only to prevent
  accidental deletion.  If the backup directory is actually on a
  Synology NAS mounted over AFP (AppleTalk), an error occurs because a
  read-only file can't be renamed in this setup.  HB was changed to
  reset the permissions before the rename.
  NOTE: rekey still does not work in this configuration.  A better way
  to set up a NAS is to use a local backup directory with -c, set up a
  Dir destination to the NAS in dest.conf, and set cache-size-limit
  with hb config to avoid having a local copy of the backup.

#1146 - Dec 15, 2014 - beta expires Mar 15, 2015


* bump expiration date (this time for real!)
* selftest improvements
* redirecting HB output to a file works better
* backup, ZFS, ACLs
* rm/retain: bug fix
* dedup table >= 2GB bug fix


- backup: apparently the backup expiration date did not get bumped
  back in October, though it says it was in the change log.  Apologies
  to everyone for such a silly mistake.

- selftest code has been reorganized and streamlined in this version,
  to better accommodate all the new tests

- when HB output is redirected to files, for example:

    hb backup -c xxx / >hb.log 2>&1

  normal output was buffered differently than error output, so the log
  file didn't look right: error output and regular output weren't
  interleaved correctly.  This also caused weirdness if output was
  sent to syslog, because output was dumped all at once at the end of
  the program.  Now log files should look like terminal sessions.
  NOTE: this may be a bit slower if a lot of output is sent to a
  file on a remote file system.

- backup: HB does not support ACLs on ZFS, NFS4, and Windows, and
  issued 2 error messages:

    Unable to read ACL: Invalid argument: /
    Unable to read ACLs on this filesystem (zfs/nfsv4/Windows?)

  This caused the error count to always be at least 2, which made
  backup monitoring more difficult.  Now, backup will display only one
  message:

    Unable to read ACLs on this filesystem (zfs/nfsv4/Windows?): /

  and will not bump the error count on this error.

- retain with -x and -v would sometimes stop with an error:
    No pathname for pathid N; run selftest
  Selftest would complete without errors.  The bug was that retain had
  deleted a pathname because of -x, then wanted to display the
  pathname because of -v.  The simple fix was to display the pathname
  first, then delete it.

- backup: dedup tables >= 2GB could cause problems when written to
  disk, with a message (Linux) like:

    Error writing dedup table: wrote 2147479552 of 2147483640 bytes

  or on OSX, an Invalid argument error.  Now it works on both.
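
  The Linux byte count in the message above matches the documented
  per-call limit of write(2), 0x7ffff000 (2147479552) bytes.  The
  usual remedy for this class of problem is to loop until everything
  is written.  The changelog does not say how HB fixed it; this is
  only a minimal sketch of the general technique, and write_all and
  max_chunk are hypothetical names:

```python
import os


def write_all(fd, data, max_chunk=1 << 30):
    """Write every byte of data to fd, looping over short writes.

    Hypothetical helper, not HB code.  max_chunk keeps each request
    well below the per-call limits of the operating system.
    """
    view = memoryview(data)
    done = 0
    while done < len(view):
        # os.write returns the number of bytes actually written,
        # which may be less than the number requested.
        done += os.write(fd, view[done:done + max_chunk])
    return done
```

  Capping each request also avoids passing a single huge size to the
  OS, which is the kind of call that can be rejected outright.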

#1133 - Nov 24, 2014 - beta expires Mar 15, 2015


* IMPORTANT: backup option -n is obsolete
* IMPORTANT: run selftest -v2
* selftest improvements
* rm/retain: bug fix


- backup: on July 29th, 2009, the -n option was added to HB backup so
  that if retain was running next, the entire HB database was not sent
  twice: once from backup, then again from retain.  For a long time
  now, HB has been sending much smaller incremental database updates
  instead of a complete dump, so the justification for -n is gone.  If
  -n is used with backup, and retain decides not to delete any files,
  the database does not get sent at all - clearly not the intent.  In
  this rev, -n issues a warning and is ignored.  Early in 2015, the -n
  option will be removed altogether and backups will fail if it is
  still used, because it can be a dangerous option.

- selftest: a few more tests have been added for -v2, for very large
  backups.  Selftest also had a bug where it would not check files
  that had extended attributes + ACL in common with more than 64K
  other files.

  IMPORTANT: all customers should run selftest -v2 to check their
  backups.

- when packing arc files, rm/retain would sometimes do this if
  cache-size-limit is set:

    Getting arc.32.0
    Packing arc.32.0 as arc.32.1
    Getting arc.34.0
    Packing arc.34.0 as arc.34.1
    Getting arc.34.0
    dest rsync: stopping because of errors
    dest rsync: Traceback (most recent call last):
      File "/", line 310, in loop
      File "/", line 469, in getcmd
      File "/", line 443, in getparts
      File "/", line 67, in retry
      File "/", line 116, in getfile

    Unable to download archive arc.34.0: Exception('destinations halted',)

    Traceback (most recent call last):
      File "/", line 146, in <module>
      File "/", line 322, in main
      File "/", line 181, in finish
      File "/", line 92, in compress
      File "/", line 156, in __init__
      File "/", line 189, in _open
      File "/", line 713, in fetch
    NoArchive: Archive does not exist:

  This was a database bug.  A workaround was added to HB.

- rm/retain: with small VMs, rm/retain might have "out of memory"
  errors during startup with rsync destinations, because HB is unable
  to fork rsync while the dedup table is loaded.  It is recommended to
  use workers 1 in dest.conf in low memory environments with rsync
  destinations.

#1125 - Oct 29, 2014 - beta expires Mar 15, 2015


* bump expiration date
* new config keyword "remote-update"
* create hb.db.n in dest sync command
* sort filenames better in dest ls command
* rm & retain also sync files, like backup
* new selftest -v2 checking
* bug fixes / minor changes
* updated docs


- when packing arc files because of a rm or retain, HB stores new,
  packed archives on remote destinations, then after all new archives
  are stored, removes the obsolete archives.  This order is necessary
  to preserve the integrity of the remote backup, but it also uses
  more remote disk space.  For users who are tight on remote disk
  space or have filled a remote, it may be necessary to remove old
  data first, then add new data.  The new config keyword
  "remote-update" can be set to either "normal" (the default), or
  "unsafe" to control the order of operations.

  IMPORTANT: if you set remote-update to unsafe and an operation is
  interrupted, the remote backup area may be in a temporary "bad"
  state, and doing a recover will fail.  The local backup directory is
  fine, and the next complete HB run should correct the remote backup

- if HB has been only creating a local backup, then a dest.conf is
  created, then hb dest -c xxx sync is done, hb.db.0 was not sent to
  the remote because it didn't exist.  The backup command normally
  creates these, but did not since there was no dest.conf.  The
  remote backup is not usable without hb.db.N files.  An interrupted
  backup would cause a somewhat similar situation.  Now, hb dest sync
  will create a new hb.db.N and send it to all remotes.

- the hb dest -c xxx ls command sorted files by destination and
  filename.  But filenames like arc.10.0 would sort before arc.2.0,
  which gets very confusing.  Now filenames are sorted as expected:
  all arc.0.N files, then arc.1.N, arc.2.N, ..., arc.10.N, etc.
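
  The difference is between comparing names as plain strings and
  comparing their numeric parts as numbers.  A small sketch of the
  latter (arc_sort_key is a hypothetical name; HB's own sort is not
  shown in this changelog):

```python
def arc_sort_key(name):
    """Sort key that compares numeric name parts as integers.

    Each dot-separated part becomes a tuple so that numbers and
    text are never compared with each other directly.
    """
    return [(0, int(part)) if part.isdecimal() else (1, part)
            for part in name.split(".")]


names = ["arc.10.0", "arc.2.0", "arc.0.1", "arc.1.0", "arc.0.0"]
print(sorted(names))                    # string order: arc.10.0 before arc.2.0
print(sorted(names, key=arc_sort_key))  # numeric order
```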

- rm/retain: if a disk full occurred on a remote while packing
  archives, the next HB command could get stuck on "Getting arc.X.Y"
  (the old archive that was packed) with an error that it didn't
  exist.  This bug was introduced in the Sep 10th release.  Running
  selftest -v2 --fix will correct this.

- rm/retain: previously rm & retain did not do a local->remote sync
  like backup does when it starts.  Now they do a sync when finished.

- selftest: a few new tests were added for -v2 and above.  -v2 is an
  important selftest level because it can be used on very large
  backups without having to download archives if they are not local.
  One new test makes sure that all arc files are either local or on at
  least 1 remote destination.  Note that HB does not know whether
  files are really on remotes, but only that it successfully sent them
  in the past and they should still be there.

- backup: previously, the backup command would remove partial or
  uncommitted archive files when it was interrupted, for example, with
  a ctrl-c.  This can cause race conditions and destination errors,
  because the files being removed may already be queued or open for
  transmit.  Now, backup will just exit, and extra archive files will
  be removed at the end of the next backup, rm, or retain.

- display a message when creating hb.db.N files.  It can take some
  time to create these files if there is a lot of backup history, even
  if only a single file is backed up, and it's not obvious what is
  happening during the delay.

- when HB syncs destinations, it does a better job of removing .tmp
  files from remotes (affects dir, ftp, and ssh destinations)

- when a specific destination is cleared with hb dest -c xxx clear
  ddd, HB checked to make sure it was not deleting the only copy of a
  file, but it would delete files if they were on other destinations.
  Now, HB will not delete any files from a destination if it contains
  the only copy of any file.
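
  The new all-or-nothing rule can be sketched as follows.  The names
  and data layout are hypothetical, and the sketch assumes a local
  copy counts as "another copy":

```python
def files_to_clear(dest, copies, local_files):
    """Return the files to delete from dest, or [] to refuse.

    copies:      dict of filename -> set of destination names that
                 hold the file (hypothetical layout)
    local_files: set of filenames present in the local backup dir
    """
    on_dest = [name for name, where in copies.items() if dest in where]
    for name in on_dest:
        only_remote_copy = copies[name] == {dest}
        if only_remote_copy and name not in local_files:
            # dest holds the only copy of this file, so nothing at
            # all is deleted from dest.
            return []
    return sorted(on_dest)
```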

- security doc updated: dest.db sent directly to remotes

#1107 - Sep 10, 2014 - beta expires Dec 15, 2014


* updated expiration date
* webdav: add digest authorization
* webdav: enhance error recovery
* rm/retain: fix "out of memory" errors in small VMs
* rm/retain: eliminate bad transient remote states if interrupted
* dest.db copied to remotes directly


- webdav: added digest authorization.  Currently HB uses Basic
  authentication first, then Digest authentication if Basic fails.
  This is not ideal, because it defeats the "security" enhancements of
  Digest authentication.  If security is a concern, use the secure
  keyword to enable ssl.

- webdav: some dav servers only allow use of credentials for a certain
  amount of time before throwing an error.  HB will now recover from
  these credential timeout errors, though it will still show up as a
  retry.  Ideally, these should be handled more gracefully by HB so
  they don't look like errors.

- rm/retain: release dedup table early to help prevent running out of
  virtual memory in small VM environments with multiple rsync workers

- rm/retain: "packing" is the recovery of deleted space within archive
  files and is performed automatically based on config settings during
  rm and retain.  Previously, HB would overwrite arc files during
  packing and then send them to remote destinations.  But between
  sending a packed archive and sending hb.db.n, the remote backup was
  in a transient, broken state.  If rm or retain was interrupted
  during this time, the remote backup would be broken (ie, a recover
  would fail) until the next backup, rm, or retain automatically
  corrected it.

  Now, packed archive files are created with new names and the old,
  unpacked archives are removed afterwards.  This prevents the bad
  transient state if HB is interrupted.  The downsides of this new
  method are that more space will be temporarily required on the
  remote side, equal to the size of all packed archives, and rsync
  will no longer be able to use its fast "delta transmission" to
  upload a packed archive.

  There were a couple of other cases of potentially bad transient
  states on remote destinations that have also been corrected.  An
  interrupted recrypt still leaves the remote backup in a bad state
  until the recrypt is finished, but this is documented in recrypt.

- previously, dest.db was compressed & signed before sending to
  remotes.  Since dest.db is always encrypted, there was little
  security benefit to this and it was confusing that a remote dest.db
  file was very different from a local dest.db file with the same
  name.  Now, dest.db is copied without modification and is directly
  usable as a dest.db file if you have the correct key.

  NOTE: the local and remote dest.db may not match if HB removes files
  after dest.db is copied.  This is normal and expected.

#1100 - Jun 29, 2014 - beta expires Sep 15, 2014


* add WebDAV support


- WebDAV remote destinations are now supported.  The destination type
  is either dav or webdav.  There is a new doc file in
  doc/dest.conf.examples to explain the options for DAV destinations.

#1090 - Jun 9, 2014 - beta expires Sep 15, 2014


* bump expiration date
* new recover option, -n, bypasses arc file downloads
* new symlink keyword for Dir destinations
* bug fixes


- recover: a new option, -n, makes restoring a few files faster in a
  disaster recovery situation.  If the local backup directory is lost,
  recover is used to download files from a remote destination.  But
  recover downloaded all archive files, unless arc-cache-limit was
  set.  For large backups, this could take a very long time.  Now,
  with the -n option, recover downloads no archives.  The get
  command can then be used to restore the required files, and only the
  archives needed will be downloaded.  Later, recover can be used
  again without -n to recover all archive files.

- Dir destination: previously, HB would try to do a symlink to "fetch"
  files from a remote Dir destination.  This is useful when
  cache-size-limit is set and Dir destinations are actually remote
  drives, like Google Drive, Dropbox, etc., because instead of
  downloading the whole arc file from the remote service, HB can fetch
  only the blocks it needs for a get command.  But it can also be
  confusing if you don't realize what is happening.  So a new symlink
  keyword has been added.  If symlink is not present or is False, no
  symlinks will be used.  If symlink is present with no value or is
  True, symlink will be attempted.  If symlink fails, HB falls back to
  a regular copy.

- recover: dest.db is always downloaded, even if it already exists in
  the recovery directory.  A customer reported problems with recover,
  but the real problem was that recover had been run earlier, a backup
  was run (deleting files from the remote directory), and recover was
  run a 2nd time in the recovery directory using a stale dest.db.
  This caused a hang problem when nonexistent hb.db.* files were

- recover: hb.db is always rebuilt, even if it already exists in the
  recover directory.

- recover: hb.db is rebuilt from hb.db.N files.  If recover is
  interrupted and restarted, it now will not download hb.db.N files
  that already exist and are the correct size.

- recover: previously, recover would rename existing files with a .old
  suffix.  Now, existing files that are going to be overwritten are
  deleted instead.

- recover: customer reported hash mismatches after a recover.  When
  recovering into a directory already containing arc files, verify that
  any existing arc files are the correct size, and if not, download the
  file again.
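
  The size check described here, and the similar one for hb.db.N
  files above, amounts to something like the sketch below.
  needs_download is a hypothetical name, and the expected size is
  assumed to come from the backup's own records:

```python
import os


def needs_download(path, expected_size):
    """Return True if path is missing or is not expected_size bytes.

    Hypothetical helper.  A size check catches truncated files from
    an interrupted download, but not corruption that leaves the
    length unchanged.
    """
    try:
        return os.path.getsize(path) != expected_size
    except OSError:
        # Missing or unreadable file: fetch it again.
        return True
```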

- customer reported selftest error (-v2 or higher):

    Error: for logid 3241850, pathid 2923104 is invalid

  The customer kept daily backup logs to help solve the problem.  A
  race condition triggers the bug:

    1. backup a directory d containing file x
    2. delete file d/x
    3. backup directory d again
    4. create file d/x again
    5. backup directory d again:
       5a. hb reads the directory to get a list of files
       5b. file d/x is deleted by another process
       5c. hb tries to backup file d/x and gets a stat error

  Afterwards, selftest will throw the error about an invalid path.
  This backup bug is now fixed.
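
  The window between steps 5a and 5c exists in any program that
  lists a directory and then stats the entries.  A generic sketch of
  handling it, not HB's actual fix (stat_entries is a hypothetical
  name):

```python
import os


def stat_entries(directory, names=None):
    """Stat each name in directory, tolerating vanished entries.

    Returns (found, vanished): a dict of name -> os.stat_result and
    a list of names that no longer existed when stat was attempted.
    names defaults to a fresh directory listing (step 5a).
    """
    if names is None:
        names = os.listdir(directory)
    found, vanished = {}, []
    for name in names:
        try:
            found[name] = os.lstat(os.path.join(directory, name))
        except FileNotFoundError:
            # Deleted by another process after the listing (step 5b);
            # record it rather than leaving a half-processed entry.
            vanished.append(name)
    return found, vanished
```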

- related to above error, if the same thing happens with a nested
  directory x instead of a file, it caused this selftest error:

    Traceback (most recent call last):
      File "/", line 151, in <module>
      File "/", line 619, in main
      File "/", line 251, in showpath
      File "/", line 112, in getpathname
    Exception: No pathname for pathid 5; run selftest

- Amazon Glacier: document that rate limiting is not supported

#1083 - Mar 12, 2014 - beta expires Jun 15, 2014


* update expiration date

#1079 - Dec 23, 2013 - beta expires Mar 15, 2014


* S3 example dest.conf update


- a note was added to the S3 example dest.conf for Google Storage,
  explaining that Developer Keys have to be generated to use the
  S3-compatible API with Google Storage.

#1078 - Dec 5, 2013 - beta expires Mar 15, 2014


* security doc update
* extend backup expiration date to March 15, 2014


- the security document was updated to mention loading dest.conf into
  hb.db to avoid having plaintext passwords in dest.conf, and describe
  modifications to the key generation procedure when the entropy pool
  is exhausted (Linux only).

#1075 - Oct 21, 2013 - beta expires Dec 15, 2013


* bug fixes


- customer reported error:
  -- rebooted during a backup
  -- this leaves hb.db-journal file (transaction log)
  -- run hb upgrade to get latest version of hb
  -- this version required a database upgrade
  -- database upgrade would not work because journal existed

     $ hb versions -c hb
     HashBackup build 1070 Copyright 2009-2013 HashBackup, LLC

     Current database rev: 13
     Upgrading database to rev: 14
     Warning: unable to audit command: Can't upgrade database with a
     transaction active
     Backup directory: /Users/jim/hbdev-1035/hb

     Current database rev: 13
     Upgrading database to rev: 14

     Traceback (most recent call last):
       File "/", line 171, in <module>
       File "/", line 139, in main
       File "/", line 138, in opendb
       File "/", line 379, in upgradedb
     Exception: Can't upgrade database with a transaction active

  HB was changed to clean up the journal before a database upgrade.

- recover: with Glacier destinations, if a recover in progress is
  aborted and then restarted, HB tries to display the archives already
  in progress and when they started retrieval.  Printing this message
  caused an error on Linux, either an import error for _strptime or a
  seg fault.

#1071 - Aug 10, 2013 - beta expires Dec 15, 2013


* better messages
* bug fixes


- the help for init has been improved

#1070 - Aug 5, 2013 - beta expires Dec 15, 2013


* better messages
* bug fixes


- mount displays a better error message when a mountpoint is already
  in use, a better message when the backup has been mounted, and
  explains how to abort the mount using Ctrl-\

- stats command was printing:
    4:1 reduction ratio of backed up files for last %d backups
  instead of:
    4:1 reduction ratio of backed up files for last 5 backups

- beginning with #1032, a fatal exception was raised when a
  destination had trouble starting, for example, a Dir destination was
  unavailable because a removable USB disk wasn't inserted.  This
  fatal error was not intended, and the effect is that you couldn't
  have destinations that were temporarily missing.

- in #1035, a feature was added to detect overwriting a remote backup
  area by accident.  But when doing an initial backup, this caused
  error messages to be displayed, usually 3, because HB kept trying to
  download the DESTID file when it did not (and should not) exist.
  The feature still works, but now the confusing error messages are
  not displayed.

#1062 - Jul 20, 2013 - beta expires Dec 15, 2013


* expiration date updated
* bug fixes


- backup: in #1059, saving / and /mnt did not save /mnt if it was a
  separate filesystem and -X wasn't used.  /mnt should have been saved
  because it was specifically mentioned on the command line.  A
  similar thing could happen with excluded files that were on the
  command line.

#1059 - Jul 10, 2013 - beta expires Sep 15, 2013


* database upgrade to dbrev 14 (may take a while)
* stats command runs faster on huge backups
* new backup statistic: # of files checked
* better error messages
* bug fixes


- NOTE: this rev will do an automatic database upgrade to dbrev 14
  when any HB command is used.  Extra statistics are maintained to
  speed up the stats command so it scales better for huge backups with
  millions of files.  The existing database does have to be scanned
  during the upgrade to initialize these new statistics, so the
  upgrade could take some time to complete - about the same time as
  the old stats command took to run once.

- hb stats runs faster for huge backups and the "industry dedup ratio"
  will be more accurate for new backups

- backup: prints the number of files and bytes checked in addition to
  the number actually saved (because they were modified).  These
  numbers now include saved directories.  When /abc/def is the only
  file backed up, what actually gets saved is /, /abc, and /abc/def.
  Backup will say 3 paths were checked and saved whereas before it
  said 1 file was saved.
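
  The new count for the /abc/def example can be reproduced by listing
  a path together with its parent directories.  A small sketch
  (paths_saved is a hypothetical name, not an HB function):

```python
from pathlib import PurePosixPath


def paths_saved(path):
    """Return path preceded by all of its parent directories."""
    p = PurePosixPath(path)
    if not p.is_absolute():
        raise ValueError("expected an absolute path: %r" % path)
    # parents runs from the nearest directory up to the root, so
    # reverse it to list the root first.
    return [str(parent) for parent in list(p.parents)[::-1]] + [str(p)]


print(paths_saved("/abc/def"))
```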

- mount: if an empty backup directory is mounted, it caused a
  traceback.  Now, an error message 'No backups yet!' is displayed.

- hb: if an invalid command is used, like hb xyz, hb could complain
  that the backup directory doesn't exist, when the real error is that
  the command is not recognized

- the error message displayed when the backup directory doesn't exist
  is more specific, advising to use the -c option if it wasn't used,
  or to use a different directory with the -c option.  It was
  confusing when the -c option was omitted by accident.

- when the backup database is newer than the hb program can handle, hb
  no longer recommends using the clear command.  Auditing and command
  restrictions (disable-commands config option) prevent using clear
  in this situation.

- backup: if a symlink to a mounted block device was used on the
  command line, backup would not check to see if the block device was
  mounted and display the appropriate warning.

- help command could cause a traceback:

    HashBackup build 1037 Copyright 2009-2013 HashBackup, LLC
    Traceback (most recent call last):
      File "/", line 51, in <module>
      File "/", line 556, in confdir
    misc.err: Backup directory doesn't exist, use hb init command: /root/hashbackup

- if a database upgrade fails with an error, the original database is
  re-installed.  But if the upgrade was aborted with Ctrl-c, the
  original database was not re-installed.

- in #1035, a new feature was added to prevent sending two backups to
  the same destination.  But if a destination is flaky (imap in this
  case), hb could report that the destination ID's did not match and
  you may be overwriting another backup, when the real problem is that
  the remote service had an issue and did not return the DESTID file.
  Error retries have been added to fix this.
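
  A generic retry wrapper illustrates the idea: only report a problem
  after several attempts have failed.  This is a sketch, not HB's
  implementation; fetch_with_retries and its parameters are
  hypothetical, and it assumes transient failures surface as OSError.

```python
import time


def fetch_with_retries(fetch, attempts=3, delay=0.0):
    """Call fetch() until it succeeds or attempts are used up.

    fetch is any zero-argument callable that returns the file
    contents or raises OSError on a transient failure.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError as err:
            last_error = err
            if attempt + 1 < attempts:
                time.sleep(delay)  # brief pause before trying again
    raise last_error
```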

- when there is a destination ID mismatch, the local and remote IDs
  are displayed to help determine the problem

- compare: could cause a traceback on ZFS, because ZFS ACLs are not
  yet supported.  Now it displays a warning message like backup does.

- with Amazon Glacier, HB creates an associated bucket in S3.  But the
  location names for S3 and Glacier are sometimes not identical, and a
  traceback could occur:

    Traceback (most recent call last):
      File "/", line 76, in <module>
      File "/", line 1973, in main
      File "/", line 180, in init
      File "/", line 126, in initdest
      File "/", line 62, in startdest
      File "/", line 172, in init1
      File "/", line 185, in init1
      File "/boto/s3/", line 500, in create_bucket
    S3ResponseError: S3ResponseError: 400 Bad Request
    <?xml version="1.0" encoding="UTF-8"?>
    <Message>The specified location-constraint is not valid</Message>

  Now HB maps Glacier locations to corresponding S3 locations.

#1037 - May 30, 2013 - beta expires Sep 15, 2013

- when dest clear is used to delete files from a destination, delete
  the DESTID file too

- a bug fix in #1035 caused the upgrade command to fail with a traceback:

    # hb upgrade
    HashBackup build 1035 Copyright 2009-2013 HashBackup, LLC
    Traceback (most recent call last):
      File "/", line 51, in <module>
      File "/", line 556, in confdir
    misc.err: Backup directory doesn't exist, use hb init command: /root/hashbackup

  The upgrade command cannot be audited since it is not associated
  with a backup directory.  To work around this problem, create a
  ~/hashbackup backup directory if you don't already have one, as in
  this example, first showing a failed upgrade, then success:

    [jim@mb ~]$ hb upgrade
    HashBackup build 1035 Copyright 2009-2013 HashBackup, LLC
    Traceback (most recent call last):
      File "/", line 51, in <module>
      File "/", line 556, in confdir
    misc.err: Backup directory doesn't exist, use hb init command: /Users/jim/hashbackup

    [jim@mb ~]$ hb init -c ~/hashbackup
    HashBackup build 1035 Copyright 2009-2013 HashBackup, LLC
    Backup directory: /Users/jim/hashbackup
    Permissions set for owner access only
    Created key file /Users/jim/hashbackup/key.conf
    Key file set to read-only
    Setting include/exclude defaults: /Users/jim/hashbackup/inex.conf

    VERY IMPORTANT: your backup is encrypted and can only be accessed with
    the encryption key, stored in the file:
    You MUST make copies of this file and store them in a secure location,
    separate from your computer and backup data.  If your hard drive fails,
    you will need this key to restore your files.  If you setup any
    remote destinations in dest.conf, that file should be copied too.

    Backup directory initialized

    [jim@mb ~]$ hb upgrade
    HashBackup build 1035 Copyright 2009-2013 HashBackup, LLC
    You already have the latest version

#1035 - May 17, 2013 - beta expires Sep 15, 2013


* disable destinations to prevent overwriting an unrelated backup
* new dest setid command to "marry" a destination to a backup
* bug fix: audit not working if -c wasn't used on command line


- a new file, DESTID, is stored on each destination to prevent
  accidentally overwriting a remote backup area.  If this file does
  not match the backup, an error message is displayed and the
  destination is disabled (for this run of HB):

    dest s3: destination ID mismatch - you may be overwriting another
    backup! Verify destination is correct; use hb dest setid s3 to
    disable this warning.

  This error will not occur during normal operation, so if you see it,
  pay close attention to make sure you aren't overwriting a backup.
  The error will occur for example if:

  -- you do a backup with dest.conf, delete the local backup
     directory, and do another backup using the same dest.conf

  -- you configure 2 destinations with different names, but they both
     point to the same remote storage area

  -- you configure 2 different backups, maybe on different machines,
     to point to the same remote storage area

  HB may be slightly slower to start because the remote DESTID file
  has to be checked for every destination.  An error message might be
  displayed on your first backup since DESTID will not exist yet.

- a new dest subcommand, setid, sets DESTID on one or more remote
  destinations to match the current backup.  Before doing this, make
  sure you are not overwriting an active backup.

- audit was not working when -c wasn't used, for example, when the
  HASHBACKUP_DIR environment (shell) variable was set or one of the
  default backup directories /var/hashbackup or ~/hashbackup was used.

- dest: display unrecognized subcommand in error message

#1027 - May 10, 2013 - beta expires Sep 15, 2013


* get merges restores into existing directories rather than replacing them
* new get option --delete deletes existing files not restored (like before)
* bug fix: dest clear


- get: a new option, --delete, deletes existing files in restored
  directories that are not in the backup.  This is similar to rsync's
  --delete option, allowing HB to "sync" a directory like rsync rather
  than just add to it.  With -v2 or higher (the default is -v2), the
  names of the deleted files and directories are printed.

- get: previously, get restored into a temporary file or directory,
  deleted the original file if it existed, and renamed the temp file.
  When restoring directories, often it makes more sense to merge the
  restored directory with an existing directory (that's what tar
  does).  Especially with ZFS, when restoring BSD jails, a user's home
  directory might contain several different filesystems.

  Now, get restores directly to the target file or directory.  If the
  target already exists and is a directory, get will merge the
  restored contents with the existing contents, overwriting any
  existing files.
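
  The merge behavior, with and without --delete, can be modeled on a
  single directory using dicts.  This is a simplified sketch with
  hypothetical names; real restores also deal with nested
  directories, permissions, and so on.

```python
def restore(existing, restored, delete=False):
    """Model a restore into an existing directory.

    existing, restored: dicts of filename -> contents
    Returns (result, deleted): the directory contents afterwards and
    the names removed because they were not in the backup.
    """
    result = dict(existing)
    result.update(restored)  # restored files overwrite existing ones
    deleted = []
    if delete:
        # Like rsync's --delete: drop anything not in the backup.
        deleted = sorted(name for name in existing if name not in restored)
        for name in deleted:
            del result[name]
    return result, deleted
```

  Without delete, a file that exists only in the target directory
  survives the restore; with delete, it is removed and reported.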

- dest: clear command could cause a traceback:
    Traceback (most recent call last):
      File "/", line 99, in <module>
      File "/", line 162, in main
    TypeError: cannot concatenate 'str' and 'int' objects

#1020 - May 8, 2013 - beta expires Sep 15, 2013


* dest erase command is replaced by new dest unload command
* new dest clear command removes files from a destination
* new dest sync command syncs local backup to all remotes


- the dest subcommand erase has been eliminated.  To move dest.conf
  from the database to a text file, use the new unload subcommand (see
  next change)

- a new dest subcommand, unload, writes the dest.conf stored in the
  database to a text file and removes it from the database.  If no
  pathname is given, the file is written to dest.conf in the backup
  directory.  If the output file exists, HB prompts whether it's okay
  to remove it.  dest load and dest unload are now opposites.

- a new dest subcommand, clear, removes all files from destinations
  specified, usually because you are no longer using that destination
  and are going to remove it from dest.conf after deleting files
  stored there.  This command will not remove a file if it is the only
  copy available, ie, there is no local copy and no other remote copy.

- a new dest subcommand, sync, ensures that all remote destinations
  are in sync with the local backup directory.  This always happens at
  the start of each backup, but this dest command can force it at
  other times.  Previously, backing up a small or dummy file like
  /dev/null would force a sync.

- a new dest subcommand, ls, displays a listing of all files stored at
  each destination

#1019 - May 2, 2013 - beta expires Sep 15, 2013


* recover bug fix


- a change in #1015 was for destination threads to shut down before the
  main program.  The purpose of this was to avoid a race condition
  where a thread woke up while the main program was dying.  This is a
  hard condition to duplicate.  The symptom is that when the main
  program is exiting, something like this is displayed:

    No errors
    Unhandled exception in thread started by
    Error in sys.excepthook:

    Original exception was:
    Unhandled exception in thread started by
    Error in sys.excepthook:

    Original exception was:

  Since this just happened to me with #1018, the fix in #1015 didn't
  work.  And, the #1015 fix also made recover not fetch arc files,
  which obviously isn't a good thing.  This release backs out #1015
  and fixes the recover bug.

#1018 - May 1, 2013 - beta expires Sep 15, 2013


* get --todev accepts symlinks for block device targets
* get accepts symlinks for block devices to restore


- get: on Linux, disk partitions are often symbolic links to the
  actual block device.  When the symbolic link name is used for
  backup, HB also saves the actual block device contents as a separate
  path.  With get's --todev option, the symbolic link could not be
  used since it is not a block device.  Now, get will accept a
  symbolic link with --todev if the symlink is pointing to a block
  device.  A warning message is displayed showing the symlink target.
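
  Checking whether a name ultimately refers to a block device takes
  two steps: resolve the symlink, then test the file type.  A sketch
  of that check (resolve_block_device is a hypothetical name, not
  HB's code):

```python
import os
import stat


def resolve_block_device(path):
    """Return the resolved path if it is a block device, else None."""
    target = os.path.realpath(path)  # follows symlinks
    try:
        mode = os.stat(target).st_mode
    except OSError:
        return None  # missing or inaccessible target
    return target if stat.S_ISBLK(mode) else None
```

  Returning the resolved target makes it easy to show the user which
  device a symlink points to before writing to it.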

- get: related to the above, if a symlink name is used on the get
  command line, and at backup time, this symlink pointed to a raw
  device, HB will restore the raw device.  If the symlink does not
  already exist or has a different value than it did at the time of
  the backup, HB will only restore the symlink.  If the get is
  repeated, HB will restore the block device.  A warning message is
  displayed about restoring a block device instead of the symlink.

#1012 - April 24, 2013 - beta expires Sep 15, 2013


* temporarily disable closing of idle ftp connections


- temporarily disabled ftp idle connection handling, because it
  sometimes causes spurious Python errors when the backup program
  stops.  Backups are fine; the errors are caused by ftp idle timers
  running while the main program is dying.

#1011 - April 21, 2013 - beta expires Sep 15, 2013


* new ftp destination keyword "restart"
* ftp restarts downloads too
* ftp keyword Dir is optional
* ftp enables keepalives
* new ftp destination keyword "idle"
* bug fixes


- ftp destinations have a new True/False keyword, "restart".  If not
  present, the default is True and ftp will try to restart failed
  uploads and downloads.  If the restart has trouble, HB will print an
  error message like:

     size mismatch after restart: file is 4976624 bytes, restarted at
     1048576 bytes, uploaded file is 3928048 bytes; disabling restart

  If you see this message, restarts should probably be disabled by
  adding "restart False" to your dest.conf.  Use debug 1 to see the
  ftp conversation to help troubleshoot restart problems.
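
  In the sample message, 4976624 - 1048576 = 3928048, so the uploaded
  file is exactly the part sent after the restart point.  One
  plausible reading, not confirmed by HB, is that the server stored
  the resumed bytes as a new file instead of appending at the offset.
  A sketch of that arithmetic, with hypothetical names:

```python
def diagnose_restart(local_size, restart_offset, remote_size):
    """Classify the remote file size seen after a restarted upload."""
    if remote_size == local_size:
        return "ok"
    if remote_size == local_size - restart_offset:
        # Only the bytes sent after the restart point were kept.
        return "tail only"
    return "mismatch"


print(diagnose_restart(4976624, 1048576, 3928048))
```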

- ftp can restart downloads as well as uploads with most ftp servers

- ftp: if the Dir keyword is present, a cd occurs to this directory.
  Otherwise, backups are sent to the initial login directory.

- ftp keepalive: routers, firewalls, VMs, and other intermediate
  devices sometimes drop the control (command) connection while a long
  file transfer is in progress.  Then after the file transfer
  completes, a timeout occurs and HB thinks the file was not sent.
  Then a restart (usually) occurs to complete the file transfer.
  Setting keepalive may help prevent this, though whether it works
  depends on your operating system's keepalive settings (how often
  keepalive packets are sent) and how long it takes the intermediate
  device to timeout the connection.

- ftp destinations try to keep connections to a server open for a
  while after each operation to save making another connection.  The
  idle keyword specifies how long in seconds the connection can be
  idle before HB closes it.  The default is 15 seconds.

- ftp: restart didn't work with the bsd ftp server

- if multiple destinations were set up and an old file needed to be
  sent to n destinations, it was sent n times to each destination

#1001 - April 16, 2013 - beta expires Sep 15, 2013


* bug fixes


- restart did not work on some ftp servers where it could have worked;
  now it does

- restart is temporarily disabled until it can be tested on more ftp
  servers

#1000 - April 16, 2013 - beta expires Sep 15, 2013


* ls shows /dev symlinks
* ftp restarts uploads
* dest keyword "off"


- on Linux, block devices (logical volumes) are often symbolic links
  that are user-defined names for a "real" block device.  When these
  symlinks are used on the backup command line, for example,
  /dev/mylv, the symlink is saved and also the actual block device,
  eg, /dev/dm-1.  When backups occur for several logical volumes, an
  ls listing becomes confusing because it's hard to tell which symlink
  goes with which actual device.  Using ls -l shows the symlink
  target.  In this release, the symlink target is always displayed if
  the pathname starts with /dev/ (and the user has cd access to the
  directory; this is a permission requirement to display any -l info).

- ftp destinations now restart uploads if the ftp server software
  implements the REST and SIZE commands; most ftp servers implement
  these.  Before restarting an upload, HB verifies that the first 1K
  of the local and remote files are equal, the file size must be 100K
  or greater, and the partial upload must be 100K or greater.
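
  The restart checks above can be sketched roughly as follows.  This
  is only an illustration, not HB's code: the function name and its
  arguments are hypothetical, and "1K" / "100K" are assumed to mean
  1024 and 102400 bytes.

```python
MIN_SIZE = 100 * 1024   # "100K" in the note; assumed to mean 100 KiB
HEAD_SIZE = 1024        # "first 1K"; assumed to mean 1024 bytes


def can_restart_upload(local_size, remote_size, local_head, remote_head):
    """Hypothetical helper: True if a partial upload may be resumed.

    local_head / remote_head are the leading bytes of each file.
    """
    if local_size < MIN_SIZE:
        return False    # the whole file is under 100K
    if remote_size < MIN_SIZE:
        return False    # the partial upload is under 100K
    # the first 1K of the local and remote files must be equal
    return local_head[:HEAD_SIZE] == remote_head[:HEAD_SIZE]
```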

- a new destination keyword, "off", means to disable the destination.
  It can be re-enabled by deleting the "off" line, or changing it to
  "#off", which is a comment.  This is useful for testing and
  travelling, when you may want to temporarily disable a destination.
  It's easier than commenting out all the lines describing a
  destination.  A disabled destination prints a warning message.

#997 - April 13, 2013 - beta expires Sep 15, 2013


* can't repeat keywords in a destination
* new ftps destination (more secure)
* ftp debug keyword
* multiple transactions on ftp connections
* bug fix


- INCOMPATIBILITY NOTE: in previous releases, it was okay to repeat a
  keyword in a destination; the last keyword was used.  Now, it is a
  fatal error to use the same keyword more than once in a destination.
  Repeating a keyword can cause subtle errors, ie, you think you are
  sending files one place, but actually, they are going somewhere else
  because of a repeated Host keyword.  You can leave repeated keywords
  in the dest.conf file, with all but one commented out with a # mark.
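
  A minimal sketch of this kind of check, assuming a destination is a
  list of "keyword value" lines, # starts a comment, and keywords are
  compared case-insensitively.  parse_destination is a made-up name;
  HB's real parser may differ.

```python
def parse_destination(lines):
    """Hypothetical parser for one destination's keyword lines."""
    settings = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                    # blank lines and comments are skipped
        keyword, _, value = line.partition(" ")
        key = keyword.lower()           # assumption: case-insensitive keywords
        if key in settings:
            # repeating a keyword is treated as a fatal error
            raise ValueError("keyword used more than once: " + keyword)
        settings[key] = value.strip()
    return settings
```

  With this sketch, a commented-out "#host old.example.com" line is
  ignored, while a second active Host line raises an error.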

- a new destination type, ftps, supports FTP-TLS (Transport Layer
  Security), also called FTP over SSL.  This is not FTP over an ssh
  tunnel, also called Secure FTP.  And it's not sftp, which is a
  separate file transfer protocol built in to ssh.  The names are a
  bit confusing...

  FTP uses a control connection to send authentication and commands to
  the remote FTP server.  A data connection is opened during file
  transfers.  HB's ftps destination uses SSL for the command
  connection so your userid, password, and commands are encrypted
  while talking with the FTP server.  HB does not use SSL for the data
  connection since the backup files being transferred are already
  encrypted.

  The ftp destination type (without the s) is still available for
  internal servers, or servers that do not support TLS.  If the ftps
  type is used with an FTP server that doesn't support TLS, this
  message is displayed and the destination halts:

    dest ftp: unable to start: [error_perm] 500 AUTH TLS: command not understood.

  Your backup still runs and other destinations are unaffected when
  one destination halts or doesn't start.

- ftp: the debug keyword with a positive number displays the FTP
  conversation.  Higher numbers display more output, but usually debug
  1 shows enough.

- ftp: rather than making a new connection for each file, the ftp
  connection is left open and reused, reconnecting only when
  necessary

- when maxsize is used on a destination to limit file sizes, the
  permissions on the files created on the remote were rwx-r-xr-x, but
  should have been rw-r--r--.

#988 - April 7, 2013 - beta expires Sep 15, 2013


* S3 supports multipart uploads
* add fsyncs for XFS zero-length file bug
* imap uses 50% less memory
* bug fix


- S3 destinations have a new 'multipart' keyword, True or False; no
  value means True.  Multipart uploads are enabled by default for
  Amazon S3 and Dreamhost Dreamobjects.  With multipart, instead of 3
  workers sending 3 different files to S3, they all work on the same
  file in parallel.  This helps minimize cache stalls during backup
  when cache-size-limit is used, and tests show more consistent and
  predictable upload rates to S3.

- added fsync calls in a few places to prevent XFS creating
  zero-length files if the system crashes
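
  For readers unfamiliar with the technique, the usual pattern looks
  something like this.  It is a generic sketch with a made-up function
  name, not a copy of what HB does internally.

```python
import os


def write_durably(path, data):
    """Hypothetical helper: write data and force it to stable storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()               # push Python's buffer to the OS
        os.fsync(f.fileno())    # ask the OS to push the file data to disk
```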

- imap: reduced memory usage 50%, though imap destinations still
  require twice the file size during send or receive to encode the
  file.  Use a low number of workers for imap as they will use a lot
  of memory.  In tests, 2 imap workers upload as fast as
  6 workers, and 2.5x faster than just 1 worker can upload.

- if an archive was packed but dest.db could not be written because of
  an unusual permission problem, rm / retain would stop with an error
  (correct).  But when the permission problem was corrected, the
  packed archive was not sent to destinations during the next
  backup (incorrect).  This is fixed.

#971 - March 31, 2013 - beta expires Sep 15, 2013


* bug fixes


- backup: sync could fail with an error "No destinations in dest.conf
  contain arc.x.x", even though a destination did have the file.  To
  trigger this: set up multiple destinations, back up to them, delete
  one of the destinations, add a new destination, then do a backup,
  which causes a sync and triggers the error.

- selftest: with small backups that are in memory, ie, don't have to
  read from disk, a race condition could cause this traceback:
      File "/", line 144, in <module>
      File "/", line 682, in main
      File "/", line 201, in checkallblocks
    NameError: global name 'shaq_seq' is not defined

#961 - March 29, 2013 - beta expires Sep 15, 2013


* bug fixes


- destination initialization has changed.  Some initialization, like
  checking a hostname with DNS, was being done in every worker thread
  instead of just once, and a fatal error would occur in every worker
  instead of just one

- clear: could cause a traceback when destinations are configured
  because of a race condition while deleting dest.db:

    File "/", line 231, in loop
    File "/", line 408, in rmcmd
    File "/", line 161, in getinfo
    File "/", line 214, in __init__
    File "/", line 230, in opendb
  OSError: [Errno 2] No such file or directory: 'dest.db'

#956 - March 28, 2013 - beta expires Sep 15, 2013


* mount runs in foreground by default
* bug fixes


- recover: recent improvements in the Cloud Files driver caused
  recover to fail with this traceback:
    Traceback (most recent call last):
      File "/", line 626, in <module>
      File "/", line 279, in main
      File "/", line 107, in getfile
    AttributeError: cf instance has no attribute 'container'

- mount: because of ongoing issues with mount putting itself in the
  background, mount now runs in the foreground by default and the
  --debug option has been removed.  To run mount in the background,
  use this to suppress all output:

     $ hb mount -c backupdir mnt >/dev/null 2>&1 &

  or this if you want output in mount.out and errors in mount.err:

     $ hb mount -c backupdir mnt >mount.out 2>mount.err &

  or this if you want output and errors in mount.out:

     $ hb mount -c backupdir mnt >mount.out 2>&1 &

#954 - March 27, 2013 - beta expires Sep 15, 2013

- display file size, transfer time, and transfer rate after files are
  copied to a destination

- Rackspace Cloud Files: new destination keyword 'servicenet', True or
  False, accesses Cloud Files over the local Rackspace network.
  (faster, no download charges)

- Rackspace Cloud Files: random BadStatusLine errors and/or Broken
  pipe errors are less likely

#947 - March 26, 2013 - beta expires Sep 15, 2013

- added an optional timeout destination keyword.  The default value is
  1800, or 30 minutes.

#946 - March 26, 2013 - beta expires Sep 15, 2013

- Cloud Files, OpenStack: HB was requesting a 30 second timeout, but
  the Python Cloud Files library was not passing this correctly and the
  authentication timeout was actually 5 seconds.  If the Rackspace
  authentication servers got busy or your connection was busy doing
  other things, this 5 second timeout could easily be exceeded and
  authentication would fail, leading to unnecessary retries.

- Cloud Files, OpenStack: when certain errors occur, such as a
  timeout, hb was reusing the socket in the retry loop instead of
  opening a new one.  It could be argued that the Cloud Files library
  should clean up when a socket error occurs after an HTTP request.
  Since it doesn't, the hb retry loop was not working for these types
  of errors.  The traceback would show CannotSendRequest instead of
  the real error, which was a timeout.

#941 - March 24, 2013 - beta expires Sep 15, 2013

- Rackspace Cloud Files (destination type 'cf'): add 'location'
  keyword, with values of either us or uk.  If not specified, the
  default is us.  This is Rackspace specific, only applies when the
  destination type is 'cf', and does not apply to other OpenStack
  object stores.

- OpenStack (destination type 'os'): add REQUIRED 'authurl' keyword
  to specify the authentication endpoint to authenticate with
  non-Rackspace OpenStack object stores.  The version 1.0 API is
  used, so the url looks like (these are RackSpace's endpoints):''

#938 - March 22, 2013 - beta expires Sep 15, 2013

- compare: add -X option to cross mount points, like backup

#937 - March 20, 2013 - beta expires Sep 15, 2013


* new -v retain option
* bug fixes
* note that clear command resets config options


- retain has a new option, -v, to display the files being deleted.  It
  shows the file backup time, filename, version, and retain option
  that caused the file to be deleted.  For example:

    2012-10-22 02:30:12 /.DS_Store [r468] -s
    2013-03-13 00:39:11 /private/var/log/asl/2013.03.12.U0.G80.asl [r589] -x

- recover: if dest.db couldn't be fetched, recover was giving this
  traceback instead of the reason dest.db couldn't be fetched:
    Traceback (most recent call last):
      File "/", line 124, in <module>
      File "/", line 282, in main
    NameError: global name 'destdb' is not defined

- Dir destinations: when recovering files, dir destinations try to use
  symbolic links because they are much faster than copying files.  But
  some filesystems don't support symlinks and a traceback occurred:

    Traceback (most recent call last):
      File "/", line 124, in <module>
      File "/", line 279, in main
      File "/", line 36, in getfile
    OSError: [Errno 95] Operation not supported

  Now, recover (and get, selftest and mount if cache-size-limit is
  set) will copy the file when symlink fails.
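
  The fallback can be pictured like this.  It is a sketch under the
  assumption that any OSError from symlink should fall back to a
  copy; link_or_copy is a hypothetical name.

```python
import os
import shutil


def link_or_copy(src, dst):
    """Hypothetical helper: symlink dst to src, or copy if that fails."""
    try:
        os.symlink(src, dst)
        return "symlink"
    except OSError:
        # e.g. [Errno 95] Operation not supported on this filesystem
        shutil.copyfile(src, dst)
        return "copy"
```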

- Amazon Glacier uploads failed with the error:
     dest glac: error #1 of 3 in send arc.18.1: 'Layer2' object has no attribute 'close'

- backup: if cache-size-limit was set, this traceback could occur:
    Traceback (most recent call last):
      File "/", line 75, in <module>
      File "/", line 1973, in main
      File "/", line 684, in sync
      File "/", line 376, in initcache
    UnboundLocalError: local variable 'arcbytes' referenced before assignment

- get: if cache-size-limit was set, a directory was being restored,
  and -r was used to restore an older version, this traceback could
  occur:
    Traceback (most recent call last):
      File "/", line 103, in <module>
      File "/", line 1087, in main
      File "/", line 868, in plan
      File "/", line 922, in prefetch
    TypeError: not all arguments converted during string formatting

- when an error occurred with a capitalized command line option, eg,
  -D with no size, the error message would be:
      Argument -d: expected one argument
  instead of:
      Argument -D: expected one argument

- mount: when cache-size-limit was set, mount was run in the
  background (without --debug), backup file data was referenced
  through the mount, and a remote archive had to be downloaded, the
  background hb mount process could die and the file access would then
  fail with an input/output error

- when cache-size-limit was set to a small number, like 3, it means 3x
  the arc-size-limit.  But the limit was actually higher because a 4MB
  fudge factor was added.  For large archives, this doesn't matter,
  but for small caches and smallish archives, it is more noticeable so
  the fudge factor has been removed.

- when VMWare shared folders are used as the -c backup directory, the
  timestamps (mtime) of files hb creates in the backup directory can
  change during a transfer to a remote destination.  This isn't
  supposed to happen - it's some weirdness in VMWare's hgfs - and hb
  complains about it:

    dest rsync: stopping because of errors
    dest rsync: Traceback (most recent call last):
      File "/", line 198, in loop
      File "/", line 281, in sendcmd
    Exception: file changed during transfer: 1363391280.0 != 1363391281.23

  Now this comparison isn't done at all.  Instead, inode number and
  size are compared before and after the transfer.  This is to catch
  the odd case of someone sending the backup on top of itself, which
  has actually happened by mistake.
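
  Roughly, the new check compares a file's identity before and after
  the transfer.  The names below are hypothetical and the transfer
  argument stands in for whatever actually sends the file.

```python
import os


def file_identity(path):
    """Return (inode number, size); mtime is deliberately ignored."""
    st = os.stat(path)
    return (st.st_ino, st.st_size)


def send_checked(path, transfer):
    """Hypothetical wrapper: run transfer(path), then verify the file."""
    before = file_identity(path)
    transfer(path)              # assumed callable that performs the upload
    after = file_identity(path)
    if before != after:
        raise RuntimeError(
            "file changed during transfer: %r != %r" % (before, after))
```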

- clear: this command resets all config options to default values.  It
  has always done this, but now a note is printed.  The clear command
  should be replaced by init.  To just remove all backup data, you can
  use rm /, though this is much slower than clear.

#922 - March 7, 2013 - beta expires June 15, 2013


* rm is 2-3x faster
* bug fixes


- rm is 2-3x faster, especially with large files with a lot of dedup
  blocks.  This also speeds up retain when it is removing old
  versions of files.

- on 32-bit systems, stats could fail with a traceback:
    Traceback (most recent call last):
      File "/", line 156, in <module>
      File "/", line 106, in main
    OverflowError: Python int too large to convert to C long

- if the hash.db file didn't exist, stats would fail with a traceback:
    Traceback (most recent call last):
      File "/", line 156, in <module>
      File "/", line 187, in main
    TypeError: 'NoneType' object is not iterable

#919 - March 6, 2013 - beta expires June 15, 2013


- release #918 had a backup bug that could cause warnings about files
  changing size during the backup.  This warning is misleading: what
  actually happened is that the file wasn't backed up correctly
  because of the new feature in #918 where variable-block dedup was
  enabled for single-core backups.  This warning only occurred on
  multi-core systems.

  The backup for these files with warnings is no good, and will also
  cause selftest errors on these files.  Because it is a thread timing
  / coordination problem, you may or may not have errors.  One test
  system had the problem, another did not.  It's recommended to
  completely remove any backups made with #918 with hb rm:

  $ hb rm -c backupdir -r640 --force

  Your -r version number(s) will be different of course.  Use hb
  versions to see which backups were created with #918 and remove all
  of them.  Your next backup with #919 will re-save these files

#918 - March 5, 2013 - beta expires June 15, 2013


* backup -p0 supports variable block dedup
* get & selftest do 4-hour delay with Amazon Glacier
* recover changes for Amazon Glacier
* bug fixes


- backup with -p0 or on a single CPU system runs in a single process
  (thread).  -p0 can be useful on multi-CPU systems to restrict hb's
  system load.  Previously, running in a single process also disabled
  variable-block dedup.  Now backup supports VB dedup with single CPUs
  and multiple CPUs.  If you had been running with -p0 or on a single
  CPU system, your next backup may be larger because files that had
  been saved with a fixed block size will be saved with a variable
  block size.  Because of the block size change, files will not dedup
  well on this next backup, but will after that.  To prevent this
  dedup gap, use -p0 -B1m, to disable variable-block dedup.

  NOTE: variable-block dedup with -p0 (or on a single CPU) usually
  takes twice as long as fixed-block dedup.  Add -B1m to disable
  variable-block dedup with a single CPU if you are more concerned
  about performance than VB dedup.

- Amazon Glacier requires two fetches separated by 4 hours to retrieve
  files.  HB's recover command was doing this, but get and selftest
  were not: if your backup data was only located on Glacier, then
  restoring a few files required an hb get (which failed with an
  error), waiting 4 hours, then retrying the get (which would work
  this time).

  In this release, if you are using Glacier as your first destination
  and running with cache-size-limit set so that your archive files are
  mostly remote, hb get will figure out which archives it needs to do
  your restore, request them all once, delay 4 hours, then start your
  restore, requesting archives again as they are needed.

  IMPORTANT: the HB recover command has lots of options for doing
  paced retrievals, but get and selftest do not have those yet;
  everything is retrieved in 4 hours.  This can be very expensive for
  very large restores, so if you need to restore everything, it may be
  cheaper to set cache-size-limit to -1, use recover to fetch all of
  your archives from Glacier to your backup directory with whatever
  pacing you need to meet your cost objectives, then do your get
  command to restore from the local backup directory.

- (Amazon Glacier) HB has two normal cache configurations:
    1. cache-size-limit is -1: local copy of all archives
    2. cache-size-limit >= 0: some archives may be on remotes
  A special case of #1 is when an archive file is missing, ie, it was
  manually deleted.  HB will download the missing archive when it is
  needed.  For Amazon Glacier, this requires two retrievals, separated
  by 4 hours.  That wasn't happening, but now it is.  Please note,
  this is a very inefficient way to run HB with Glacier, because every
  individual archive needed will require a 4 hour delay.  It's much
  better to set a cache-size-limit; then HB will request all archives
  needed for a restore, wait 4 hours, then request the archives again
  and do the restore.

- the recover --dl option did accept KB/s and Kb/s for bytes per
  second and bits per second (upper or lower case B), but expected the
  K, M, or G prefix to be uppercase and the /s, /m, /h suffix to be
  lower case.  This is too confusing, so now, the --dl option is
  changed to all lowercase, and it's no longer possible to specify a
  rate in bits per second.

- if --dl 1 was used (rate of 1 byte per second), recover would loop
  forever because there wasn't enough bandwidth to retrieve the
  largest arc file.  Now it will display a message and ignore very low
  rates since they are equivalent to Option 4, the "cheap" download
  option.  A separate change triggers a fatal error if this looping
  situation occurs in the future.

- a bug was introduced with the new maxsize destination keyword that
  would generate a traceback similar to this:

    Getting arc.0.7 from drobo-vaio-rsync
    Unable to download archive arc.0.7: AttributeError("rsync instance has no attribute 'getparts'",)
    Traceback (most recent call last):
    File "/", line 75, in <module>
    File "/", line 1941, in main
    File "/", line 779, in sync
    File "/", line 408, in condput
    File "/", line 141, in __init__
    File "/", line 174, in _open
    File "/", line 676, in fetch
    NoArchive: Archive does not exist: /root/vaio-drobo/arc.0.7

  This could happen in several circumstances:

  1. cache-size-limit is >= 0, do backups, set cache-size-limit to -1
  2. cache-size-limit is >= 0, add a new destination, do backup
  3. cache-size-limit is -1, arc files deleted manually

#909 - February 27, 2013 - beta expires June 15, 2013


* new backup -X option to cross mount points
* stats speed up
* --no-compress backup option removed
* bug fixes


- backup: a new option -X means to cross mount points, also called
  descending into other filesystems.  The default is not to cross
  mount points, as before.  Be careful, because -X does not
  discriminate between local filesystems and remote filesystems, so
  you can end up saving an entire NFS server for example.
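
  A common way to detect a mount point is to compare device numbers,
  as find -xdev does.  The sketch below assumes that approach; it is
  not taken from HB, and walk_one_filesystem is a made-up name.

```python
import os


def walk_one_filesystem(top, cross_mounts=False):
    """Yield directories under top; skip other filesystems by default."""
    top_dev = os.lstat(top).st_dev
    for dirpath, dirnames, _ in os.walk(top):
        if not cross_mounts:
            # a subdirectory on a different device is a mount point,
            # so it is pruned unless crossing was requested
            dirnames[:] = [
                d for d in dirnames
                if os.lstat(os.path.join(dirpath, d)).st_dev == top_dev
            ]
        yield dirpath
```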

- the stats command is much faster when there are millions of blocks.
  A backup with 11 million unique blocks was taking 5 minutes for
  stats, but now takes around 2.5 minutes.

- the --no-compress backup option was made obsolete when -Z0 was
  introduced in April 2012.  --no-compress has been removed in this
  release.

- the mount command would often give bad address errors when accessing
  files.  This bug was introduced a couple of weeks ago, in #871.

#902 - February 23, 2013 - beta expires June 15, 2013


* new destination keyword 'maxsize'
* new destination keyword 'randfail'
* new stats added, new stat -v option
* destination sync code is 50x faster
* 65% speedup for small archive files
* stats is 2x faster if many versions
* important Glacier bug fix: regions
* bug fixes


- a new keyword for destinations, 'maxsize', can be used to limit the
  size of files uploaded to a destination.  For example, many imap
  servers have small limits like 25MB per file.  While the
  arc-size-limit config keyword can be used to limit the size of
  archive files, there is no way to limit the size of the hb.db.n
  files.  Now, using maxsize, a destination can impose a hard size
  limit.  If a file to be uploaded is larger than maxsize, hb will
  split the file into pieces, each maxsize in length (except the last
  piece), and upload the pieces separately.  Pieces have a .pN suffix
  added.  When the file is retrieved from the destination, each piece
  is retrieved separately and the original file is reassembled.
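
  The splitting arithmetic can be sketched as below.  piece_plan is a
  hypothetical name, and numbering the pieces from 1 is an assumption;
  the note only says that a .pN suffix is added.

```python
def piece_plan(filename, filesize, maxsize):
    """Return [(name, length), ...] describing how a file is uploaded."""
    if filesize <= maxsize:
        return [(filename, filesize)]   # not larger than maxsize: sent whole
    pieces = []
    offset = 0
    n = 1                               # assumption: numbering starts at 1
    while offset < filesize:
        length = min(maxsize, filesize - offset)    # last piece may be short
        pieces.append(("%s.p%d" % (filename, n), length))
        offset += length
        n += 1
    return pieces
```

  For example, a 60-byte file with maxsize 25 would be planned as
  pieces of 25, 25, and 10 bytes.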

  Maxsize should be at least 1MB larger than arc-size-limit.  Arc
  files can exceed arc-size-limit by up to 1 block (backup's -B
  parameter).  If you use -B4M for 4MB blocks, Maxsize should be 4MB
  larger than arc-size-limit.  If maxsize is equal to arc-size-limit,
  things will still work, but an archive slightly over arc-size-limit
  will have to be split on upload.  This causes more I/O during
  upload, creates more files on the remote, and the 2nd piece is less
  than a block size, which is inefficient.

  Remote error recovery has been rewritten so that each piece will be
  retried according to the destination's retry settings.  But if one
  piece exceeds the error retries, the entire file will have to be
  sent again.

- a new destination keyword, randfail, can be used to simulate remote
  failures.  The value is an integer 0-100 representing the percentage
  of requests that should fail.  So 25 means 1 out of 4 requests will
  fail, 50 means 1 of 2 will fail, 75 means 3 of 4 will fail, 100
  means every request will fail.  Simulated failures do not generate
  any remote traffic.  Destination threads will stop when all requests
  fail for one file.  Randfail is for testing hb's error recovery and
  of course should not be used in normal operation (DUH!)
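
  One simple way to simulate a failure percentage is shown here.  This
  is an illustrative sketch only; should_fail is a made-up name and
  HB's actual method is not documented in these notes.

```python
import random


def should_fail(randfail, rng=random):
    """Return True for roughly randfail percent of calls (0-100)."""
    if not 0 <= randfail <= 100:
        raise ValueError("randfail must be an integer from 0 to 100")
    # rng.random() is in [0, 1), so 0 never fails and 100 always fails
    return rng.random() * 100 < randfail
```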

- some new statistics have been added to the stats command,
  specifically the "industry standard dedup ratio" used by many other
  backup programs.  This ratio is computed as sum(bytesin) /
  sum(bytesout) and assumes that every backup was a full backup.  I
  didn't say it made sense...  Right now, HB has to compute these
  figures and it takes a while, but soon they will be recorded by the
  backup program so the stats command will not have to do so much
  work.  It takes around 5 minutes for 700K files, so could take quite
  a while if there are many millions of files in the backup.
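
  As a worked example of the formula, assuming bytesin is the size of
  the data presented to a backup and bytesout is what that backup
  actually stored (the function name and argument layout are
  hypothetical):

```python
def industry_dedup_ratio(backups):
    """backups: list of (bytesin, bytesout) pairs, one per backup."""
    backups = list(backups)
    total_in = sum(b[0] for b in backups)
    total_out = sum(b[1] for b in backups)
    if total_out == 0:
        return None     # nothing stored: the ratio is undefined
    return total_in / total_out


# two backups of the same 100 bytes; the second stores only 5 new bytes
ratio = industry_dedup_ratio([(100, 50), (100, 5)])    # 200 / 55
```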

- a new option -v has been added to the stats command.  After each
  line of statistics, a paragraph describing (to the best of my
  ability!) how the statistic is generated, what it means, and why it
  might be useful.

- in testing with 10,000 archive files, the backup sync procedure was
  taking 50 seconds to figure out which archives needed to be copied
  to remotes, even if none did.  Now it takes less than 1 second.
  This was a potential scalability issue for sites using smaller
  archive files (config variable arc-size-limit).

- archive files default to 1GB (config variable arc-size-limit), but
  some sites want to use smaller archives, either because their
  storage system requires it or because it allows HB to manage deleted
  files better.  With smaller archives, HB can delete entire archives
  more often when rm or retain delete files from the backup, rather
  than going through a download, pack, upload cycle.  But smaller
  archives also have more overhead and can slow the backup down.
  Changes in this release reduce this overhead.  A test backup that
  creates 5000 small arc files is now 65% faster: 40 seconds vs 114
  seconds.

- the stats command is 2x faster if a backup has many versions (mine
  has 600)

- with Amazon Glacier, the dest.conf has a location keyword to specify
  the region for Glacier storage.  This was not working, and all
  Glacier transfers were going to us-east-1, the default Glacier
  region.  Now, the location keyword is honored, and the region is
  also checked to make sure it's a valid Glacier region; currently
  there are fewer Glacier regions than for other AWS services.

  Related to this change, HB creates an associated S3 bucket to be
  used with Glacier.  This bucket contains the database files with
  backup metadata.  The actual backup data - the bulk of data - is
  stored in Glacier.  This S3 bucket was named:
  But, if your Glacier data is getting stored in us-west-1, you
  probably want your associated S3 bucket stored there too.  So now,
  the associated S3 bucket name for regions other than us-east-1 is:

- if a backup contained only empty directories, the new stats command
  would fail with a traceback

- if only dest.db or hb.db.n were being recovered from Glacier,
  recover had an unnecessary sleep after the message:
      Download size: 0 Files: 0

#877 - February 14, 2013 - beta expires June 15, 2013


* exclude directory if tag file is present
* new stats command to display backup statistics
* backup display stats with -v0


- a new config option, no-backup-tag, can be set to a list of
  filenames.  If a directory contains any of these files, the
  directory contents are not backed up.  The directory itself and the
  tag file are the only items backed up.  For example:

    $ hb config -c backupdir -no-backup-tag .nobackup,CACHEDIR.TAG
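
  The selection rule can be sketched as a directory walk.  This is an
  illustration with a hypothetical function name, not HB's scanner; if
  several tag files are present, it assumes all of them are saved.

```python
import os


def paths_to_back_up(top, tags):
    """Yield paths under top, skipping the contents of tagged dirs."""
    for dirpath, dirnames, filenames in os.walk(top):
        yield dirpath                       # the directory itself is saved
        present = [t for t in tags if t in filenames]
        if present:
            for t in present:
                yield os.path.join(dirpath, t)     # the tag file is saved
            dirnames[:] = []                # nothing else; do not descend
            continue
        for name in filenames:
            yield os.path.join(dirpath, name)
```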

- a new command, stats, displays statistics about a backup.  More
  stats will be added in the future, specifically about dedup ratios.

- the backup command now displays statistics after the backup, even
  with -v0.  Some sites with huge backups - millions of files - used
  -v0 to prevent displaying any pathnames, but this also suppressed
  the statistics.

#871 - February 11, 2013 - beta expires June 15, 2013


* new shell destination type
* hb runs better on hardened kernels
* less space is used on remotes for hb.db.n files
* bug fix: hb dest command was not recognized


- a new destination type, shell, allows customized programs and
  scripts to transfer files to and from remote destinations on behalf
  of hb.  In the doc subdirectory is an example shell
  implementation of the built-in Dir destination.  Excluding comments
  and blank lines, this script has only 26 Python statements: shell
  destinations can be quite easy to write.  There is a limitation that
  no state information generated by the remote, such as an object id,
  can be stored.

- on hardened Linux kernels (grsecurity), hb required paxctl -m or
  paxctl-ng -m to allow anonymous mmap with execute privilege.  This
  was only needed for the mount command, to load ctypes and libfuse
  libraries.  But all hb commands would fail with an error in the logs
  like:  denied RWX mmap of <anonymous mapping>, and a traceback like:

    $ hb init -c vinci
    Traceback (most recent call last):
      File "/", line 20, in <module>
      File "/", line 13, in <module>
      File "/", line 19, in <module>
      File "/", line 35, in <module>
      File "/", line 90, in <module>
      File "/", line 19, in <module>
      File "/", line 75, in <module>
      File "/", line 17, in <module>
    ValueError: bad marshal data (unknown type code)

  Now, only the mount command will fail on hardened kernels; all other
  hb commands will work normally, even without marking the hb binary.
  To use the mount command on hardened kernels, use paxctl-ng -m hb to
  allow anonymous mmap w/execute.

- after each backup, rm, or retain, hb.db.n is sent to remote
  destinations.  There is a trade-off between how much data is sent in
  each hb.db.n vs the total size of all hb.db.n files stored on the
  remote.  The smaller each hb.db.n file, the less data transmitted
  after each backup, but the larger the total size of all hb.db.n
  files on the remote.  In this release, hb.db.n files are slightly
  bigger but there will be fewer stored on the remote.  In tests, this
  saves significant space on remote destinations.

- the new dest command was not working in the release build of hb

#862 - January 21, 2013 - beta expires March 15, 2013


* new command "dest" to help secure dest.conf
* add progress meter when restoring large files
* new backup options --maxwait and --maxtime control backup window
* warning instead of error when backing up mounted block devices
* options accept decimal points: -D1.5g for example
* bug fixes


- when backups are sent to remote servers, the credentials for these
  servers - userids, passwords, access keys, etc. - are stored in
  dest.conf, unencrypted.  A new command, dest, can be used to load
  this file into the encrypted backup database; then it can be deleted
  and hb will read the credentials from the database.  Examples:

  * hb dest load  - load dest.conf into the database
  * hb dest show  - show dest.conf from the database
  * hb dest erase - erase dest.conf from the database

  The dest command requires that admin-passphrase be set.  Otherwise,
  anyone could use dest show to display the stored credentials in
  dest.conf.  It is possible but not recommended to remove the
  admin-passphrase after loading dest.conf into the database.

- get: add percent progress meter when restoring files 500MB or more

- backup: new option --maxwait <time> specifies the maximum time to
  wait for archives to upload to all destinations after the backup
  completes.  The time can be specified as a number (seconds), 6h, 1d,
  etc.  This is useful in these situations, maybe others:

  * for initial backups, the backup runs faster than most destinations
    can accept data.  A backup that takes only a few hours to create
    may take days or even weeks to upload over an Internet connection.
    But you may only want your connection used at night, and you still
    want daily incrementals even if the initial upload has not
    finished.  Starting the backup at midnight with --maxwait 6h is a
    way to handle this.

  * a new destination is added to a large existing backup.  It may
    take days to get all of the existing backup data transmitted to
    the new destination, which is okay, but you don't want to lock the
    backup directory for the entire time as this prevents future
    backups until the copy to the new destination has finished.

  IMPORTANT: be careful with --maxwait: if it is too short, your
  backup may never get fully synched to remote destinations and the
  remote data would always be incomplete.

- backup: new option --maxtime <time> specifies the maximum time to
  spend actually saving files.  When this time is exceeded, the backup
  stops and waits for uploads to finish using --maxwait, which is
  adjusted based on how long the backup took.  Examples:

    --maxtime 1h means to backup for up to 1 hour then wait the
      remainder of the hour to upload the data, ie, total time is
      limited to 1 hour

    --maxwait 1h means to backup everything requested, but only wait 1
      hour for uploads to finish

    --maxtime 1h --maxwait 1h means to backup for up to 1 hour, then
      wait 1 hour + the remaining backup time for uploads to finish,
      ie, the total time is limited to 2 hours

    --maxtime 1h --maxwait 1y means to backup for 1 hour, then wait a
      year for uploads to finish, ie, only the backup time is limited

  This is useful for the initial and full backups, which usually take
  much longer than incremental backups, and allows you to spread the
  full backups over many days.  It also prevents incrementals from
  running into production time when a large amount of data changes for
  some reason.

  IMPORTANT: be careful with --maxtime: if it is shorter than the time
  required for your average incremental backup and upload, your backup
  may never finish, some files may never get backed up, and/or your
  remotes may never be fully uploaded.  Backup should start where it
  left off when maxtime is set, but it doesn't do that yet.
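
  The way --maxtime and --maxwait combine in the examples above can be
  sketched in a few lines of Python.  This only illustrates the
  arithmetic described in these notes and is not hb's own code: the
  function names are hypothetical, and handling only the h, d, and y
  suffixes (the ones that appear in this text) is an assumption.

```python
# Hypothetical helpers; only the suffixes seen in these notes are handled.
UNITS = {"h": 3600, "d": 86400, "y": 365 * 86400}


def parse_time(text):
    """Convert '90', '6h', '1d' or '1y' to seconds (a bare number is seconds)."""
    text = text.strip()
    if text and text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def upload_wait(elapsed, maxtime=None, maxwait=None):
    """Seconds backup may wait for uploads after saving files for `elapsed` seconds.

    Returns None when neither option is given (no limit).
    """
    if maxtime is None:
        return maxwait
    remaining = max(0.0, maxtime - elapsed)  # unused part of the backup window
    return (maxwait or 0.0) + remaining
```

  For instance, upload_wait(1800, maxtime=3600, maxwait=3600) gives
  5400 seconds, so a backup that saved files for 30 minutes is still
  limited to 2 hours in total, matching the third example.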

- backup: instead of giving an error when mounted block devices are
  backed up, backup will only display a warning.  If a partition is
  mounted readonly for example, it's fine to back it up even if
  mounted.  Read/write partitions should not be backed up as they will
  be inconsistent when restored.  You may still get an error on some
  OS's, for example, on OSX:

    Warning: backing up mounted block device: /dev/disk4s1
    This is backup version: 0
    Unable to open file: Resource busy: /dev/disk4s1

- many options that only accepted integers now accept decimal points,
  for example, backup option -D1.5g, config option arc-size-limit 1.5g

- clear: when a Dir destination is set up in dest.conf, files are
  symlinked to the backup directory instead of copied.  The clear
  command deletes the destination files first, then the backup
  directory files.  But the symlinks were pointing to non-existing
  files, so clear would not remove them.  Then, backup would try to
  sync these dangling files, causing errors like:

  Exception: dest: can't access file for put: /Users/jim/hbdev/hb/hb.db.25

- backup: hb has special handling for huge directories to reduce
  memory usage.  (Use ls -ld to see the directory size.)  But if a
  huge directory was actually empty, it would cause an error:

    Traceback (most recent call last):
      File "", line 2092, in <module>
      File "", line 1953, in main
      File "", line 1569, in backupobj

- backup: in some environments, backup would halt with "database is
  locked" errors.  This seemed more likely when multiple workers were
  used on destinations, and since the default is now 3 workers, this
  has become more of a problem.  This error is sometimes hard to
  reproduce (that means I couldn't reproduce it).  Changes have been
  made in this release that should fix this.

- backup: if the initial backup ran long enough to create 1 archive
  file then was interrupted, mounting the backup would not work.
  Another backup, even of a single file, that ran to completion would
  fix the problem.

- backup: backup files without dest.conf, create dest.conf, repeat the
  same backup immediately; no files are modified, but hb.db.0 still
  needs to be transmitted and wasn't.  This is an unlikely bug because
  any non-null backup would fix the problem, but it confused me so
  it's likely to confuse others.

#847 - January 15, 2013 - beta expires March 15, 2013


* add paced retrievals in recover, for Amazon Glacier
* Glacier bug fix when local archives are missing
* add "args" keyword to ssh destination
* accept quoted arguments in rsync "args" keyword
* minor selftest bug fixes
* backup sometimes used too much memory


- Amazon Glacier: recover now has 4 options for recovering your local
  backup from Amazon Glacier:
  * Option 1: recover all files within 4 hours 

    This option is --dl now.  It's usually the most expensive option,
    though if there is only 1 archive to retrieve, this is the only
    way.  Today hb will not segment one archive to retrieve it over a
    longer period with ranged retrievals.  Because of Glacier's
    unusual pricing, it's best not to use large archives (over 1GB).

  * Option 2: retrieve over N hours or days
    This option is --dl 8h for hours, or --dl 8d for days.  Using these
    allows pacing the retrieval over the time period specified, with
    files evenly divided into 4-hour download groups.

  * Option 3: retrieve using specified bandwidth
    This option specifies the average bandwidth to use for the
    retrieval, for example, --dl 1MB would be 1 MByte/sec, --dl 1Mb
    would be 1 Mbit/sec.  You can also use G and K suffixes.  The
    actual downloads are not throttled to this download rate; the
    bandwidth number is only used to decide how much data to retrieve
    in each 4-hour download block to give this average rate.

  * Option 4: retrieve based on archive sizes
    This option, --dl cheap, minimizes your peak retrieval rate and is
    usually the least expensive option.  The only time it will not be
    least expensive is if the retrieval crosses from one month into
    the next.  In that case, retrieval costs can double and it may be
    better to keep the retrieval all in the current month using option
    2 or 3.  This is the default option.

  Using one of these options, recover will pace the retrieval of
  archives stored in Amazon Glacier in 4-hour download groups.
  Retrieval pricing in Glacier is complicated and can be expensive, so
  you should review it before doing a large Glacier retrieval.
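
  The pacing arithmetic for options 2 and 3 can be sketched as below.
  This is an illustration, not hb's implementation: the function names
  are hypothetical, the K/M/G prefixes are assumed to be powers of
  1024 (these notes do not say), and "evenly divided" is assumed to
  mean by file count.

```python
import math

BLOCK_SECONDS = 4 * 3600  # retrievals are paced in 4-hour download groups

# Assumption: K, M and G are powers of 1024; these notes do not say.
SCALE = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def groups_for_period(spec):
    """Number of 4-hour groups for a period such as '8h' or '8d'."""
    hours = float(spec[:-1]) * (24 if spec[-1] == "d" else 1)
    return max(1, math.ceil(hours / 4))


def bytes_per_block(spec):
    """Bytes to request per 4-hour group for a rate such as '1MB' or '1Mb'."""
    number, prefix, unit = float(spec[:-2]), spec[-2], spec[-1]
    per_second = number * SCALE[prefix]
    if unit == "b":  # lowercase b means bits per second
        per_second /= 8
    return int(per_second * BLOCK_SECONDS)


def split_evenly(archives, ngroups):
    """Divide archive names into ngroups lists of nearly equal length."""
    return [archives[i::ngroups] for i in range(ngroups)]
```

  Under these assumptions, --dl 8h gives 2 download groups and --dl 8d
  gives 48.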

- Amazon Glacier: when archives are cached locally, get, selftest, and
  mount expect archives to be in the backup directory.  If they aren't
  there for some reason (manually deleted) they will be fetched from a
  remote destination as needed.  None of this is new.

  When Amazon Glacier is the remote in this situation, new retrieval
  jobs would be started and get/selftest/mount would fail with error
  messages, because it takes 4 hours to get files from Glacier.  The
  bug here is that if you waited 4 hours and retried the same
  operation, it should have fetched the files from Glacier.  Instead,
  it was starting new retrieval jobs.  This has been fixed, though
  having to run programs twice is not ideal.

- added an args keyword to the ssh destination.  This allows
  specifying sftp options (use man ssh_config to see the options).
  For example:

    args -oIdentityFile=~/.ssh/otherid -oLogLevel=debug

  will use an alternate private key file and enable extra debug
  messages.  You can use quotes if necessary.

- rsync: added Port example in dest.conf.rsync example.  rsync Args
  keyword now parses quotes like a shell would, allowing quoted
  strings.  For example, to use an alternate ssh port and userid:

    Args -e "ssh -p 8002 -l sshuser"
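
  To show what shell-style quote parsing does to this Args value,
  here is a small sketch using Python's standard shlex module.  It
  only demonstrates the splitting behavior; whether hb uses shlex
  internally is not stated in these notes.

```python
import shlex

# The Args value from the rsync example above, split the way a POSIX shell would.
line = '-e "ssh -p 8002 -l sshuser"'
argv = shlex.split(line)
print(argv)
```

  The quoted string stays together as one argument, so rsync receives
  -e followed by the whole ssh command.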

- selftest: if a raw block device backup was interrupted after it had
  written at least 1 archive file, selftest would display an error:
    Error: for logid 3, hash should not be null: /dev/disk4 [r0]

- selftest: if a raw block device was backed up and cache-size-limit
  was set (not all archives are local), selftest would display a
  warning message (the numbers will be different):
    Oops: planned read count 7 != actual read count 0; diff = 7

- selftest: was not running its normal file data integrity checks on
  raw block device backups

- backup: sometimes when doing a multi-thread backup of a very large
  file with a small block size (as with VM images), backup could use
  excessive memory.  This was not related to dedup, but was more an
  issue with how the OS scheduled threads and is similar to the
  selftest problem fixed in #800.  It seemed to occur more often on
  Linux, but could have happened anywhere.  Backup's memory usage is
  now much more controlled and stable.

#837 - December 26, 2012 - beta expires March 15, 2013

- Amazon Glacier support is available with a new destination type,
  glac.  Glacier is an inexpensive (1 cent/GB/mo) archive system that
  works well for backups, providing you don't need to restore very
  often.  Since HashBackup supports multiple destinations, Glacier can
  be used as cheap offsite insurance for an onsite backup.

  Example dest.conf entry for Glacier:

    destname myglac
    type glac
    accesskey <amazon access key>
    secretkey <amazon secret key>
    vault hbvault
    dir server1

  The accesskey and secretkey are the same as used by other Amazon Web
  Services such as S3.  Unlike S3 buckets, the vault name does not
  have to be globally unique: it only has to be unique for your
  account.  Using the Dir keyword, it is possible to store multiple
  backups in a single vault.

  Glacier has some unusual trade-offs for cheap storage:
  * retrieval time is around 4 hours
  * there are retrieval costs in addition to bandwidth charges
  * you get a free allowance for small retrievals
  * you have to spread large retrievals over 600 days to be 100% free
  * fast retrievals can be very expensive
  * the option to ship data on a disk is available, but you still pay
    retrieval fees plus the fee for shipping the disk

  Glacier retrievals occur in 2 stages:
  * first, request a file; then wait 4 hours
  * second, download the file from Glacier
  * retrieved files are available for about a day

  To handle this situation, HashBackup stores its databases in a
  helper S3 bucket named <accesskey>-hashbackup-glac-vaults that is
  created the first time you use Glacier.

  NOTE: your AWS access key is not a secret: it is sent on every AWS
  request and is basically your AWS userid.

  Backups to Glacier look just like backups to any other destination.
  But hb recover must be done in 2 steps:

  1a. first, databases are downloaded immediately from S3
  1b. retrieval jobs are created for archive files, messages are:
      Started retrieval job <long job id> status InProgress for arc.0.0

  After this, you must wait around 4 hours for Glacier to make your
  archive files available.  Then run recover again, and assuming all
  files are available, your archives will be downloaded to the local
  backup directory.

  If hb recover is run before all files are retrieved, there will be
  messages like:
    Queueing arc.0.0 from glac
    Retrieval job <long job id> status InProgress for arc.0.0
  If this happens, you must wait longer and run recover again.
  Eventually, you will get all your archive files back.

  Glacier works best as a "backup of last resort", where you have
  another copy of the backup locally.  It can be used as the only copy
  of the backup (if cache-size-limit is set), but large retrievals are
  expensive and may be difficult to manage.

  If you make a backup to Glacier then decide you don't want it, it's
  important to hb clear the backup before deleting the backup
  directory.  The reason is that, unlike S3, Glacier will hang on to
  your files, even if you make a new backup with the same vault and
  dir keywords.

- recover: handles S3 files that have migrated to Glacier by
  requesting a restore and holding restored files in S3 for 5 days.
  Because each Glacier restore takes 4 hours, recover has to be run
  several times:

  1. first, dest.db is requested (4 hours)
  2. then hb.db.n files are requested to recreate hb.db (4 hours)
  3. then arc.v.n files are requested (4 hours)
     NOTE: this step is omitted when cache-size-limit is set
  4. then the final recover can fetch archives from S3

  Each recover run except the last will display messages for files
  that are still transitioning from Glacier back to S3, for example:

    s3(mydest): file is being restored from Glacier to S3: hb.db.0

  When hb tries to access these files that haven't yet been
  downloaded, an error will occur:

      Loading /Users/jim/hbdev/hb/hb.db.0
    Traceback (most recent call last):
      File "", line 269, in <module>
      File "", line 218, in main
      File "/Users/jim/hbdev/", line 486, in get
      File "/Users/jim/hbdev/", line 895, in applyincdb
    IOError: [Errno 2] No such file or directory: '/Users/jim/hbdev/hb/hb.db.0'

  This isn't very elegant and might be improved in the future, but it
  has worked so far in testing.

- recover: if configured in dest.conf (Workers keyword, default is 3),
  use multiple workers to download files simultaneously, reducing
  recovery time

- validate s3-compatible bucket names before creating new buckets:
  must be 3-63 characters, start and end with a-z or 0-9, and may
  contain dashes.  Existing buckets with names violating these more
  strict rules are still accessible.
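
  The naming rules above can be written as a regular expression.
  This is a sketch of the stated rules only; the function name is
  hypothetical and hb's actual check may differ.

```python
import re

# 3-63 characters, first and last are a-z or 0-9, dashes allowed in between.
BUCKET_RE = re.compile(r"[a-z0-9][a-z0-9-]{1,61}[a-z0-9]")


def valid_new_bucket_name(name):
    """Return True if name satisfies the rules stated above."""
    return BUCKET_RE.fullmatch(name) is not None
```

  Names such as my-backup-1 pass, while ab (too short), -abc, abc-,
  and MyBucket are rejected.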

- with multiple workers there was a race condition in s3-compatible
  destinations when a bucket was created: several workers could try to
  create the bucket at once, causing errors.  This is fixed.

#828 - December 22, 2012 - beta expires March 15, 2013

- S3-compatible destinations now store an MD5 checksum in the database
  when a file is transmitted, and compare this to the server-generated
  MD5 checksum when a file is retrieved.
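
  A minimal sketch of this kind of check, using Python's hashlib.
  The function names are hypothetical, and stripping double quotes
  from the server's value is an assumption made because some
  S3-compatible servers return the checksum quoted.

```python
import hashlib


def md5_hex(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(stored_md5, server_md5):
    """Compare the checksum recorded at upload with the one the server reports."""
    return stored_md5.lower() == server_md5.strip('"').lower()
```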

- S3-compatible destinations have a new Dir keyword, so that many
  backups can be stored in one bucket, and backup data can be
  segregated from other data in a bucket.

- Rackspace Cloud Files destinations have a new Dir keyword, so that
  many backups can be stored in one container, and backup data can be
  segregated from other data in a container.

- there was a bug in the previous db upgrade process with old archive
  files.  The error during upgrade was:
    Unable to upgrade your database to rev 13: write() takes exactly 5
    arguments (4 given)

- for longtime beta testers using dest.conf, an old hb.db file could
  be stored on a remote.  In the Dec 3rd release, the sync code was
  rewritten.  This new code would see the old hb.db on the remote,
  delete it, which is correct, but also delete the local copy, which
  is not correct.  This bug has been fixed.  The workaround for the
  lost hb.db is to rename hb.db.orig to hb.db and run the upgrade
  again, or use recover to rebuild hb.db from the remote copy.

- when generating keys, get as many bytes as possible from
  /dev/random, without blocking; get the rest from /dev/urandom

- OSX Snow Leopard and Lion have a bug where an unaligned disk write
  that would fill a disk does a partial write instead of returning an
  error.  In testing, this OSX bug could corrupt the dedup table.
  Code was added to detect this.

- if a disk full condition occurred at a particular point during
  backup, it could cause selftest errors later because of the
  partially saved file.  Now if a critical error occurs during backup,
  backup will display a "Fatal error:" message and stop immediately.

- related to the disk full problem, a new selftest option, --fix, has
  been added.  With this option, selftest will make corrections for
  simple errors and will not stop at 100 errors as it usually does.
  Corrections occur with -v2 (or higher).

#814 - December 14, 2012 - beta expires March 15, 2013

- added support for DreamObjects (destination type is do), an
  S3-compatible service by Dreamhost with good prices: 7 cents per GB
  for both storage and outgoing bandwidth.  For more information, see:

- selftest -v4 could fail with this error:
    Traceback (most recent call last):
      File "/", line 140, in <module>
      File "/", line 674, in main
      File "/", line 199, in checkallblocks
    UnboundLocalError: local variable 'row' referenced before assignment

#811 - December 13, 2012 - beta expires March 15, 2013

- [#797] selftest: added --debug option to print logid, pathname, and version
  for files as selftest runs

- [#798] init was failing if backup directory didn't already exist

- [#799] error messages are clearer when recover is used with the wrong key;
  removed temporary files created so that if recover is used again, it
  displays the same error messages

- [#800] on 32-bit Ubuntu Linux 12.04.1 LTS, with a 128GB file of random
  data, selftest would continually consume memory until it failed with
  an error like this:
     Error: Error reading archive: , logid 5, blockid 2050709:
        /mnt/snapshot/data/com bo.bin [r0]
     Exception in thread decrompq_loop:
      File "/", line 60, in start_thread
      File "/", line 97, in decrompq_loop
  or it was killed by the OS OOM (Out Of Memory) handler.

- [#801] mount: on Linux, du was always reporting zero for directory
  sizes.  Also changed top-level directory names from just version
  number to YYYY-MM-DD-HHMM-rV, to be more accessible.  The latest
  backup is now called 'latest' instead of 'c'.  To cd using just
  version numbers, use cd *r5 for version 5.

- [#802] mount: the Linux stat command would sometimes print different
  values for st_blocks on a real file and an hb mount containing the
  same file, with the difference being less than 8.

- [#803] compare was displaying socket files as if they were new,
  because backup never saves socket files.  Now compare ignores socket
  files, like backup.

- [#808] after a database upgrade by the 7xx and 8xx series of
  releases, version 6xx of hb was still able to access the backup
  database.  Now, after #808 or later is used, the correct error
  message is displayed when an old version of hb is used on an
  upgraded database: you need a newer version of HashBackup to access
  this backup.

- [#809] /dev/urandom is now used instead of /dev/random for key
  generation.  These are only different on Linux.  Recent versions of
  Linux running on a single-user VM frequently do not have enough
  entropy in the random pool to generate even a 128-byte key, so hb
  init can block for a very long time.  With this switch to
  /dev/urandom, hb init will not stall.

- [#810] update dest.conf README file to explain why changing destname
  after files are backed up is a bad idea and causes breakage

- [#811] backup: on Linux, filesystems that don't support flags, eg,
  FUSE, would sometimes return garbage flags.  If the nodump bit was
  set in the bogus flags, the file would not be backed up.  This has
  been fixed.

#796 - December 3, 2012 - beta expires March 15, 2013

IMPORTANT: all previous releases had a major imap bug if rm or retain
was used.  Please see the bug section if you have been using imap.


* database upgrade to dbrev 13 (may take a while)
* config command revised
* new 'cache-size-limit' config option for remote-only archives
* new 'workers' keyword for multiple uploads and downloads
* new 'retries' keyword to control destination retries
* prefetch remote archives during restores & selftest
* direct reads (no copying) from Dir destinations
* database upgrades revised for better compatibility
* non-incremental backups with --full backup option
* new 'audit-commands' config option saves command history
* ask for passphrase twice and verify they match
* Amazon S3 performance & reliability enhancements
* new 'rate' keyword to limit outgoing / upload bandwidth
* new 'pack-percent-free' config option
* new 'pack-remote-archives' config option
* removed 8GB limit on config option 'arc-size-limit'
* any config option change displays both old and new values
* improved selftest -v2/3/4 performance
* re-add 'userid' keyword to ssh destination
* imap destinations handle errors better
* minor change to inex.conf exclude handling
* bug fixes


- NOTE: this rev will do an automatic database upgrade to dbrev 13
  when any HB command is used, to support remote-only archives and a
  limited archive cache.  All archive files must be read for this
  upgrade, so the upgrade could take some time to complete

  IMPORTANT: do not enable any new features in this release until
  after your database has been upgraded.  For example, don't add the
  new 'rate' keyword to any destinations until after your next backup
  using this new release.

- HB versions all config changes so that config -rN displays the
  configuration settings of backup N.  With the old config setup, if
  an option is changed, a backup is taken, and that backup is later
  removed with rm -rN, the config option change also disappeared.
  Sometimes this is okay / desired, but usually it is unexpected,
  especially with options like admin-passphrase: users expect that if
  they put an admin-passphrase on a backup, it stays there.  A
  separate problem is that if backup N caused a database upgrade, then
  all backups N and later were removed with rm, the next operation
  would want to do a database upgrade; this failed because the upgrade
  had already occurred.

  To fix these issues, HB now has one config that is "current".  After
  each backup, the current config is saved.  But if backup versions
  are removed, it doesn't affect the current config settings.  Also,
  --revert was removed from config as it was seldom used in practice.

- HB supports remote-only archives using the new cache-size-limit
  config option.  This option defaults to -1, meaning there is no
  local cache limit and archive files will stay in the local backup
  directory as before.

  Keeping a local copy of all archive files and leaving
  cache-size-limit set to -1 has several benefits:

  - there is a redundant copy in case something happens to the remote

  - restoring from a local backup directory is much faster than
    restoring from a remote, where archives have to be downloaded

  - disk space is cheap, and most systems have room for a local copy
    of the backup.  Setting a cache size limit will save space during
    backups, but you may still need lots of local disk space during
    restore to download required archives

  - with local archives, backup never stalls waiting for remotes to
    accept archives

  - the backup directory does not have to be locked for read-only
    programs, such as mount, allowing backups to run while mount is
    running.  When the cache is limited, only one program can access
    the backup directory because archives might be coming and going

  - errors on a remote, such as it being down or having a full disk,
    do not affect a backup with local archives: the archives will be
    sent the next time the remote is available.  With a limited cache,
    the backup will halt if the cache becomes full and any remote
    stops accepting data

  But, there are environments where keeping a complete local copy of
  archives may not make sense:

  - hb is not the only backup
  - hb is being used as an archive tool
  - local disk space is extremely scarce, as on a small VPS
  - the "remote" backup target is really on a local network
  - your offsite storage provider will ship disks
  - raw partition backups

  Cache-size-limit is useful for these situations.

  If cache-size-limit is set >= 0 with hb config, the backup program
  may remove local archives after they have been transmitted to all
  remotes, to stay under the cache size limit.  Cache-size-limit zero
  means "no local archives".  This causes backup to stall after each
  archive until it is transmitted to all remotes.  A better option is
  to set cache-size-limit to 1-1000.  These small numbers mean
  "multiply by the max archive size".  So cache-size-limit 5 with the
  default arc-size-limit of 1gb means that 5GB of archive data can be
  kept locally after it is sent to remotes.  A specific size can be
  set too, for example, cache-size-limit 10gb.  There is no minimum
  cache size: HB will function correctly no matter what size is used,
  though it may need to temporarily go beyond the limit while
  executing a command.
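
  The rules for interpreting cache-size-limit can be sketched as
  follows.  The function name is hypothetical, "gb" is assumed to mean
  2**30 bytes, and treating any plain number above 1000 as a byte
  count is also an assumption; a suffixed value such as 10gb would be
  converted to bytes before calling this.

```python
GB = 1024 ** 3  # assumption: "gb" means 2**30 bytes


def cache_limit_bytes(value, arc_size_limit=1 * GB):
    """Interpret a numeric cache-size-limit as described above.

    Returns None for no limit, otherwise the limit in bytes.
    """
    if value < 0:
        return None                    # -1: keep all archives locally
    if value <= 1000:
        return value * arc_size_limit  # 0-1000: multiples of the max archive size
    return value                       # anything larger: a specific size in bytes
```

  With the default arc-size-limit, cache_limit_bytes(5) is 5GB and
  cache_limit_bytes(0) is 0, meaning no local archives.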

  IMPORTANT: cache-size-limit is ignored and a warning is displayed if
  there are no remote destinations configured.

  If a new destination is added to dest.conf and some archives are
  only stored remotely, hb has to download these archives first and
  then upload them to the new destination(s).  In this release, this
  remote to remote synchronizing blocks other operations and backup
  waits until all destinations are synchronized before starting the
  backup.  With a local backup (no cache-size-limit), this doesn't
  happen because downloads (remote to local) aren't necessary.

  Another effect of a limited local cache is that backups may be
  delayed if there are slow remotes.  For example, if the
  cache-size-limit is 2GB and you have 16GB to backup, the backup
  program may have to delay while archives are uploaded.  This doesn't
  happen with a local backup (no cache-size-limit).

  One guideline for setting cache-size-limit is to use at least the
  average size of your typical backup.  This will allow backup to
  finish without waiting for remotes to accept data.

- a new dest.conf keyword, 'workers', can be added to any destination.
  This is the number of concurrent connections to a destination.  The
  default is 3, allowing uploading or downloading 3 files at once.
  Set workers to 1 if you want to minimize the impact of hb on your
  network connection.  (You can also use the new 'rate' keyword to
  limit the network impact.)

- a new dest.conf keyword, 'retries', can be added to any destination.
  If omitted, destinations will retry 2 times on errors (3 times
  altogether), delay 5 seconds the first time, then multiply the delay
  by 2 for each retry.  This is equivalent to retry 2, 5, 2.  Up to 3
  integers can be used with the retry command, with defaults used for
  missing values.  So retry 1 means to do only 1 retry with a 5 second
  delay.
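
  The retry schedule described here (retry count, first delay, delay
  multiplier, with defaults 2, 5, 2) can be sketched as below.  The
  function name is hypothetical and this is not hb's code.

```python
def retry_delays(*values):
    """Delays in seconds before each retry, from up to 3 integers:
    retries, first delay, multiplier (defaults 2, 5, 2)."""
    if len(values) > 3:
        raise ValueError("at most 3 integers are accepted")
    # Fill in defaults for whichever trailing values were omitted.
    retries, delay, multiplier = list(values) + [2, 5, 2][len(values):]
    return [delay * multiplier ** i for i in range(retries)]
```

  With no arguments this gives delays of 5 and 10 seconds; with a
  single argument of 1 it gives one 5-second delay.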

- get: creates a cache plan when cache-size-limit is set, and
  downloads archives in the background in the order they are needed.
  While get is running, the archive cache might exceed its limit.
  This may be necessary to avoid downloading the same archive more
  than once.  After get finishes, it will trim the cache back to
  its limit.

- selftest: creates a cache plan when cache-size-limit is set, like
  get.  Selftest's -v option controls the level of testing.  If
  cache-size-limit is set, selftest defaults to -v2 to prevent all
  archives from being downloaded.  If -v3 or higher is requested with
  cache-size-limit set, selftest will show how many archives need to
  be downloaded, how much space will be needed, and then ask for
  confirmation.  If cache-size-limit is not set (the default, meaning
  all archives are kept locally too), selftest uses -v9 as before.

- when fetching files from a Dir destination, a symlink is created
  instead of copying the archive to the cache.  This allows reading
  directly from the Dir destination file.  This is useful with mounted
  remote storage such as Google Drive, WebDAV, etc., because they
  support reads without needing to download an entire archive.  For
  example:

  - your backup directory is /hb, on the local filesystem
  - set up dest.conf with a Dir destination for remote mounted space
  - use hb config cache-size-limit 5 for a small local cache
  - your hb.db file is in fast, local storage in /hb
  - your archives are on slower remote mounted space
  - restoring a file will access remote storage directly

- The database upgrade process was rewritten for this release.  The
  old upgrade system worked well for a single-rev upgrade, but
  sometimes failed on older databases.  This release can update
  backups created with #339 (Oct 2010) or later, and is more reliable
  for future upgrades.

- backup: a new option, --full, forces a full backup.  This adds
  redundancy to the backup and can make restores faster too by
  reducing fragmentation in the backup.  When cache-size-limit is set,
  reducing fragmentation will usually reduce restore times by reducing
  the number of remote archives that need to be downloaded.  -D can
  still be used with --full to enable dedup, but no data from previous
  backups is reused.

  Another way to achieve backup redundancy vs --full is to simply start
  a new backup directory.  The advantage here is that everything is
  redundant, including the backup database.  The disadvantage is that
  it is harder to manage and configure, because each backup directory
  has to have unique remote destination directories, and retain cannot
  be used across multiple backups.

- a new config option, audit-commands, enables audit logging for the
  commands listed, or if 'all' is used, auditing is enabled for all
  commands.  To display the audit log, use the new hb audit command.
  Audit logs cannot be removed from the database.  For more secure
  audit logging, admin-passphrase should be set and config should be
  listed in disable-commands; this prevents someone from turning off
  audit logging without the admin passphrase.  Example audit log:

    [jim@mb hbdev]$ hb audit -c hb
    Backup directory: /Users/jim/hbdev/hb
    Showing recent history

    Started: Mon 2012-11-19 15:40:26
    Build: 764
    Uid: 501 (jim)
    Gid: 20 (staff)
    Working dir: /Users/jim/hbdev
    Command: backup -c hb doc
    Finished: Mon 2012-11-19 15:40:27
    Exitcode: 0

    Started: Mon 2012-11-19 15:40:32
    Build: 764
    Uid: 501 (jim)
    Gid: 20 (staff)
    Working dir: /Users/jim/hbdev
    Command: ls -c hb
    Finished: Mon 2012-11-19 15:40:32
    Exitcode: 0

- rekey now asks for a new passphrase twice, and verifies that they
  are the same.  Before, a typo in the passphrase could make the
  backup database inaccessible.  To abort a rekey, enter mismatched
  passphrases.

- Amazon S3: performance may be increased for large S3 transfers
  because of better use of Amazon's load balancing, and S3 error
  recovery may be better when an Amazon S3 server is having issues.

- a new destination keyword, 'rate', allows limiting upload bandwidth
  for Amazon S3, ftp, imap, dir, Rackspace Cloud Files, and rsync
  destinations.  The value is the outgoing transfer limit in bytes per
  second, for example, rate 100k would mean 102400 bytes/sec.  This is
  the upload limit for each worker on a destination, and since the
  default is to have 3 workers, the aggregate limit would be 300k for
  this example if all the workers were busy.  If rate limiting is
  used, you may want to add the workers keyword with a value of 1, to
  limit bandwidth further.  A rate limit less than 1024 raises an
  error - it's probably a typo.
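
  The rate arithmetic above can be sketched as follows.  The function
  names are hypothetical; the k suffix follows the 100k = 102400
  example in this text, while the m and g suffixes are assumptions.

```python
def parse_rate(text):
    """Convert a rate such as '100k' to bytes per second."""
    suffixes = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}  # m and g are assumed
    text = text.strip().lower()
    if text[-1] in suffixes:
        rate = int(float(text[:-1]) * suffixes[text[-1]])
    else:
        rate = int(float(text))
    if rate < 1024:
        raise ValueError("rate below 1024 bytes/sec is probably a typo")
    return rate


def aggregate_rate(text, workers=3):
    """Worst-case upload rate when every worker is busy."""
    return parse_rate(text) * workers
```

  So rate 100k with the default 3 workers can use up to 307200
  bytes/sec, and with workers set to 1 it stays at 102400.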

- after removing files from the backup, either with the rm or retain
  commands, some archives may have empty space.  If there is enough
  empty space (25%), HB would compress archives to save local disk
  space.  If any archives shrank 50%, hb would retransmit them to
  remote destinations to free up remote disk space.  This works well
  when archives are local.  But when archives are only stored remotely
  because cache-size-limit is set, this packing operation requires a
  download first.

  So a new config option has been added, pack-percent-free.  This
  takes a number, which is a percent.  If an archive has this much
  free space or more, it will be packed to save disk space.  The
  default is 50, so archives are packed when 50% or more space is
  free.  Another useful value is 100: archives are never packed, but
  are deleted when they are completely empty.  It may or may not be
  cost effective to set this, depending on whether your cache is
  limited and the rates you pay for outgoing bandwidth, incoming
  bandwidth, and storage costs.

  NOTE: old format archives created before Sep 2012 cannot be

- a new yes/no config option, 'pack-remote-archives', specifies
  whether to pack archives that are not stored locally.  The default
  is no.  This option exists because many cloud storage vendors charge
  for download bandwidth, so it may cost more to download an archive
  for packing than it is worth: it might be cheaper to just pay for
  the storage.  This option only makes sense when cache-size-limit is
  also set.
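
  How pack-percent-free and pack-remote-archives interact can be
  sketched as a small decision function.  This is only an illustration
  under stated assumptions: the function name and return values are
  hypothetical, and it assumes that deleting a completely empty
  archive never needs a download.

```python
def pack_action(free_percent, is_local, pack_percent_free=50,
                pack_remote_archives=False):
    """Return 'delete', 'pack', or 'keep' for one archive."""
    if free_percent >= 100:
        return "delete"          # completely empty archives are removed
    if free_percent < pack_percent_free:
        return "keep"            # not enough free space to be worth packing
    if not is_local and not pack_remote_archives:
        return "keep"            # packing would require a download first
    return "pack"


print(pack_action(60, is_local=True))                         # pack
print(pack_action(60, is_local=False))                        # keep
print(pack_action(99, is_local=True, pack_percent_free=100))  # keep
```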

- the CRC on archive blocks was removed.  This CRC was intended to
  allow remotes to do some simple archive validation, but that's not
  possible with the other changes in this release.  Each block still
  has a full, encrypted SHA1 hash for data verification during
  restores, but remotes can't access it: they don't have the key.  And
  each file still has a full SHA256 as a double check to verify
  restored files.

- mount: because there is no way to predict which archives might be
  accessed, the mount command cannot prefetch remote archives.  This
  may lead to slow file data access while remote archives are
  downloaded.  The cache size limit is ignored while mount is running,
  to avoid downloading the same archive more than once.  When the HB
  backup filesystem is unmounted, the cache is trimmed back to

- in previous releases, read-only programs like mount did not lock the
  backup directory.  But when the cache size is limited, all programs
  must lock the backup directory since archives may be moving in and
  out of the cache.

- the config option arc-size-limit sets the maximum size of archive
  files created by backups.  Previously this was limited to 8gb; now
  there is no upper limit.  The default is still 1gb.  To facilitate
  testing, the lower limit has been changed from 1mb to 10000 bytes,
  but small sizes like this should not be used for real backups.

- config: when a config option is changed, both the new and old value
  are displayed

- selftest -v2 -p0 (one core) at a customer site was 7x faster than -v2
  -p4 (4 cores).  This performance issue, which also affected -v3 and
  -v4, has been fixed.

- multi-core selftest could report an error with a file, then
  "Verified x files with 0 errors", then at the end, "1 error"

- the Userid keyword was added back to the ssh destination.  This was
  accidentally removed when hb switched to using sftp.  Also, this
  destination always tried to create the target directory on the ssh
  host.  Now, it will only do this when files are sent to the host -
  not on a remove or fetch.

- before, the inex.conf rule ex /tmp/ meant the same thing as ex
  /tmp/*.  The problem is it also prevents requesting a backup of
  /tmp/jim; it saves the directory but not the contents, and it's very
  confusing: "Why won't it work!?"  Hey, if it confuses me, which it
  did, it's confusing!  Now, ex /tmp/ works as before when backing up
  /, and will not save the contents of /tmp.  But you can request a
  backup of /tmp/jim and it will work as expected.  To get the old
  behavior, use a rule ex /tmp/*.  Then when requesting a backup of
  /tmp/jim, the directory itself is saved since you explicitly
  requested it, but the contents are excluded by inex.conf.  Also, a
  new warning is displayed when a requested directory's contents are
  excluded by inex.conf.

- imap destinations would sometimes reuse a connection when an error
  occurred.  But sometimes these errors are fatal, for example, if the
  imap server resets a connection.  Now when an error occurs, the old
  connection is closed and a new connection is created.  hb retries 3
  times when a destination encounters problems, and imap retries up to
  20 minutes with an exponential backoff.  So altogether, imap will
  retry for around an hour before giving up.
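
  A sketch of an exponential backoff schedule that stops after a 20
  minute budget.  The 1 second starting delay and the doubling factor
  are assumptions; the notes above only state the 20 minute limit.

```python
def backoff_delays(first=1.0, factor=2.0, budget=20 * 60):
    """Yield sleep intervals that grow geometrically until `budget` seconds are used."""
    delay, used = first, 0.0
    while used < budget:
        step = min(delay, budget - used)   # never exceed the total budget
        yield step
        used += step
        delay *= factor


delays = list(backoff_delays())
print(delays[:4])    # [1.0, 2.0, 4.0, 8.0]
print(sum(delays))   # 1200.0
```

  Between delays, a caller would close the old connection and open a
  new one before retrying, as described above.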

- imap performance is slightly improved by sending one less imap
  command per file sent, received, or removed

- IMPORTANT IMAP BUG FIX: in all previous versions of hb, rm and
  retain could delete too many archives on the imap server, rendering
  the remote backup incomplete.  The symptom of this problem is that
  when an archive arc.v.n is removed, all archives with arc.v.n as a
  prefix are removed from the imap server.  For example, if retain
  removes arc.0.3, arc.0.30 and arc.0.31 are also removed incorrectly.
  A similar problem occurred with hb.db.n files.  This only occurs on
  imap destinations.

  To correct this, the database upgrade procedure will tag files that
  need to be resent to imap destinations.  After the next backup
  completes, the remote imap destinations should have the missing

  To verify your remote imap backup is correct, wait until your next
  backup completes with this new version.  Then create a new temporary
  backup directory, copy your key.conf and dest.conf files there, and
  run hb recover -c tempdir.  This will download all of your remote
  backup files from your imap server.  Run hb selftest -c tempdir to
  verify there are no errors.  Then tempdir can be deleted.

  If you still have selftest errors because of missing archives, edit
  your real dest.conf file in your production -c backup directory and
  change the destname of your imap destination.  For example, if it
  now says destname imapjim, change it to imapjim2.  On your next
  backup, all backup files will be uploaded to the imap server.  Then
  repeat the selftest in the previous paragraph to verify your remote
  imap backup has been fixed.
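
  The symptom described above is what prefix matching produces.  The
  Python sketch below only illustrates that difference; the actual
  cause inside hb's imap code is not stated here, and the function
  names are hypothetical.

```python
def remove_by_prefix(names, target):
    """Buggy behaviour: deletes every name that merely starts with `target`."""
    return [n for n in names if not n.startswith(target)]


def remove_exact(names, target):
    """Corrected behaviour: deletes only the exact name."""
    return [n for n in names if n != target]


remote = ["arc.0.3", "arc.0.30", "arc.0.31", "arc.0.4"]
print(remove_by_prefix(remote, "arc.0.3"))  # ['arc.0.4']
print(remove_exact(remote, "arc.0.3"))      # ['arc.0.30', 'arc.0.31', 'arc.0.4']
```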

- imap: would sometimes fail to parse server responses correctly if
  additional information was included, causing unnecessary retries.

- imap: added a debug keyword.  debug > 0 will cause imap FETCH
  responses to be dumped, which can help diagnose parse errors.  debug
  >= 4 will dump the entire imap conversation.

- Dedup statistics were inconsistent.  For example, backup might say:
      Dedup enabled, 28% of current, 7% of max
  But then retain or rm would say:
      Dedup enabled, 28% of current, 0% of max
  Rm and retain never expand the dedup table, so they don't have the
  -D option to specify a maximum dedup table size.  Without knowing a
  maximum size, it was assumed to be huge; so the 2nd stat was always
  0%.  To avoid confusion, only backup prints dedup statistics now.

- if hb.db was deleted, running backup or recrypt would create a new
  (empty) hb.db.  When this empty hb.db was sent to remotes, it would
  cause all hb.db.n files to be deleted.  Now, backup and recrypt will
  issue an error message that hb.db doesn't exist.  The usual remedy
  would be to run recover to regenerate hb.db.

- recover checks that hb.db recovered from a remote is bit identical
  to the original hb.db.  In a specific, unusual situation, recover
  would report that all signatures matched on each hb.db.n file, but
  then would report 'Database HMAC mismatch' on the final database and
  stop.  The recovered database was equivalent to the original, but it
  was not identical.  This bug has been fixed and the error message is
  now a warning; the database is available if you choose to use it.

- get: setuid, setgid, and the sticky bit were being saved but not
  restored on regular files.  This has been fixed.  These bits are
  also now restored if the numeric userid running get is the same as
  the numeric userid of the restored file.

- ls: setuid, setgid, and the sticky bit are displayed with ls -l
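
  For reference, this is how these bits appear in conventional ls -l
  style mode strings, shown with Python's standard stat.filemode.  It
  illustrates the notation only and is not hb's own output.

```python
import stat

# Regular file, mode 4755: setuid shows as 's' in the owner execute slot.
print(stat.filemode(stat.S_IFREG | 0o4755))  # -rwsr-xr-x
# Regular file, mode 2755: setgid shows as 's' in the group execute slot.
print(stat.filemode(stat.S_IFREG | 0o2755))  # -rwxr-sr-x
# Directory, mode 1777: the sticky bit shows as 't' in the other execute slot.
print(stat.filemode(stat.S_IFDIR | 0o1777))  # drwxrwxrwt
```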

- when the backup directory is locked and can't be accessed, the error
  message displays the process id owning the lock

- get: if a directory was restored with --orig and the parent
  directory didn't exist, get would fail with an error message
  "ValueError: too many values to unpack"

- get: if two hard links to the same file were listed on the command
  line, the first file existed before the restore, and the restore
  replaced it, the restore of the 2nd hard link would fail with an
  error like:
     Unable to hardlink: No such file or directory:
  followed by two pathnames .../hb-540858.tmp -> .../hb-297485.tmp

- clear: if hb.db didn't exist, clear stopped with an error; it should
  have cleared the rest of the directory

- clear: would sometimes print the error message "database schema has

- versions: could fail with the error message KeyError: 'getpwuid():
  uid not found: nnn' if a backup from one system was transferred to
  another system with different userid -> name mappings
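
  The error text above is what Python's pwd.getpwuid raises for an
  unknown uid.  One common way to tolerate it is to fall back to the
  numeric id, sketched below; whether hb does exactly this is an
  assumption, and the function name is hypothetical.

```python
import pwd


def uid_to_name(uid):
    """Return the user name for `uid`, or the number as text if unmapped."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return str(uid)
```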

- add a test to all programs that the -c backup directory is actually
  a directory

- when reading passphrases, a warning was displayed and characters
  still echoed on the screen

- backup: when the dedup table was full, it was resized (to the same
  size) at the beginning of the backup

- large dedup tables could cause MemoryExceptions and other problems
  when a program started.  A 1GB dedup table was actually temporarily
  requiring 4GB of memory.  As a side effect of fixing this, loading a
  large dedup table is now much faster.

- mount, backup, selftest: if certain files are backed up with dedup
  enabled, fixed block sizes in one backup version, and variable block
  sizes in a later backup version, reading them via mount may cause a
  Bad address error.  This is extremely data dependent: on my backup
  of 800K files, this occurred with 4 files.  The cause of the problem
  is a bug in the backup program, and that has been fixed.  selftest
  has been changed to detect this problem (requires -v9).  The
  database upgrade will correct files with this problem.

#668 - September 16, 2012 - beta expires December 15, 2012


* validate destination keywords
* new rsync args keyword wasn't working


- bogus keywords in a destination (typos) were silently ignored.
  Unrecognized keywords now generate a fatal error.

- the new rsync Args keyword was not getting inserted into the rsync
  command line.

#664 - September 12, 2012 - beta expires December 15, 2012


* suppress echo when entering passwords
* add args keyword to rsync destinations to insert arguments
* enable/disable commands with config options & admin passphrase
* backup bug fix


- previously, passphrases were read from stdin and displayed during
  keyboard entry.  Now echoing is suppressed and passwords are read
  from /dev/tty.

- rsync destinations have an Args keyword, and any options listed here
  will be inserted into the rsync command line.  For example:
    Args --bwlimit=64

- three new config keywords were added:

  1. admin-passphrase: defaults to ''.  If set to something else, this
     passphrase has to be entered to view or change the config.

  2. disable-commands: (default '').  A comma-separated list of hb
     commands that require the admin passphrase.  For example:

     $ hb config -c hb disable-commands clear,recover,rekey,recrypt,retain,rm

  3. enable-commands: (default '').  A comma-separated list of hb
     commands that do NOT require the admin passphrase to be executed.
     All other commands will require the admin passphrase.  For
     example:

     $ hb config -c hb enable-commands backup,help,ls

  It is an error to set both enable-commands and disable-commands.  It
  is more secure to set enable-commands because if new commands are
  added to hb they will automatically be disabled.  It's not possible
  to disable upgrade because it is not associated with an hb database,
  and that's where the config information is stored.
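
  The rules above can be summarized in a short Python sketch.  It
  assumes admin-passphrase has been set to a non-empty value; the
  function name is hypothetical and this is not hb's implementation.

```python
def needs_admin_passphrase(command, enable_commands="", disable_commands=""):
    """Decide whether `command` requires the admin passphrase."""
    enabled = {c for c in enable_commands.split(",") if c}
    disabled = {c for c in disable_commands.split(",") if c}
    if enabled and disabled:
        raise ValueError("set enable-commands or disable-commands, not both")
    if command == "upgrade":
        return False             # upgrade cannot be disabled
    if enabled:
        return command not in enabled   # anything not listed is restricted
    return command in disabled


print(needs_admin_passphrase("rm", disable_commands="clear,rm"))        # True
print(needs_admin_passphrase("mount", enable_commands="backup,help,ls"))  # True
print(needs_admin_passphrase("ls", enable_commands="backup,help,ls"))     # False
```

  The second call shows why enable-commands is the safer choice: a
  command that is not listed is restricted without further
  configuration.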

- backup: if an hb.db.n file from a previous backup needed to be sent
  to a destination, an error could occur:
    Traceback (most recent call last):
      File "/", line 41, in <module>
      File "/", line 1835, in main
      File "/", line 607, in sync
    NameError: global name 'HBDB' is not defined

#656 - September 10, 2012 - beta expires December 15, 2012

- on Linux, if the random number entropy pool is low, reading from
  /dev/random is supposed to block.  This may return a short read,
  causing problems such as:

    Exception in thread mashq_loop: AES key must be either 16, 24, or 32
    bytes long
      File "/", line 60, in start_thread
      File "/", line 742, in mashq_loop

  Now, HB will stay in a loop until all random bytes are read and will
  print a message every 5 seconds that it is waiting for random data.
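
  A sketch of the read loop idea in Python: keep reading until the
  requested number of bytes has arrived instead of trusting a single
  read.  The helper names are hypothetical, and the periodic "waiting"
  message is left out for brevity.

```python
def read_exact(read, count):
    """Call `read(n)` repeatedly until exactly `count` bytes are collected."""
    chunks, remaining = [], count
    while remaining > 0:
        chunk = read(remaining)
        if not chunk:
            continue             # nothing yet; a real loop would wait and report here
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def short_reads(n):
    """Simulated source that returns at most 5 bytes per call."""
    return b"\x00" * min(n, 5)


key = read_exact(short_reads, 32)
print(len(key))  # 32
```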

#655 - September 10, 2012 - beta expires December 15, 2012


* security is enhanced with a new encryption system
* key can be protected with a passphrase
* old backup data can be recrypted with a different key
* remote files are hardened against remote tampering
* multiple CPU cores are used in get and selftest
* raw block device, partition, and logical volume backups
* performance enhancements
* bug fixes


- NOTE: this rev will do an automatic database upgrade to dbrev 12
  when any HB command is used, to support the new encryption system

- security: HashBackup's encryption system is enhanced in this release
  to prevent a certain kind of information leak that could allow
  someone to determine if your backup contained data in common with
  their backup.  Existing backups remain accessible and future backups
  will use the new encryption system.

  NOTE: to exploit this, someone must have copies of your encrypted
  backup files and *unencrypted* copies of files to test in common.

- security: local and remote copies of dest.db are now encrypted.
  There is little value in hacking this file, which is why it wasn't
  encrypted before, but putting any data on a remote server
  unencrypted is not a good practice.

- security: remote copies of dest.db now have an HMAC (Hashed Message
  Authentication Code) signature.  This signature is generated on
  upload and verified on recovery.

  NOTE: an HMAC signature is similar to a regular hash, like SHA1,
  MD5, etc., but combined with a secret key.  Without knowing the
  key, the HMAC cannot be regenerated by an outsider, whereas other
  hashes can.  HMACs provide a stronger guarantee against tampering
  than regular hashes.
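
  The difference can be seen with Python's standard hashlib and hmac
  modules.  The key and data are placeholders, and HMAC-SHA256 is
  chosen only for illustration; the hash hb uses inside its HMAC is
  not stated here.

```python
import hashlib
import hmac

data = b"contents of a remote file"
key = b"secret key known only to the owner"   # placeholder value

# A plain hash needs no secret: anyone who alters the data can recompute it.
plain = hashlib.sha1(data).hexdigest()

# An HMAC mixes in the key, so a valid tag cannot be produced without it.
tag = hmac.new(key, data, hashlib.sha256).hexdigest()


def verify(key, data, expected_tag):
    """Recompute the HMAC and compare in constant time."""
    actual = hmac.new(key, data, hashlib.sha256).hexdigest()
    return hmac.compare_digest(actual, expected_tag)


print(verify(key, data, tag))                 # True
print(verify(key, data + b"tampered", tag))   # False
```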

- security: previously, database files were encrypted with AES-128 and
  arc files used AES-256.  Now, AES-128 is used everywhere, because:

  -- this was advised after a security review of HashBackup by a
     well-known, published security expert: AES-256 has key schedule
     weaknesses that might make it less secure than AES-128.

  -- although all variants of AES have demonstrated weaknesses, no
     practical attack is known

  -- if you have data that is so valuable that AES-128 is not enough
     protection, it's more likely that the data will be obtained
     through other means such as stealing your computer and/or
     physical coercion rather than obtaining your backup and breaking
     the encryption

  -- a quote from the 2nd reference link below:
     "Recovering a key is no five minute job ... the number of steps
     required to crack AES-128 is an 8 followed by 37 zeroes.
     'To put this into perspective: on a trillion machines, that each
     could test a billion keys per second, it would take more than two
     billion years to recover an AES-128 key,' the Leuven University
     researcher added."

  -- References:

  The encryption change is implemented in a compatible way so that
  existing backups are still accessible.
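
  The figure in the quote above can be checked with simple arithmetic,
  here in Python.  The numbers are taken directly from the quote; the
  365.25-day year is the only added assumption.

```python
steps = 8 * 10**37                   # "an 8 followed by 37 zeroes"
machines = 10**12                    # a trillion machines
keys_per_second = 10**9              # a billion keys per second each

seconds = steps / (machines * keys_per_second)
years = seconds / (365.25 * 24 * 3600)
print(f"{years:.2e} years")          # roughly 2.5 billion years
```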

- init: normally init generates a random key automatically, and stores
  the key in the key.conf file.  But in some situations, others may
  have access to the key.conf file.  Examples are hosted virtual
  private servers, managed servers where someone has root access, and
  Google Drive and other "remote drive" services.

  A new -p option has been added to init and rekey to protect the key
  with a passphrase.  hb init -p ask will get the passphrase from the
  keyboard for every hb command.  Even someone with access to key.conf
  cannot access your backup without also knowing this passphrase.

  The key.conf file has a new format to support passphrases.
  Old-style key.conf files are also accepted for compatibility.


  1. You still must make copies of your key.conf file!

  2. If you write your backup directly to a remote drive like Google
     Drive, the key will also be stored there.  To protect your
     backup, you MUST use a passphrase with the -p option to init.

- to increase security, pbkdf2 key stretching is used.  This may
  introduce a short delay (1 or 2 seconds) on every hb command.  Key
  stretching slows down an outsider's attempts to guess your key or
  passphrase by running both through thousands of one-way hashes.  Any
  attempt to guess a key / passphrase has to repeat all this hashing
  work for each guess.
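
  A sketch of the key stretching described above, using Python's
  standard hashlib.pbkdf2_hmac.  The hash function, salt size, and
  iteration count are illustrative assumptions; hb's actual parameters
  are not given here.

```python
import hashlib
import os

passphrase = b"the fat green martian landed his shiny silver spacecraft"
salt = os.urandom(16)        # random salt, kept alongside the derived data
iterations = 100_000         # illustrative; not hb's actual count

# Each guess at the passphrase must repeat all `iterations` rounds of hashing.
stretched = hashlib.pbkdf2_hmac("sha256", passphrase, salt, iterations, dklen=16)
print(len(stretched))        # 16 bytes, the size of an AES-128 key
```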

  All of HashBackup's security comes from your key.  This is why hb
  init creates a random key by default: it is next to impossible for
  someone to guess a long random key.  Here are some suggestions for
  creating a strong passphrase to further protect your key:

  1. make up a sentence that you will remember and use this as your
     passphrase.  A sentence is easier to type than a password like
     Wjd0$p2^! and is stronger because it is longer.  Length wins over
     weird, hard to type, hard to remember passwords.
     Example: the fat green martian landed his shiny silver spacecraft

  2. make up a sentence that you will remember and use the first
     letter of each word as your passphrase.  For the first sentence
     in this paragraph, that would be: muastywrautfloewayp

  3. adding special symbols (other than spaces) will increase the
     passphrase strength.  One easy way to do this is to use special
     symbols before, after, and/or between words, for example:
     But make up your own special symbol rule.  Even adding just
     one special symbol will increase your passphrase strength.

  4. adding a number, especially in the middle, will increase your
     passphrase strength

  5. use a password manager program.  These store lists of passwords
     and passphrases in an encrypted file, protected by a master
     passphrase.  They often have password generators built in and you
     can cut and paste a passphrase when needed.

  6. To learn more about the importance and methods of choosing a good
     passphrase, do a search for:
     - strong password / passphrase
     - password / passphrase strength
     - password / passphrase entropy

- init: similar to -p ask, -p env can be used.  You first set the
  shell environment variable HBPASS to your passphrase.  For example,
  using the bash shell, you would say: export HBPASS='secret phrase',
  then run hb init -p env.  To make the environment variable
  permanent, export it in .profile in your home directory (for bash).
  Using -p env is less secure than -p ask, because every program you
  run has access to HBPASS.  But -p env is more convenient since you
  only have to set the environment variable once in your login
  session.  Setting HBPASS in .profile is probably less secure than
  typing the export HBPASS=mysecret command after you login.  If you
  do store your password in .profile, restrict file permissions to
        $ chmod 400 ~/.profile
  The comments above about -p ask also apply to -p env.
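
  The reason -p env is less secure can be demonstrated in Python:
  a child process started with the variable in its environment can
  simply read it.  The passphrase below is a placeholder.

```python
import os
import subprocess
import sys

env = dict(os.environ, HBPASS="secret phrase")   # placeholder passphrase

# Any child process started with this environment can read the variable.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['HBPASS'])"],
    env=env, capture_output=True, text=True, check=True,
)
print(child.stdout.strip())   # secret phrase
```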

- init/rekey: it is possible to use -p ask/env with -k ''.  This is
  less secure, because only the passphrase is used for encryption.
  But it is more convenient, because the key.conf file can be
  re-created more easily if it is lost.

- rekey: supports -p ask and -p env.  When using env variables for
  rekey, leave HBPASS set to the old passphrase before giving the rekey
  command; then update HBPASS with the new passphrase after rekey.

- rekey -k will now create a key.conf file if it is missing,
  however, no rekey occurs in this unusual situation.  This may be
  useful when recovering, to create a key file in an empty directory
  without using an editor program.

- rekey now recovers from interruptions.  If rekey is interrupted, no
  hb commands will work until the rekey is retried and completes.

- security: add a time delay if the key is incorrect.  User-generated
  keys and passphrases are not as strong as random keys and this delay
  may help slow down attempts to guess ("brute force") a key.  To
  prevent using the delay as a signal that the key is wrong, HB will
  also randomly delay even when the key is correct.

- backup: creates a new random backup key for every 64GB of backup
  data to avoid using the same backup key "too long".  Backup keys
  are managed by HashBackup and are not in the key.conf file.  In this
  release, a file bigger than 64GB will use only one key.

- recrypt: this is a new command that will re-encrypt backup data with
  the new encryption system.  Recrypt is different than rekey:

  -- rekey creates a new key in key.conf and re-encrypts hb.db; no
     archive files (these contain your backup data) are modified.
     This is used if the key.conf file may have been compromised.

  -- recrypt re-encrypts archive files containing backup data.
     Specifically, recrypt operates on archive files that were
     created before the last rekey command.

  To force re-encryption of all of your data, run rekey first, then
  run recrypt.


- recover: a public file signature (hash) is added to hb.db.n files
  sent to remote destinations.  This allows integrity checks on the
  remote side without the key.conf file, and it's also checked during

- recover: a private file signature (HMAC) is added to hb.db.n files.
  Recover verifies the HMAC to ensure that the file has not been
  changed during transmission or while it was on the remote.  Even
  before this release, tampering with encrypted hb.db.n files would
  have very likely caused a program fault during or after recovery;
  HMAC provides a stronger guarantee against undetected tampering.

- recover: a 2nd private file signature (HMAC) is added to hb.db.n
  files to ensure that the hb.db file created by recover is identical
  to the original.  This can detect more errors, for example, if an
  hb.db.n file is not applied, not applied correctly because of a
  software bug, or a valid hb.db.n file is copied over a different
  hb.db.n file on the remote side.

- get: multiple CPU cores may be utilized during a restore.  This is
  mainly beneficial for bzip2 compression, though restores with normal
  gzip compression are somewhat faster too.  A -p option is added to
  get, like backup; -p0 will use only 1 core.  The default is to use
  all cores.

- selftest: multiple CPU cores may be used, and the -p option was
  added.  The default is to use all cores.

- backup: when a fatal error occurred, backup would sometimes freeze
  after displaying the error message.  This has been improved, though
  may not be completely fixed.  It is a race condition while shutting
  down and hard to manage when an exception occurs.

- backup: the HB build number used to create each backup is now stored
  and displayed by the versions command.  It is not displayed for old

- selftest: renaming hard links a certain way could cause a selftest
  error, "for logid x, hlogid y is also hard-linked".  The backup data
  is actually okay - this was a bug in selftest.

- get: the previous hard link renaming scenario could cause restoring
  a renamed hard link to fail with a file size mismatch or a file hash

- mount: the previous hard link renaming scenario could cause access
  to a renamed hard link to fail with an error "Bad address".

- backup: hb can't yet handle ACLs on zfs, nfsv4, or Windows
  filesystems since their ACLs are not Posix compatible.  FreeBSD
  returns an error "Invalid argument" when these filesystems are
  backed up with hb.  Instead of printing this error on every file, hb
  will only print it once, and print a note that ACLs on this
  filesystem aren't supported.

- backup: on Linux, backing up sshfs filesystems caused a fatal error
  because sshfs returns the wrong error code when reading flags.

- selftest: could fail with an error on line 351:
    TypeError: not enough arguments for format string

- ls: display symlink targets with -l, like ls -l
- ls: display hard links with -lv

- get: if there was a problem reading a symlink target, get would
  abort; it should have printed an error and continued the restore

- raw block device / partition backup is supported.  Before, hb only
  saved the block device information you would see with ls -l.  Now, a
  block device pathname on the command line causes the contents of the
  block device to be saved.  For example:

      hb backup -c backupdir /dev/sda1

  would backup the first partition on disk /dev/sda.  You could also
  backup /dev/sda, which would save all partitions on the physical
  disk sda.

  Before doing a block device backup, make sure the device is not
  mounted.  hb does a basic check for this and won't backup anything
  that is displayed by df -l.  If you do backup a mounted block
  device, you'll likely have a corrupt device on restore.  Logical
  volumes can also be backed up using this method.  To get a clean
  backup without unmounting a LV, make a snapshot LV first and backup
  the snapshot rather than the actual LV.  To create a snapshot LV,
  there must be free space available on the same physical volume
  containing the LV (Linux).

  hb is not yet smart enough to backup only the used blocks in a
  partition.  It is safer (and easier!) to backup all blocks rather
  than reading the filesystem structures on the device to find used
  blocks, because hb doesn't need to know about filesystem details.
  But it can be slower if there is a lot of free space in the
  filesystem.  Image backups are very space-efficient if dedup is

  Dedup with a block size of -B4K works well with most filesystems,
  but the flipside is that this small block size does not compress
  very well.  Depending on the compressibility of your data and the
  amount of data changed, a larger block size like -B1M may work
  better.  Experiment with your actual data to decide which options
  are best.

  If you are backing up several block devices and want dedup to work
  across all of them, for example, each block device has a similar
  Linux VM, you will have to use -B4K, because even when different
  block devices contain the same data, the block placement is not

  hb get will also restore entire block devices.  The block device
  path must be given on the get command line for a full image restore.
  If --orig is used, the data is restored to the same device backed
  up.  To restore to a different device, use:

    hb get -c backupdir /dev/sda1 --todev /dev/sdb1

  The target block device must not be mounted.  If it is mounted, your
  restore will almost certainly trash any file system there, and the
  restore will also be bad because an active filesystem is writing to
  the same device.

  If neither --orig nor --todev are used, a file image is created from
  the backup as if you had done a dd from the device to a file named
  sda1 in the current directory.

  When restoring a block device with lots of free space, it may be
  faster to use -p0 to disable multi-core operation.

  On Mac OSX, the raw read block size is apparently fixed at 4K, so
  reading from a raw block device is slower than reading from the
  normal filesystem.
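
  Why a small block size helps dedup across devices can be sketched in
  Python.  This is a simplified model, not hb's dedup table: it hashes
  fixed-size blocks with SHA1 and counts distinct hashes, and the
  function name and sample images are hypothetical.

```python
import hashlib


def unique_blocks(images, block_size=4096):
    """Count total and distinct fixed-size blocks across several images."""
    seen, total = set(), 0
    for image in images:
        for offset in range(0, len(image), block_size):
            block = image[offset:offset + block_size]
            seen.add(hashlib.sha1(block).digest())
            total += 1
    return total, len(seen)


blk_a = b"A" * 4096
blk_b = b"B" * 4096
blk_c = b"C" * 4096
image1 = blk_a + blk_b + blk_c
image2 = blk_c + blk_a + blk_b      # same data at different positions

print(unique_blocks([image1, image2], block_size=4096))      # (6, 3)
print(unique_blocks([image1, image2], block_size=3 * 4096))  # (2, 2)
```

  With 4K blocks, the second image adds no new blocks; with one large
  block per image, nothing is shared.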

- backup: to avoid confusion, display a message if dedup is not
  enabled, and display dedup utilization statistics when it is
  enabled.  The statistics show whether -D would benefit from an
  increase in the dedup table size.

- get: when a file with special flags was restored, get would display
  an error: 'module' object has no attribute 'chflags' on BSD and OSX.

- selftest: could display an error message:
     Error: blockid n has version v
  where v is a version that was deleted.  The backup is fine and would
  restore correctly; this was a bug in selftest.

- backup: performance improved ~8% for small block sizes (VM images).

- ls: if -r was used to display a specific backup version and a
  directory was not backed up in that version but was backed up in an
  earlier version, ls would sometimes not display the directory at
  all.  Also, if a file in one backup was replaced with a directory by
  the same name in a later backup, ls -a would display the directory,
  but not the earlier file backup.  It should have displayed both.

#550 - April 9, 2012 - beta expires September 15, 2012

- backup: a new -Z option controls compression.  The default is -Z3,
  which is equivalent to the old behavior.  The possible values are:

    -Z0 = disable compression (replaces --no-compress option)
    -Z1-7 = gzip level 2-8
    -Z8 = bzip2 level 1
    -Z9 = bzip2 level 3*

  Dedup is not affected by the type or level of compression used, and
  different kinds/levels of compression can be intermixed in the same
  -c backup directory without problems.


  -- using -Z often triples backup time and doubles restore time,
     especially with bzip2.

  -- on a multi-core system, HB will use all cores with bzip2.  With
     gzip, HB will only use 2-4 cores by default.  You can raise this
     with -p but should probably experiment with your system first.

  -- disabling compression with -Z0 may be slower than -Z1 on files
     that are compressible, because more backup data is written; only
     use -Z0 if you are very sure that your files will not compress

  For even more control, there is an alternate -Z syntax:

    -Zgz   = use gzip at hb's recommended level
    -Zgz,n = use gzip at level n (1-9)
    -Zbz   = use bzip2 at hb's recommended level
    -Zbz,n = use bzip2 at level n (1-9)

  * In tests, higher bzip2 levels use more memory and take more time,
    but don't seem to improve compression ratios very much over level
    1; gzip level 9 takes longer but is rarely better than level 8.
    So don't use -Zbz,9 thinking you will get the best results; often
    it will just make your backup and restore take longer.
    Compression is always data dependent, so testing with your actual
    data is the only good way to select a compression type & level.
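
  The numeric -Z table above, written as a Python lookup.  The
  function name and the tuple return value are illustrative choices;
  only the mapping itself comes from the table.

```python
def compression_for(z):
    """Map a numeric -Z value (0-9) to (method, level) per the table above."""
    if z == 0:
        return (None, 0)             # compression disabled
    if 1 <= z <= 7:
        return ("gzip", z + 1)       # -Z1..-Z7 -> gzip levels 2..8
    if z == 8:
        return ("bzip2", 1)
    if z == 9:
        return ("bzip2", 3)
    raise ValueError("-Z must be between 0 and 9")


print(compression_for(3))   # ('gzip', 4), the default
print(compression_for(9))   # ('bzip2', 3)
```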

- the --no-compress option is obsolete and should be changed to -Z0.
  It will still be honored for a few months to give everyone time to
  change cron jobs, scripts, etc.  -Z has priority over --no-compress.

- backup: compression was often disabled when backing up files on
  single CPU systems or when -p0 was used.  This bug was just noticed
  but has been crawling around since #339.

- backup: if dedup is not initially used, but is used in later
  backups, a scan of all blocks was occurring to update the dedup
  table even though nothing would change.  This scan is now avoided.

#543 - April 5, 2012 - beta expires September 15, 2012

- compare: sometimes showed files as new even if excluded in inex.conf

#542 - April 3, 2012 - beta expires September 15, 2012

- get: permissions were set after a file was restored.  For large
  files that might take a while to restore, the permissions were lax
  during the restore.  Now the permissions bits are correct during the

- get: similar to above, directory permissions were set after a
  directory and all its contents were restored, and were too lax
  during the restore.  Unlike a file, directory permissions cannot be
  set correctly while the directory is being restored.  For example,
  if restoring a directory with r-x permission, it would not be possible
  to restore the directory contents because w access is needed.  So
  during the restore, a directory's permissions will be set to rwx for
  the owner, none for others.  Then after the restore is complete, the
  correct permissions are set.
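
  The two-step approach for directories can be sketched in Python.
  The helper name and the sample file are hypothetical; the point is
  only the order of operations: owner-only rwx while restoring, the
  saved mode afterwards.

```python
import os
import stat
import tempfile


def restore_directory(path, final_mode, files):
    """Create `path`, fill it, then apply the saved permission bits."""
    os.mkdir(path, 0o700)                  # rwx for the owner only during the restore
    os.chmod(path, 0o700)                  # mkdir's mode is subject to umask; be explicit
    for name, data in files.items():
        with open(os.path.join(path, name), "wb") as f:
            f.write(data)
    os.chmod(path, final_mode)             # e.g. r-x, which could not be set earlier


with tempfile.TemporaryDirectory() as top:
    target = os.path.join(top, "restored")
    restore_directory(target, 0o555, {"a.txt": b"hello"})
    print(oct(stat.S_IMODE(os.stat(target).st_mode)))   # 0o555
    os.chmod(target, 0o700)                # allow cleanup of the temp directory
```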

- during a sync operation, where archive files from a previous
  operation needed to be sent to remotes, files were not sent in

#540 - April 2, 2012 - beta expires September 15, 2012

- mount: reading files through an hb mount point was very slow for
  large files backed up with fixed block sizes.  Here are performance
  comparisons for a 10GB VM image saved with 4K blocks (smaller block
  sizes benefit more than larger block sizes, but both are faster):

  Read at offset 0:
    was: 258215424 bytes transferred in 30.039307 secs (8.595918 Mbytes/sec)
    now: 550768128 bytes transferred in 30.244884 secs (18.210291 Mbytes/sec)

  Read at offset 512M:
    was:  22285824 bytes transferred in 32.114154 secs (693.956 Kbytes/sec)
    now: 693599744 bytes transferred in 30.717192 secs (22.580181 Mbytes/sec)

  Read at offset 1G:
    was:   6557184 bytes transferred in 31.744395 secs (206.562 Kbytes/sec)
    now: 577826304 bytes transferred in 31.432984 secs (18.382801 Mbytes/sec)

#539 - March 31, 2012 - beta expires September 15, 2012

- backup: some future changes were committed by mistake in #537 and
  were backed out

#538 - March 31, 2012 - beta expires September 15, 2012

- mount: reading a file sequentially is twice as fast for files backed
  up with -D.  This is on top of the improvements in #537, where an
  n^2 algorithm was replaced by an n log n algorithm.
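
  The changelog does not describe the algorithm itself.  As a general
  illustration of how this kind of speedup is usually obtained, the
  sketch below maps a file offset to a block with a binary search
  instead of a linear scan, so n reads cost about n log n steps rather
  than n^2.  The names build_index and locate are hypothetical.

```python
import bisect
from itertools import accumulate


def build_index(block_sizes):
    """Return the starting file offset of each block."""
    # Running totals: block i starts where blocks 0..i-1 end.
    return [0] + list(accumulate(block_sizes))[:-1]


def locate(starts, offset):
    """Map a file offset to (block number, offset within that block)."""
    # Binary search for the last block whose start is <= offset.
    i = bisect.bisect_right(starts, offset) - 1
    return i, offset - starts[i]
```

  For blocks of 10, 5, and 20 bytes, the index is [0, 10, 15], and
  offset 12 falls 2 bytes into block 1.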

#537 - March 31, 2012 - beta expires September 15, 2012

- backup: the config keywords no-dedup-ext, no-compress-ext, and
  no-backup-ext specify the extensions of files that you don't want to
  dedup, compress, or back up.  The expected way of specifying these
  was: hb config -c backupdir no-dedup-ext 'jpg,jpeg'.  Matching of
  suffixes is case-independent.  The change in this release is that
  extensions can also be specified with spaces and/or with leading
  periods, so this is valid:
      hb config -c backupdir no-dedup-ext 'jpg jpeg, deb .iso'
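
  A minimal Python sketch of the parsing rules described above (commas
  and/or spaces as separators, optional leading periods,
  case-independent matching).  The function names are hypothetical and
  this is not HashBackup's actual parser.

```python
import re


def parse_ext_list(value):
    """Turn a config value like 'jpg jpeg, deb .iso' into a clean list."""
    exts = []
    # Split on any run of commas and/or whitespace.
    for token in re.split(r"[,\s]+", value.strip()):
        token = token.lstrip(".").lower()   # '.ISO' and 'iso' are the same
        if token and token not in exts:
            exts.append(token)
    return exts


def has_listed_ext(path, exts):
    """Return True if the file's suffix is in the list, ignoring case."""
    name = path.rsplit("/", 1)[-1]
    if "." not in name:
        return False
    return name.rsplit(".", 1)[-1].lower() in exts
```

  Under these rules, 'jpg,jpeg' and 'jpg .jpeg' produce the same list,
  and a file named Photo.JPG matches either one.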

- backup: if a backup is terminated with kill -9, it doesn't get a
  chance to clean up archive files; the next backup could print a
  negative number for the space used if the successful backup was
  smaller than the terminated backup

- hb would not upgrade rev 9 databases to rev 11; now it will

- security: hb would display a specific error message if a padding
  error was detected during decryption.  This can often be used in a
  "padding oracle" attack.  It doesn't exactly apply to hb because an
  attacker must be able to request decryption of chosen ciphertext,
  and hb will not perform decryption without a correct key file.  But
  as a precaution:
  -- hb will no longer distinguish between padding errors, decompression
     errors, and hash mismatches; all of these will cause a hash mismatch
  -- padding now uses random bytes
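
  The idea of reporting every failure the same way can be sketched as
  below.  This is an assumption-laden illustration, not HashBackup's
  code: decryption is omitted, the padding layout (random bytes followed
  by a length byte, 16-byte blocks) is assumed, and HashMismatch,
  pad_random, unpad, and open_block are hypothetical names.

```python
import hashlib
import os
import zlib

BLOCK = 16  # assumed cipher block size


class HashMismatch(Exception):
    """The only error a caller ever sees."""


def pad_random(data, block=BLOCK):
    # Random filler bytes, then one byte giving the total pad length.
    n = block - len(data) % block
    return data + os.urandom(n - 1) + bytes([n])


def unpad(data, block=BLOCK):
    n = data[-1] if data else 0
    if not 1 <= n <= block or n > len(data):
        raise ValueError("bad padding")
    return data[:-n]


def open_block(padded, expected_sha256):
    """Unpad, decompress, and verify; collapse all failures into one."""
    try:
        plain = zlib.decompress(unpad(padded))
    except (ValueError, zlib.error):
        # Padding and decompression errors are not reported separately.
        raise HashMismatch("hash mismatch") from None
    if hashlib.sha256(plain).hexdigest() != expected_sha256:
        raise HashMismatch("hash mismatch")
    return plain
```

  Because the caller cannot tell which check failed, the error message
  gives no hint about whether the padding was valid.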

- if a directory destination (Type dir) in dest.conf didn't exist, it
  caused a confusing error message like:

    dest hb2: error in send arc.0.0: [Errno 2] No such file or directory: '/Users/jim/hbdir/arc.0.0.tmp'

  Now it will print a more direct error:

    dest hb2: Traceback (most recent call last):
      File "/", line 75, in loop
      File "/", line 22, in loopinit
    err: dir(hb2): directory doesn't exist: /Users/jim/hbdir

- if an error occurred while initializing a destination, the
  destination name was not always included in the error message

- mount: when a large file was backed up with -D, reading the file
  via an hb mount point would take a long time

#526 - March 1, 2012 - beta expires June 15, 2012

- backup: on BSD and OSX, hb indirectly used sysctl (a system command)
  to determine the number of CPU cores.  But sysctl may not be
  available to cron jobs with the default path, so PATH had to be
  changed in the crontab file.  A different method is used now that
  doesn't require setting PATH.
  NOTE: this change was supposed to go in release #512

- backup: display an error message rather than a traceback when
  non-integer values are used for keywords in dest.conf that are
  supposed to have integer values, such as a port number.

- get, ls: using -rX where X is a deleted version would cause a
  traceback.  Now it displays an error that the version doesn't exist.

#520 - February 21, 2012 - beta expires June 15, 2012

- IMPORTANT NOTE: this release contains critical fixes to the recover
  feature.  Recover is needed when some or all of the local backup
  directory is lost, to recover data from a remote backup.  Everyone
  should apply this upgrade if using a remote backup (in dest.conf file)

- recover: sometimes an error could occur during recovery:
    Exception: Incremental db missing? 787521536 > 785825792: /hbtest/hb.db.163
  The backup data is all fine, but there was an incorrect test in the
  recovery code that is now fixed.

- recover: in an unusual situation where backups are running but a
  destination is unavailable (so no files are being transferred), then
  the destination becomes available and you recover from the
  destination before backup is able to sync the destination, it could
  cause the error:
    OSError: [Errno 2] No such file or directory: '/hbbackup/hb.db.229'
  This has been corrected.

- backup: socket pathnames caused a "Pathid xxx unused" error in
  selftest.  This is harmless and selftest would remove the unused
  paths, but the cause is now fixed

- selftest: in the previous release, selftest level 4 verified that
  every file block could be decrypted and uncompressed, and that the
  block and file checksums were correct.  For backups with a lot of
  dedup, VM backups for example, the same data block might need to be
  decrypted, uncompressed, and hashed many times to compute the file
  checksum.  Now, -v4 will only process each block once and will not
  verify whole file checksums.  -v5 will verify file-level checksums
  and is equivalent to -v4 in previous releases.  Running selftest
  without -v is the same as -v5, the highest level of checking.

- backup: HB is more efficient about storing hb.db.nnn files.  The
  first backup after installing this release may delete more hb.db.nnn
  files than usual.  Keep in mind that the hb.db.nnn file sequence
  numbers do not necessarily correspond to the backup version, ie,
  backup #5 may create hb.db.7.  This change will also make recovery
  times shorter.

#515 - February 7, 2012 - beta expires June 15, 2012

- backup: sometimes would report a database locked error when
  synchronizing old backup files to a destination

#513 - December 10, 2011 - beta expires March 15, 2012

- extend beta expiration date to March 15

#512 - September 4, 2011 - beta expires December 15, 2011

- NOTE: this rev will do an automatic database upgrade to dbrev 11.
  Any previously backed up socket files are deleted during the
  upgrade, since these cannot be restored by hb get and generated
  errors during a restore.

- still working on the option of remote-only backups

- backup: backups on BSD failed with an error message: object has no
  attribute 'setbackup'

- get: in a large restore of thousands of files, an error could
  sometimes occur on a few files:

    Warning: partially restored file: (filename shown here)
    Exception: free variable 'cur' referenced before assignment in enclosing scope
    Continuing restore

- backup: on BSD and OSX, hb indirectly used sysctl (a system command)
  to determine the number of CPU cores.  But sysctl may not be
  available to cron jobs with the default path, so PATH had to be
  changed in the crontab file.  A different method is used now that
  doesn't require setting PATH.

- in #505, a 30-second timeout was added to rsync destinations.  On a
  very slow target (an unslung NSLU2 NAS), rsync might repeatedly
  time out when sending an updated archive after a retain or rm
  operation.  Now, the default timeout is 3600 seconds (1 hour), but
  it can be changed with the Timeout keyword in dest.conf for rsync

- ssh destinations sometimes failed with authentication errors when
  OSX or BSD tried to connect to a remote ssh server running CentOS
  5.5.  To improve compatibility, hb now uses the system sftp program
  instead of connecting directly to the remote ssh server.

- socket files were backed up in previous versions of hb, but these
  can never be restored and hb get would generate an error when it
  tried to restore sockets.  Sockets are no longer backed up.

#510 - May 22, 2011 - beta expires September 15, 2011

- development of the "remote only" backup option is not quite ready,
  so this very minor update is being issued to extend the beta
  expiration date to September

- a few doc files were updated

- backup -c <dir> to an empty directory (no hb init) creates hb.lock,
  but then hb init refused to run because the directory was not empty.
  Now, hb.lock is ignored by init

#505 - Apr 13, 2011 - beta expires June 15, 2011

- NOTE: this rev will do an automatic database upgrade to dbrev 10.
  Some items were moved from archives to the main database, so many
  archives may have items removed and your hb.db file may grow

- IMPORTANT USAGE NOTE: if a new destination is added to an existing
  backup, the new destination is not fully synchronized until after
  the next successful backup.  This has not changed, just making
  everyone aware

- COMPATIBILITY NOTE: in previous versions, the hb executable was
  copied to all remote destinations if it changed.  Now, the hb
  executable is only copied if the config variable copy-executable is
  True.  The default in this version is False, which is a change in
  behavior.  To re-enable this, use: hb config copy-executable True

- Google Storage for Developers, an Amazon S3-like service currently
  in beta and available by Google invitation only, is now supported.
  The destination type is gs, the dest.conf config variables are
  accesskey, secretkey, and bucket, as with S3 destinations, and the
  environment vars are GS_ACCESS_KEY_ID and GS_SECRET_ACCESS_KEY;
  environment vars are only used if accesskey and secretkey are not
  specified in dest.conf. For more information, see

  NOTE: Google Storage for Developers is not the same as Google
  Storage for Docs.  HashBackup does not yet support using Google
  Storage for Docs as backup space.

- S3-compatible services are supported with the new Host and Port
  config variables on an S3 destination.  This can be used with
  Eucalyptus' Walrus S3-like service for example.  Eucalyptus / Walrus
  is an open source S3 clone that provides an S3-like service.  For
  more information see

  NOTE: the Host and Port keywords are not necessary with destination
  types s3 or gs, and will default to the correct Amazon or Google
  values.  For other S3-like services, Host and/or Port are required.

- Rackspace Cloud Files storage is supported in this version.  The
  destination type is cf, Userid and Accesskey keywords are required,
  and Container is where you want your backup stored.  Unlike S3 and
  Google Storage, Cloud Files containers are per-userid, so there is
  no need to find a globally unique name.  Rackspace charges for
  incoming bandwidth are about half of Amazon S3 and Google Storage.
  For more information see

- rsync destinations will time out after 30 seconds if the remote is
  unresponsive.  Before, it took a very long time for an rsync
  destination to time out

- if a local archive file is missing during a get (restore), hb
  automatically downloads it as needed; this is not new.  Destinations
  in dest.conf should be listed fastest download speed first, and the
  fastest destination with the most up-to-date version of a file is
  the one that will be selected for downloading.  Before this release,
  the destination chosen was unpredictable.  This does not apply to
  recover (get all files) since recover uses only the destination you

- if a destination had a failure, hb would display an error and
  process the next request for that destination.  Now hb will try
  requests up to 3 times, and if they all fail, the destination is
  stopped and no more requests are sent to it.  hb will "catch up" the
  destination on the next backup.
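
  The retry-then-stop behavior described above can be sketched as
  follows.  The Destination class and its send callable are
  hypothetical; only the "up to 3 tries, then stop the destination"
  rule comes from this note.

```python
class Destination:
    MAX_TRIES = 3

    def __init__(self, name, send):
        self.name = name
        self.send = send        # hypothetical callable that transfers one item
        self.stopped = False

    def request(self, item):
        """Try one request; return True if it succeeded."""
        if self.stopped:
            return False        # skipped: the next backup will catch up
        for attempt in range(1, self.MAX_TRIES + 1):
            try:
                self.send(item)
                return True
            except OSError as err:
                print(f"dest {self.name}: attempt {attempt} failed: {err}")
        # All tries failed: stop sending requests to this destination.
        self.stopped = True
        return False
```

  Once a destination is stopped, later requests return immediately
  without touching the network, so other destinations are not delayed.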

- when multiple destinations were configured and an error occurred on
  one destination, the other destinations continued to work correctly;
  but once the working destinations were finished, a backup would
  sometimes "hang" waiting for the failed destination to finish (it
  never would finish since it failed)

- selftest -v3 sometimes displayed the backup size larger than the
  actual size.

- selftest: data structures created during selftest are more

- when backup is waiting for destinations to finish, it prints a list
  of destinations that are still busy with transfers, and updates this
  list as destinations finish their work

- display a message when the hb executable program is being copied to
  destinations; this can sometimes take a while, and it isn't always
  obvious why there is a longish delay

#487 - Mar 27 2011 - beta expires June 15, 2011

- S3 destinations could fail with a traceback like:
    File "/", line 38, in __init__
    File "/", line 26, in baseinit
    TypeError: string indices must be integers, not str

#486 - Feb 20, 2011 - beta expires June 15, 2011

- in some situations, the db upgrade procedure of #485 could cause
  hb.db to be removed by mistake

#485 - Feb 15, 2011 - beta expires June 15, 2011

- NOTE: this rev will do an automatic database upgrade to dbrev 9

- the major new feature in this release is incremental transmission of
  hb.db, the main HashBackup(TM) database.  The major new feature of
  the next release is making optional the local copy of the backup.

- COMPATIBILITY NOTE (backup): the inex.conf file previously allowed
  excluding files from the backup and overriding those excludes with
  includes.  There were several bugs and points of confusion, so this
  was simplified by eliminating the include keyword: now, files can
  only be excluded.  To include a file that would be excluded by an
  inex.conf rule, list the pathname on the backup command line.
  Include processing may be added again later, depending on user
  feedback.  As a side benefit, the incremental backup file scan to
  find modified files is 10% faster.

- COMPATIBILITY NOTE (backup): in previous releases, data from a prior
  backup was often used for dedup even if the -D option wasn't
  specified.  Now, the -D option is required to enable dedup.  Backup
  data created *without* -D is not used to dedup future backups, in
  most cases.

- config: a new config parameter, no-backup-ext, has been added.  This
  is a list of filename extensions that should not be backed up.  For
  example, you might use:
       hb config -c /hb no-backup-ext avi,mov,o
  to skip backup of files ending with .o, .avi, and .mov.  This is
  faster than using exclude patterns like ex *.o in inex.conf.

- config: a new config parameter, dedup-mem, sets the default amount
  of memory to use for dedup operations.  This can be overridden by
  backup's -D option.  The default is zero, for no dedup.  In this
  release it's also possible to have dedup-mem set to some value you
  usually use, like 1gb, and use the -D0 backup command line option to
  disable dedup just for that backup.  See doc/ for more
  information about the amount of memory to use for dedup.

- a new command, hb init, is required to initialize the -c backup
  directory before the first backup.  This allows modifying exclude
  rules before the first backup, and allows setting the encryption key
  to a user-specified value using the -k option rather than hb
  choosing a random key.

  SECURITY NOTE: for higher security, it is recommended that you let
  hb init choose a random key string, as with previous releases.  Or,
  you can choose a long phrase that is easy to remember for your key,
  for example: my cat chases my dog.  For less security, you can use
  -k '', which specifies encryption with a blank key.  With a blank
  key, there's no need to store the key securely.  Spaces are removed
  from the key, so key abc def is the same as abcdef.

- a new command, hb rekey, can be used to change the database
  encryption key.  Usage is similar to hb init.  After the database is
  rekeyed, it is transmitted to any remote destinations you may have
  set up in dest.conf.  If you want some security but don't want to
  fiddle with storing encryption keys separately, you could rekey to
  an easily remembered phrase like 'my dogs name is spot'.  For less
  security, you could rekey to the blank key ''.

- backup: the Freq keyword is no longer supported.  This was used to
  defer transmits offsite, for instance, once per week.  The only time
  or bandwidth savings was for transferring the hb.db file, and this
  is now sped up in a different way.

- sending incremental backups offsite is much faster, especially if
  not many files were modified.  For rsync destinations, the first
  backup after the upgrade will be much slower than usual, while other
  methods (ftp, etc) will be about the same.  After that, it will be
  faster than usual for all destination types.

- clear: files stored on destinations are also removed; a new warning
  about this is displayed unless --force is used

- mount: when accessing the mounted HB filesystem, data near the end
  of a file would sometimes not be returned correctly.  The backup
  itself was fine - only accessing it via mount was affected.

- backup: dir destinations were creating the target directory, even
  though dest.conf.example said the directory had to exist

- backup: if a pathname on the command line is a symbolic link, backup
  saves both the symlink and its target.  For example, on BSD systems,
  /home is a symlink to /usr/home, so a backup of /home saves both the
  /home symlink and the complete /usr/home tree.  The new behavior in
  this release is that if a backup pathname *contains* a symlink, the
  pathname is resolved and the target pathname is used instead.  So
  for example, if /home/jim is saved, a message is displayed that the
  pathname was changed to /usr/home/jim, and this tree is saved.  No
  /home/jim pathnames will appear in the backup, since they don't
  actually exist in the filesystem.
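
  A rough Python sketch of this rule: symlinks in the parent
  directories are resolved, while the final path component is left
  alone so that a path that is itself a symlink is still seen as one.
  The function name is hypothetical and hb's real logic may differ.

```python
import os


def resolve_backup_path(path):
    """Return (path to back up, True if it was changed)."""
    path = os.path.abspath(path)            # also drops trailing slashes
    parent, name = os.path.split(path)
    # Resolve symlinks in the parent directories only.
    resolved = os.path.join(os.path.realpath(parent), name)
    changed = resolved != path
    if changed:
        print(f"Pathname changed: {path} => {resolved}")
    return resolved, changed
```

  On a system where /home is a symlink to /usr/home, /home/jim would
  resolve to /usr/home/jim, while /home itself would be left unchanged.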

- ls would not display the contents beneath a symlink to a directory.
  For example, on BSD systems, /home is a symlink to /usr/home.  If
  /home/jim was given to the backup program, all of /home/jim would be
  saved, but ls would only print / and /home.  This situation cannot
  occur going forward, because in this release, the backup program
  resolves the symlink in the pathname (see previous note).

- ls displays "(parent, partial)" when a directory is backed up
  because it is the parent of a file that was requested.  For example,
  if /Users/jim/x is backed up, /, /Users, and /Users/jim are also
  listed in the backup, all marked "(parent, partial)"

- get: if a file or directory being restored already exists, get
  prints a warning.  If the user interrupted the restore, deleted or
  renamed the existing file, and then continued the restore, get would
  fail when trying to delete the old object

- backup: sometimes a hard-linked file would be saved on every backup

- backup: create hash.db without execute permission bits

- rm & retain: if an archive was removed while some destinations were
  offline or inaccessible, it was not removed later when the
  destinations were accessible

- rm & retain: if one file such as /Users/jim/x is backed up and then
  removed, a small archive was left in the backup directory if there
  were extended attributes or ACLs on any of the parent directories

- error messages were sometimes being sent to stdout instead of stderr,
  which can be an issue for scripting hb

#426 - Dec 9, 2010 - beta expires March 15, 2011

- ls: failed when a file or pathname was listed on the command line.
  This bug first occurred in #408.

- versions: with no options, display the most recent 5 backups vs 1

- in #416, the upgrade command was changed to set the owner of the hb
  executable as it was before the upgrade.  But a typo caused the error:
     AttributeError: 'module' object has no attribute 'state'
  After this error, there will be an hb.tmp file in the same directory
  as the old hb executable.  To finish the upgrade, do:  mv hb.tmp hb

#416 - Dec 5, 2010 - beta expires March 15, 2011

- backup: if a single zero-length file was saved, the empty archive
  file created should have been deleted

- delay transmitting archives after a retain or remove if they don't
  shrink very much.  This was unintentionally removed at #408

#410 - Nov 29, 2010 - beta expires March 15, 2011

- in release #408, the help command didn't work if the fuse library
  was missing.  Now, only the mount command will fail when fuse isn't

#408 - Nov 26, 2010 - beta expires March 15, 2011

- COMPATIBILITY NOTE: the undocumented --no-dupcheck option to backup
  has been removed, as no dedup is now the default and has been for
  several releases.  To enable dedup, use the -D option, for example,
  -D1g will dedup using 1GB of memory.

- this version of HashBackup has a new archive format that uses up to
  9% less space to backup the same amount of data from VM images, and
  is also faster to create and access.  Old format archives are still
  supported and are converted to the new format when they are
  accessed.  You can leave old format archives on remotes, and when
  recovered, hb will convert them.  If you want to force all local
  archives to be converted immediately, use hb selftest -v3 after
  installing this new version.  The converted archives will not be
  uploaded to destinations unless they shrink much smaller than the
  remote archive.  If you want to force all converted archives to be
  uploaded, run selftest as described, then delete dest.db from the
  backup directory.  All archives will be uploaded during your next
  backup.

  NOTE: during conversion, new format archives are created in a temp
  file and then replace the original.  Because it would double your
  backup space requirements, no copies of the original archives are
  kept.  If you want a backup of your old archive files, copy them
  before running this version of hb.

- a new command "config" will display HashBackup's config settings.
  These can also be changed with the config command to modify the
  operation of hb.  In this release there are several config settings:
  arc-size-limit: this controls how large individual archive files can
  grow before a new archive file is started.  The default is 1gb, as
  before.  The minimum value is 1mb and the maximum value is 8gb.  Be
  careful not to set the size larger than your remote destinations can
  handle, for example, Amazon S3 is limited to 5gb file sizes, so the
  archive size limit should probably be 4gb at most.  GMail has a size
  limit of 25mb, so a limit of 20mb should be used.

  no-dedup-ext: files with a suffix listed here will not be deduped.
  Any suffixes listed here will add to the built-in list that backup
  uses.  Suffixes should be listed with commas but without periods,
  for example: hb config no-dedup-ext bz2,bz

  no-compress-ext: files with a suffix listed here will not be
  compressed.  For example: hb config no-compress-ext avi,mp3

  Config settings are versioned and match the corresponding backup.
  This allows you to see the config settings used for any prior
  backup, and to revert to the config settings of a prior backup.

- a new command "compare" will compare a filesystem path to a backup
  and display any differences.  Compare only supports comparison to
  the current backup version, though in the future it may be extended
  to compare to older backups ("What has changed since -r10?")

- the built-in help has been cleaned up

- if recover -a is used to fetch files from remote destinations, and
  .old files exist in the local backup directory, recover would warn
  that archive files exist and would be renamed, even if they didn't
  really exist; the .old files were confusing recover.  This could
  happen if recover was executed twice for example.  Everything
  worked, just the warning was sometimes incorrect.

- when using a GMail account to store backups, hb.dbz and dest.dbz
  would keep appending to their email conversation, instead of
  replacing the old files.  HB was never designed to work this way,
  and didn't work this way with other email providers.  Now, there
  will only be 1 hb.dbz and 1 dest.dbz, to save space in the GMail
  account.  Be sure to set arc-size-limit (see above) when using an
  email account to store backups.

- for future upgrades, hb upgrade will display only the changes since
  the version of hb that is already installed.  So for example, if you
  are at #406, miss the upgrade to #407, and do the upgrade to #408,
  it will display only the changes in #407 and #408 - not the whole
  change log.

- with release #339, the upgrade command sometimes had trouble finding
  and replacing the hb executable.  As a workaround, use:
     /path/to/hb upgrade

  to upgrade the executable, ie, a full pathname.  This was fixed in
  #346, and upgrade should work fine with that release.

#346 - Nov 3, 2010 - beta expires March 15, 2011

- the October build was created on newer versions of Linux and BSD 8.
  Unfortunately, on Linux it required glibc 2.7, but CentOS, RHEL, and
  other versions of Linux do not have this newer C library.  This
  version of HashBackup was built on CentOS 5.5 and should run on both
  older and newer versions of Linux.  The BSD version was built on BSD
  7, and should run on both BSD 7 and 8.

- new command "upgrade" upgrades your hb executable to the latest
  version.  This should be run from a userid with sufficient
  permission to replace the hb command.  For example, if hb is in
  /usr/local/bin, you may need to run hb upgrade as root.  The old hb
  executable will be renamed to hb.bak after a successful upgrade.

- backup: on Mac/OSX, files would sometimes be backed up that hadn't
  changed.  OSX changes ctime on files, and this is what hb used to
  detect changes.  This is fixed with a more detailed test using both
  ctime and mtime.

- backup: related to the previous issue, some sites may not want to
  trust mtime, because it can be set by user programs.  For example, a
  file's data can be changed and then mtime reset to its previous
  value; this makes it appear that the file data hasn't been changed.
  It's usually fine to trust mtime, but if your site does not, use the
  --no-mtime backup option and hb will compare the entire file to the
  previous backup when mtime is the same, to ensure the file data is
  also still the same.  --no-mtime will make your backup run somewhat

#339 - Oct 9, 2010 - beta expires December 15, 2010

- NOTE: this rev will do an automatic database upgrade to dbrev 8

- backup: a new variable block dedup method allows HashBackup to dedup
  more files, such as:
  - office document edits
  - tagged mp3, m4a (iTunes) files
  - email attachments
  - database dumps
  - uncompressed tar files
  - gzip --rsyncable files

  Variable block dedup is enabled when -D is used on multi-core
  systems, for example, -D1g to use 1gb of memory for dedup.

  Variable block dedup is not used:
  - for files smaller than 128K
  - when -p0 is used to disable multicore backup
  - on single-core systems
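
  This note does not say how variable block boundaries are chosen.  A
  common technique for this kind of dedup is content-defined chunking,
  sketched below as a general illustration; it is not necessarily what
  HashBackup does.  WINDOW and DIVISOR are made-up values, and real
  implementations normally use a rolling hash plus minimum and maximum
  chunk sizes rather than recomputing a CRC at each position.

```python
import zlib

WINDOW = 48     # bytes examined at each position (assumed value)
DIVISOR = 64    # average chunk is roughly this many bytes (assumed value)


def chunk(data, window=WINDOW, divisor=DIVISOR):
    """Split data at positions chosen by the content itself."""
    chunks, start = [], 0
    for end in range(window, len(data) + 1):
        # A boundary depends only on the last `window` bytes, so bytes
        # inserted or deleted earlier in the file do not move it
        # relative to the surrounding data.
        if zlib.crc32(data[end - window:end]) % divisor == 0:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])         # whatever is left at the end
    return chunks
```

  With fixed blocks, inserting one byte at the start of a file shifts
  every later block.  With this scheme, only the first chunk or two
  change and the rest still match earlier backups, which is why edited
  documents and tagged media files dedup better.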

- backup: with rev #321, an error message:
      OperationalError: database arc is locked
  was sometimes displayed on incremental backups larger than 1GB.

- add the Port option to ftp destinations, userid and password are
  optional for anonymous ftp access

- a new hb command sha256 computes the sha256 hash for a file.
  This is handy if your OS doesn't have a built-in sha256 command.
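
  For comparison, the same hash can be computed with Python's standard
  hashlib module.  This is a sketch for cross-checking results, not
  hb's implementation; sha256_file is a hypothetical name.

```python
import hashlib


def sha256_file(path, bufsize=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in 1 MB pieces."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        # Read until f.read() returns an empty bytes object (end of file).
        for block in iter(lambda: f.read(bufsize), b""):
            h.update(block)
    return h.hexdigest()
```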

- ls accepts a new -v option.  When combined with -l (long display),
  -v also displays the inode, ctime, and sha256 hash

- backup: on Mac/OSX, if backing up / and the current directory was
  not /, an incorrect error message was displayed, like:
     Pathname changed: / => /Users/jim/
  This bug was introduced in r288

- backup: if a single zero-length file was saved in version 0, or all
  backup data was removed with hb rm /, the next backup would
  immediately fail with an error

#321 - Sep 20, 2010 - beta expires December 15, 2010

- exclude /home/*/.gvfs for Linux

- backup: -B16k displayed an error message

- backup: fixed bug when writing to remote destinations:
    ImportError: No module named dump

- rm: the current version was always printed with the "you are not the
  owner" error message; now the correct version is displayed

#311 - Sep 3, 2010 - beta expires December 15, 2010

- NOTE: this rev will do a minor automatic database upgrade to dbrev
  7.  If you use -D, your first backup after this upgrade may be
  slower to get started while it rebuilds the dedup database.

- a 64-bit build of hb is now available for Linux and BSD; the Intel
  OSX build is already 64-bit.  The advantage is that 64-bit builds
  allow dedup (-D) tables larger than 2GB, and 32-bit compatible
  system libraries don't have to be installed.

- backups with dedup (-D) are a little faster because of less I/O

- improved error handling for multi-core backups

- a new destination type, sftp, is added.  This is equivalent to the
  old ssh destination type, except now either will work with ssh
  servers that allow sftp access but may not allow terminal sessions

- get: redundant pathnames are an error, not a warning.  Better safe
  than sorry

- if HB was waiting for a yes/no question to be answered, ^z was used
  to suspend the program, then fg was used to start it again, an error
  about an interrupted system call would be displayed.  Now, the
  question will be repeated.

#293 - August 12, 2010 - beta expires December 15, 2010

- NOTE: this rev will do an automatic database upgrade to dbrev 6.
  The hb.db file may shrink up to 50% after this upgrade, because some
  data is moved into a separate database, hash.db.  The advantage is
  that when hb.db is sent offsite, it may be much smaller; hash.db is
  not sent offsite and is generated from hb.db when necessary

- backup: archive files were sometimes being transmitted twice

- backup: a new option, -p, uses multiple CPU cores to backup large
  (>128K) files.  This can speed up the backup of large files by 30%
  or more.  If the -p option is not used, HB will automatically use
  multiple cores when run on a multi-core system.  The -p option
  allows finer control of this feature:

    -p0 = do not use multiple CPU cores for backing up
    -pn = use n extra tasks
    no -p = use 2 extra backup processes on a multi-core computer

  You might be tempted to use -p8 on an 8-core system, but it could
  actually make your backup slower.  If you have very fast disks, you
  may want to try increasing -p to 3 or 4.

- backup: a new option, -D, controls data dedup.  This version of
  HashBackup uses a new dedup method that is much more scalable.  The
  value after -D is the amount of memory you want to use for dedup
  information.  The -D option may be tweaked in the next few releases.
  For detailed information, see doc/

  NOTE: in previous releases, dedup was enabled by default; now, it is
  disabled by default, because it requires some thought about the
  amount of memory to use for dedup.  HashBackup may still do some
  dedup on incremental backups, even if the -D option isn't used,
  especially with VM images, logs, mailboxes, and databases.

- backup: a new option, -B, controls the block size.  This can be 1K,
  2K, 4K, 8K, 16K, 32K, 64K (default), 128K, 256K, 512K, 1M, 2M, or
  4M.  Tests show that a large block size doesn't usually speed up
  I/O, but it allows HB to scale to larger backups when very large
  files are being saved.  A trade-off is that the dedup mechanism may
  not find as much duplicate data with larger block sizes.  With small
  block sizes, dedup is better, but overhead is higher and the backup
  may be slower.  The default block size is either 64K or 4K bytes,
  depending on the file type.
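
  The trade-off between block size and dedup can be seen with a small
  Python sketch that hashes fixed-size blocks and counts how many are
  unique.  This is an illustration only; dedup_stats is a hypothetical
  name and hb's dedup is more involved than this.

```python
import hashlib


def dedup_stats(data, block_size):
    """Return (total blocks, unique blocks) for fixed-size blocks."""
    hashes = [
        hashlib.sha256(data[i:i + block_size]).digest()
        for i in range(0, len(data), block_size)
    ]
    return len(hashes), len(set(hashes))
```

  For 16K of data laid out as 8K of "a", 4K of "b", and 4K of "a", a 4K
  block size finds 2 unique blocks out of 4, so half the data dedups.
  An 8K block size finds 2 unique blocks out of 2, so nothing dedups.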

- backup: print backup directory space used just for this backup and
  in total

- backup: print compression statistics for this backup as both a
  percentage and compression factor
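
  The two figures are related as in the Python sketch below.  It
  assumes the percentage means "space saved" and the factor means
  original size divided by stored size; the exact definitions HB uses
  are not stated here, and the function name is illustrative.

```python
def compression_stats(original_bytes, stored_bytes):
    """Return (percent_saved, compression_factor) for one backup."""
    if original_bytes <= 0 or stored_bytes <= 0:
        raise ValueError("sizes must be positive")
    percent_saved = 100.0 * (1 - stored_bytes / original_bytes)
    factor = original_bytes / stored_bytes
    return percent_saved, factor
```

  With these definitions, 100 bytes stored as 25 bytes is 75% saved,
  or a compression factor of 4.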

- backup: a new option, -m/--max-file-size n, will skip files larger
  than n bytes

- backup: display an error message if a directory destination is the
  backup directory.  Previously, this copied the backup onto itself
  and caused warning messages like:
    dir(destname): file changed during transfer: /hb/arc.0.77
  The backup remained intact, but there would be a lot of unnecessary
  disk I/O.

- backup: if -c backup target is a FAT filesystem, such as most flash
  drives, the initial backup would fail with an error message like:
    OSError: [Errno 1] Operation not permitted: '/mnt/hb/key.conf'
  HashBackup was trying to make the key file read-only, but this is not
  possible on a FAT filesystem.  Trying the backup again would succeed.

- backup: on Mac/OSX, strip trailing slashes on command line pathnames
  to prevent errors like:
    Pathname changed: /Users/jim/backup/rest => /Users/jim/backup/rest/
    Unable to stat file: No such file or directory: /Users/jim/backup/rest/
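
  The idea can be sketched in Python as follows.  The function name is
  hypothetical and this is not HB's code; the root path / is kept
  as-is so it does not become an empty string.

```python
def strip_trailing_slashes(path):
    """Remove trailing slashes from a command line pathname."""
    if not path:
        return path
    stripped = path.rstrip("/")
    # A path made only of slashes refers to the root directory.
    return stripped or "/"
```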

- get: if the same pathname was listed on the command line twice, or
  redundant pathnames such as /abc and /abc/def were used (because
  restoring /abc would also restore /abc/def), get would fail with an
  error on OSX like:

    Unable to hardlink: Operation not permitted: /Users/jim/hb-551625.tmp -> /Users/jim/xxx
    Not restored: /Users/jim/xxx

  On Linux, get would fail with an error about using mount --bind.
  The get command incorrectly believed the directories should have
  been hardlinked.  Now, a warning is printed about skipping the
  redundant pathname.
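
  One way to detect redundant pathnames is sketched below in Python.
  The function names are hypothetical and the logic is only an
  illustration of the rule described above (exact duplicates, and
  paths already covered by another listed directory, are skipped).

```python
import posixpath

def _is_ancestor(parent, child):
    """True if child lies somewhere below the directory parent."""
    if parent == child:
        return False
    prefix = parent if parent.endswith("/") else parent + "/"
    return child.startswith(prefix)

def split_redundant(paths):
    """Return (kept, skipped) lists of normalized pathnames."""
    normalized = [posixpath.normpath(p) for p in paths]
    kept, skipped = [], []
    for i, norm in enumerate(normalized):
        duplicate = norm in normalized[:i]
        covered = any(_is_ancestor(other, norm) for other in normalized)
        if duplicate or covered:
            skipped.append(norm)  # a warning would be printed here
        else:
            kept.append(norm)
    return kept, skipped
```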

#256 - June 22, 2010 - beta expires September 15

- the correct README file is included

#255 - May 24, 2010 - beta expires August 15

- backup: copy hb program to local backup area as executable (755 mode
  instead of 644)

- get: when a directory specified on the command already exists, hb
  would correctly do the restore into a temp directory; but when it
  tried to remove the old existing directory, the remove could fail
  with a "Not a directory" error if the old directory contained a
  symbolic link to a directory.  ("directory" 6x - Ugh!)

- get: don't complain about existing symbolic links if they point to
  the correct destination.  This doesn't change hb's restore behavior,
  but avoids unnecessary error message displays.

- get: setuid and setgid mode bits were not being restored (these are
  for privileged commands like "mount"; see chmod command)

- get: as a security precaution, setuid, setgid, and the sticky bit
  are only restored when running as root.  Otherwise, a warning is
  printed for files with these bits set; see chmod command.
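
  The root-only rule for these mode bits could look like the Python
  sketch below.  The function name and return shape are assumptions
  for illustration; only the constants from the standard stat module
  are real.

```python
import stat

# setuid, setgid, and sticky bits (see chmod).
SPECIAL_BITS = stat.S_ISUID | stat.S_ISGID | stat.S_ISVTX

def restorable_mode(saved_mode, is_root):
    """Return (mode_to_apply, warn) for a file being restored.

    Special bits are kept only when restoring as root; otherwise they
    are masked off and warn is True if any of them were set.
    """
    perms = stat.S_IMODE(saved_mode)
    if is_root:
        return perms, False
    had_special = bool(perms & SPECIAL_BITS)
    return perms & ~SPECIAL_BITS, had_special
```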

- get: if an error occurred during restore of one path, get would
  (incorrectly) say that there were errors in all subsequent paths,
  even though the paths were restored without errors

#254 - May 22, 2010 - beta expires August 15

- recover: non-rsync destinations could fail with the error:
  object has no attribute 'cursor'

- recover: could display the warning message:
  Unable to set mtime on hb.db: Cannot operate on a closed database.

- recover: now displays a summary warning if any files were not

- get: refused to restore / to a non-empty directory.  But this is
  sometimes necessary for a system rescue, so now this error is a
  warning.  As always, be careful when restoring files from a backup!

- get: if / was restored to some directory (not /), without using
  --orig, the pathname displayed for restored files was incorrect and
  symbolic and hard linking did not work correctly.  This situation
  typically occurs when booting from a CD to restore a crashed root
  filesystem that is temporarily mounted under /mnt.

#252 - May 11, 2010 - beta expires August 15

- NOTE: this rev will do an automatic database upgrade to dbrev 5

- recover: there was a bug in the new feature to remove unreferenced
  blocks from downloaded archives.  If you ever ran recover with
  version #249, run selftest -v3 to verify your backup's integrity.
  IMPORTANT NOTE: this is a critical bug in #249 and everyone is urged
  to upgrade

- when this build is installed, archives are scanned to remove any
  stale file data.  This is necessary because of a bug in version #249
  retain (see next item).  This scan will take some time for large
  backups.  If you have to interrupt it, it will restart the next time
  you use hb; ie, you won't trash your backup if it's interrupted.
  The hb.db file itself is not modified, other than to change the rev.
  The archive scan is removing orphaned data blocks and possibly
  compressing archives.  If any archives are sufficiently compressed,
  they will be transmitted on the next backup to your remote

- retain: sometimes the message:
    Unable to remove block xxxx: no such table: rmlist
  was displayed.  Retain still removed older versions of files, but
  the file data itself wasn't being removed from the archive files.
  The effect of this bug is that archives do not shrink when they
  should, but all backup data is intact.

- HashBackup's Amazon S3 destination now accepts any value for the
  Location keyword (S3 Region), without validating it.  The values and
  their meanings as of May 5, 2010 are:
  -- no Location, US, or blank location = US Standard.  Data will be
     stored on the east coast or west coast, whichever is closest
  -- us-west-1 = US west coast.  It costs more to store data using
     this name vs using the more generic US region
  -- EU = Ireland
  -- ap-southeast-1 = Singapore

- rm/retain: transmitting an older archive over rsync after a rm or
  retain uses less CPU time than with version #249

- selftest: if an error occurred with -v0 (just read the database
  file), the displayed error count should have been 1 but was
  actually a library error code value, like 256.

- selftest: added a new -v1 verify level that does not traverse each
  file's block info since this takes time for backups with very large
  files such as VM images.  -v2 is now like the old -v1:
    -v0: read each page of main database, like cat hb.db
    -v1: check database, don't traverse file blocks, don't read archives
    -v2: check database, traverse file blocks, don't read archives
    -v3: v2 + read all archive blocks and verify crc
    -v4: v3 + decrypt and decompress all data, verify block hashes,
     verify file hashes.  Like a restore, without writing to disk.
    -v9: v4 + low-level database integrity check
  As before, the default verify level is -v9.

- selftest: verify levels -v1 and -v2 (the new one) are a bit faster
  if the database isn't already cached in memory

- selftest: -v3 (old -v2) would sometimes display X GB verified, where
  X was much bigger than the entire set of backup data.  Related to
  this, -v3 (old -v2) may run faster, depending on your backup data

#249 - March 6, 2010 - beta expires June 15

- recover: remove unreferenced blocks from downloaded archives

- add /.hotfiles.btree to inex.conf for Mac/OSX

- if arc files exist on the local system and a recover -f command
  is issued, the existing arc files are renamed with a .old suffix.
  These .old files are now ignored during archive synchronization.

- backup: if a pathname ending in . was used on the backup command
  line, /Users/jim/backup/. for example, a message like:
     Pathname changed: /Users/jim/backup => /Users/jim/backup/.
  was displayed, and the stored pathnames also contained .

- rm/retain: with hundreds of archive files, rm and retain could run
  out of file descriptors when a large number of files were being
  removed, especially on systems like OSX where the number of open
  files is limited to 256 by default.  rm and retain now use just a
  few file descriptors.

- backup: exclude /private/tmp/ on OSX (/tmp symlinks here).  To
  update an existing inex.conf, add ex /private/tmp/

#246 - January 4, 2010 - beta expires March 15

- INCOMPATIBILITY NOTE: the -n option (dry run) has been removed from
  the retain command, but the longer forms --dryrun and --dry-run are
  still available.  This is in preparation for -n to mean "don't
  transmit the database", as with the backup command, to allow retain
  to run more than once before transmitting hb.db

- NOTE: this rev will do an automatic database upgrade to dbrev 4

- beginning with this release, HashBackup releases are identified by
  the build number rather than a version number

- the Linux build of 0.9.10 failed: ImportError: No module named acl

- in some cases, the database upgrade in 0.9.10 could fail with:
      TypeError: 'NoneType' object is unsubscriptable

- if the backup command did the database upgrade (vs any other hb
  command), it would take 10x longer than it should have, because IO
  buffering was disabled.  On a Fedora test machine, the upgrade took
  20 minutes with the backup command, but only 2 minutes with this fix

- if the backup command did the database upgrade, it would then fail
  with an error like "not a database or encrypted: arc.x.x-journal".
  The next backup command would work.  This was a bug in the archive
  synchronization procedure

- the change in 0.9.9 to use multiple CPUs to prepare the database
  wasn't actually enabled in previous beta builds

- VMWare memory images, *.vmem, are now excluded when a new inex.conf
  is created.  For existing inex.conf files, add an ex *.vmem line

- ls: was looping when a wildcard filename like '*.vmem' was used

- selftest: added multiple levels of selftest, taking increasing time:
    -v0: read each page of main database, like cat hb.db
    -v1: database consistency; no archive files are read
    -v2: v1 + all archive blocks are read and crc verified
    -v3: v2 + all data decrypted and decompressed, block hash verified,
     file hash verified.  Like a restore, without writing to disk.
    -v9: v3 + low-level database integrity check
  The default level is -v9.  This checks everything possible, as in
  earlier versions of selftest.  On my MacBook, with 45GB backed up,
  using 22GB of backup space, the verify times are:
    -v1:  4 minutes
    -v2: 24 minutes
    -v3: 71 minutes
    -v9: 75 minutes
  For comparison, it takes 10 minutes just to read the entire 22GB of
  backup data from disk at the maximum speed of 35 MB/sec with the
  command: time cat /hb/* >/dev/null
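
  The 10 minute figure can be checked with a little arithmetic, shown
  here in Python.  This assumes 1 GB = 1000 MB; with 1024 MB per GB
  the result is slightly higher.

```python
def read_minutes(total_gb, mb_per_sec):
    """Minutes needed to read total_gb at a steady mb_per_sec."""
    seconds = total_gb * 1000 / mb_per_sec
    return seconds / 60

minutes = read_minutes(22, 35)  # about 10.5 minutes
```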

- selftest: delete unused paths with -v1 or greater; this is normally
  not necessary, but unused paths may occur in some circumstances

- rsync destination: the Dir keyword is checked to make sure it has
  the proper format, specifically, that it contains a : or ::

- rsync destination: HB was always adding /filename to the end of the
  Dir keyword to form the target path, but if the Dir path ends in :
  then a slash should not be added
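
  The two Dir rules above can be sketched in Python like this.  The
  function name is hypothetical, and trimming a trailing slash before
  joining is an extra assumption, not something HB is known to do.

```python
def rsync_target(dir_value, filename):
    """Build the rsync target path for one file from the Dir keyword."""
    if ":" not in dir_value:
        raise ValueError("rsync Dir must contain : or ::")
    if dir_value.endswith(":"):
        # Dir ends in a colon: append the filename with no slash.
        return dir_value + filename
    # Assumption: avoid a doubled slash if Dir already ends with one.
    return dir_value.rstrip("/") + "/" + filename
```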

- rsync destination: added debug keyword.  If value is 1 or more, the
  rsync command line is printed and -v is added so that rsync is more
  verbose.  If debug is 2, -vv is added (even more verbose), etc.
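
  A sketch of how the debug value could map to rsync's verbosity
  option, in Python.  The function name is illustrative; -v and -vv
  are standard rsync options.

```python
def rsync_debug_args(debug):
    """Return extra rsync arguments for a debug level.

    0 adds nothing, 1 adds -v, 2 adds -vv, and so on.
    """
    if debug < 1:
        return []
    return ["-" + "v" * debug]
```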

- rsync destination: improved transfer efficiency, esp for rm and retain

- S3 destination: added debug keyword.  With a value of 1, data being
  sent to and received from Amazon S3 is displayed.  With a value of
  99, any exception during a file transfer will cause a traceback and
  HB will hang (use Ctrl-C to terminate it).

- S3 destination: added a DNS lookup during startup to display a
  better error message when a system's DNS is not configured correctly

- backup: version 0.9.10 would fail on very long (>1023 bytes) symlinks,
  ACLs, and extended attributes with the error message:
      TypeError: an integer is required

- retain: added directory retention.  Previously, retain only removed
  files, which could leave empty directories in the database

- made "hb restore" an alias for "hb get"

- rm and retain now overlap archive compression, archive transmission,
  and database compression

- better cleanup of archive journal files

- rsync destination: work around an rsync bug: rsync 3.0.4 client (PCBSD
  7.1.1) with rsync 2.6.9 server (Mac OSX) gives "unknown option"

0.9.10 - November 21, 2009

- this version has a database format change and will automatically
  upgrade your backup database the first time hb is used.  All backup
  data is maintained, except extended attributes (SELinux); they will
  be saved again on the next backup

- ACLs are supported on Mac (OSX) and BSD systems (Linux ACLs were
  already supported)
  NOTE: OSX 10.5 (Leopard), FreeBSD 7.1, and PCBSD 7.1 have an
  operating system bug that causes a small memory leak for every ACL
  restored.  A patch to fix this was committed in the FreeBSD tree

- the mount command (FUSE) is available on FreeBSD/PCBSD

- mount: reading files from a mounted backup (FUSE) was sometimes
  extremely slow and CPU intensive because of a bad database query

- on OSX, filenames are case-insensitive; but if /users/jim is used
  on the backup command line, it must still be saved as /Users/jim,
  and HB will print a notice:
  Pathname changed: /users/jim/backup/x => /Users/jim/backup/x
  NOTE: HB exclude/include processing is always case sensitive

- on BSD & OSX, file flags are saved/restored like Linux version (see
  man chflags)

- on BSD & OSX, extended attributes on symbolic links are now saved
  and restored.  (Linux symlinks cannot have extended attributes)

- get: on BSD & OSX, symbolic links with a different mode than their
  link target now have the correct mode after a restore

- queue database to transmit next when the backup is finished.  If
  transmitting all archive files takes a long time (days or weeks for
  a huge backup), there will be a database saved on the remote side to
  restore the archives that did finish transmitting

- if the backup database doesn't exist but the compressed database
  does, HB will ask if you want to expand the compressed database.
  This is useful to run selftest directly on a destination directory,
  for example, an external USB hard drive.  Or, if the disk area
  storing the database itself goes bad (very unlikely, but possible),
  the compressed DB file can be expanded and used instead

- recover: fix 1% failure with index out of range error

- recover: print numbers so it's clear that recover isn't stuck

- a problem restoring a symbolic link or extended attributes could
  cause the get command to abort.  Now it will print an error message
  and continue the restore

- some HashBackup data files had x (execute) permissions

- a directory could be saved without its extended attributes (SELinux)

- destination handlers were sending error messages to stdout vs stderr

- when key.conf is created, a second line is written with spaces
  every 4 hex digits to make the key easier to copy by hand
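
  Grouping a hex key this way takes only a line or two of Python.
  The function name and the sample key are made up for illustration.

```python
def group_hex(key_hex, width=4):
    """Insert a space after every `width` hex digits."""
    return " ".join(key_hex[i:i + width]
                    for i in range(0, len(key_hex), width))

grouped = group_hex("0123456789abcdef")  # "0123 4567 89ab cdef"
```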

- backup: if a pathname requested for backup is a symbolic link, for
  example, /home points to /usr/home on FreeBSD/PCBSD, the symbolic
  link's target (/usr/home) is added to the backup with a notice:
    Adding symlink target to backup: /home -> /usr/home
  This prevents the serious mistake of believing the files "in" /home
  are being backed up when in reality, only a symbolic link to
  /usr/home would be backed up.
  IMPORTANT: "symlink following" only occurs for command line paths!

- the rsync destination now accepts a port keyword, to allow the rsync
  daemon to run on a port other than the standard port 873.  This only
  works with rsync modules, ie, two colons used in the dir path.

- cache sizes have been scaled back in this version; determining the
  optimum cache size needs further study

0.9.9 - October 27, 2009

- add Password keyword to rsync destinations, to set the rsync module
  password when using the two colon form of rsync (direct to rsyncd)

- expanded documentation for rsync destination in dest.conf.example

- preparing the database for transmission is 35% faster on multi-core

- ls: added a note, noaccess, when directory contents can't be shown
  because of insufficient permissions

- ls: improved performance 10% when listing specific files

- Amazon S3 uploads would sometimes fail with "bad marshal data" or
  "No parsers found", depending on the system's configuration

- Amazon S3 uploads would sometimes fail with an error like:
      s3(xxx): sending <pathname>: sent XXX of YYY bytes
  where XXX was much greater than YYY.

- native FreeBSD/PCBSD build added to beta site

- removed timeout code from all destinations; it caused some problems,
  especially with FTP, and wasn't very useful since each destination
  runs in its own thread

0.9.8 - October 24, 2009

- repeat ad infinitum: test more before release, test more before release, ...

- the rsync timeout was set too low, causing the next file transfer to
  start, concurrently, every 15 seconds

- fixed a selftest bug: OperationalError: no such column: blockshas.sha
  The database was fine - this was a bug in the selftest code

- the recover command didn't work with an rsync destination

- fixed KeyError problem on PCBSD when sizing memory

- fixed "Unable to read flags" error on PCBSD, in Linux compatibility
  mode; file flags (man chflags) are not yet supported on BSD / Mac

- ACL's are not yet supported on BSD / Mac

- if FUSE wasn't installed, mount would throw an exception

0.9.7 - October 22, 2009

- mount: on OSX, the umount command is used to unmount FUSE filesystems

- added Intel binary for Mac; 0.9.6 was compiled only for PowerPC and
  ran in emulation mode on Intel

- new destination type: rsync; see dest.conf.example

- Amazon S3 support was accidentally broken in 0.9.6, but is fixed

- database was being prepared for transmission even if it was deferred
  on all destinations; changed to avoid unnecessary work

- the backup program is copied to the backup directory if necessary

- imap (email) and S3 connection handling is improved

0.9.6 - October 16, 2009

- IMPORTANT NOTE: this version has a database change; use hb clear to
  remove beta test backups created with earlier versions, or create a
  new backup directory to use with this release.  The database format
  will be forward compatible at release 1.0

- improved scalability for backups >100GB

- backup: saving VM images (.vmdk, .hdd, .qcow2, etc) will use more
  disk space for the initial backup, but incremental backups will be
  much smaller for typical work loads

- space: document this 0.9.3 command on the beta site

- space: performance improved 5x

- get: regulate memory usage for large restores with millions of hard
  links (this change was for a 500GB restore with 31M files, more than
  half of which were hard links)

- ftp: changed block write timeout from 30 seconds to 2 minutes

- ftp: display a message when a timeout occurs

- dir destination expands ~jim in directory name

- a default inex.conf file (include/exclude) tailored for each
  computer system is created on the first backup.  Create an empty
  inex.conf or edit the file if you don't want the default exclusions

- initial backups directly to NFS are ~15% faster, but still slower
  than backing up to a local drive.  Incremental backups with few
  modified files are fast, comparable to backup on local drives

- added check for unrecognized arguments to commands

- fix typo in mount message: fusermount to unmount, not fuseumount

- added destname to recover command's help display

0.9.5 - September 14, 2009

- backup: if control-c was pressed at exactly the right time on the
  first backup, the database could be only partially initialized

- selftest: error counter was not always incremented, so error
  messages could sometimes be displayed but not counted

- selftest: detect missing root pathname record in hb.db

0.9.4 - September 7, 2009

- recover: if hb.db exists in the target directory but is not
  readable, for example, it's empty, recover would say "run a
  backup first"

- recover: a change in 0.9.2 caused the recover command to fail
  with a "transactions cannot be nested" message

- directory destinations: when copying to a directory, .tmp files
  would be left if the target disk runs out of space

0.9.3 - August 25, 2009

- clear: remove journals too

- backup: uses half as much memory to track hard links

- backup: backup huge directories (15M files) in ~50MB of memory

- backup: incremental backup of a huge directory (15M empty files with
  32000 hard links) on a 1GB test machine takes 45 mins vs 105 mins

- new command "space" to show how backup space is being used

0.9.2 - August 21, 2009

- backup: reincarnated memory savings from version 0.3 for incremental
  backups to improve scalability on huge directories (>1M files)

- backup: remove warning "Unable to stat file" on deleted files

- backup: the built-in excluded path list (/proc, /tmp, etc) was
  removed because it caused confusion with /tmp backups, and /proc
  and /sys were excluded anyway as separate filesystems

- backup: hitting control-C at just the right time after starting a
  backup could cause a database to be half-initialized

- backup: hitting control-C during a backup could cause selftest to
  display a warning about high reference counts

- retain: if backup is run with -n (no transmit), then retain must
  transmit the database even if retain didn't remove any files

- backup: if a file had extended attributes with names containing
  characters >= 0x80, no extended attributes were saved for that file

- mount: fix Bad address error when trying to read ACL's on the fuse
  root or next-level directories

- mount: extended attributes fix

- mount: an error message was incorrectly sent to stdout vs stderr

0.9.1 - August 14, 2009

- initial FreeBSD compatibility testing.  This is not a FreeBSD native
  build yet and still relies on the Linux compatibility layer.  There
  has only been very light testing, but backup, versions, ls, and get
  appear to work fine with only 1 minor change so far

- don't try to read file system flags on FreeBSD systems

- S3 is not working yet on FreeBSD, but all other destinations have
  been tested and seem to work (FTP, ssh, IMAP, Gmail, directories)

- clear: added -f option to force clear; used by test programs

0.9 - August 9, 2009
NOTE: the 0.9.x releases will be for important fixes, in preparation
      for the 1.0 release, and instead of expiring in 1 month, these
      beta releases will expire in 3 months (only the backup command
      expires; backup data is still accessible anytime)

- database is 7-10% smaller.  Run hb clear first to remove existing
  beta test backups.  The database format will be stable and forward
  compatible at version 1.0

- mount: improved performance of file open and close when the backup
  is mounted as a filesystem

- get: if file1 and file2 were hard linked, restoring file2 w/o file1
  and not using --orig would cause a checksum mismatch and an empty
  file was restored.  Now, file2 is restored correctly, but will not
  be hard-linked to the existing file1; to cause the restored files to
  be hard-linked either use --orig or restore both files together

- get: in some circumstances, a file that was hard-linked would be
  restored without being hard-linked

- get: restoring a symbolic link that was also hard-linked could raise
  an exception

- get: restoring a hard-linked file with extended attributes could
  cause an exception

- mount: supports extended attributes (SELinux, ACL's)

- backup: if HASHBACKUP_DIR environment variable was set but was blank
  or empty, backup files were written to the current directory.  Set
  the environment variable to . to get this behavior

- if /var/hashbackup exists but the current user doesn't have write
  access, hb would stop with an insufficient access message.  Now hb
  will ignore this directory if the current user doesn't have access,
  and offer to create ~/hashbackup as usual

- backup: display a message after waiting 5 seconds for destination
  copies to finish, add a stat line for wait time

- add write lock to ensure single user write access to backup data
  for backup, clear, recover, retain, rm, and selftest commands

- versions: if a backup was interrupted, the versions command would
  sometimes print the current time as the backup's ending time

- retain: now refuses to run if the previous backup didn't finish

- retain: add -f/--force option to override previous safety feature

- retain: an error in the -x option, for example, -x30p, printed
  an error message (correct), but retain would run anyway (incorrect)

- retain: time-based retention (-t option) was based on the current
  time, ie, -t7d meant "in the last 7 days"; now it is relative to
  the last backup's finish time.  This prevents the problem where
  backups haven't been run for a while, then a retain is run and
  removes all but the most recent backup of current files because
  the backup is very old
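
  The difference between the two time bases is easy to see with a
  small Python example.  The dates and the function name are
  hypothetical; the point is only where the 7 day window starts.

```python
from datetime import datetime, timedelta

def retention_cutoff(last_backup_finish, keep):
    """Versions older than the returned time may be removed."""
    return last_backup_finish - keep

last_finish = datetime(2009, 5, 1, 3, 0)  # last backup finished in May
today = datetime(2009, 8, 9, 12, 0)       # retain is run months later
keep = timedelta(days=7)                  # like -t7d

old_cutoff = today - keep                         # old: relative to now
new_cutoff = retention_cutoff(last_finish, keep)  # new: relative to backup
```

  Here old_cutoff falls after the last backup finished, so every
  version would be older than the window; new_cutoff still keeps the
  7 days leading up to the last backup.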

0.8 - July 31, 2009

- removed internal nice -19, because a backup that took 10 secs
  on an idle machine took 790 seconds when one CPU-bound program
  was running; let users do nice -19 or ionice -c3 if needed

- mount: reading large files was very slow in ver 0.7

- backup: add -n option to defer copying the main database to
  destinations.  If retain is going to be run immediately after the
  backup, retain will upload the database

- add -v (verbose) option to backup; default level is -v2. -v3 will
  print paths either excluded or with the "no dump" attribute set,
  v1 prints no filenames, v0 prints no statistics

- add -v (verbose) option to get; default level is -v2

- get: ask whether to remove the partially restored file when a
  control-c is pressed

- get: display a warning that the current file was only partially
  restored if an error occurs during restore

- for a command line like hb ugh /home, "unknown command: ugh" was
  displayed (correct) followed by "Unrecognized command: /home"

- backup: if unable to stat a file, print full pathname

- backup: better handling of hard-linked files that change during the
  backup.  Because of this change, databases created before 0.8
  may fail the selftest; use hb clear to remove beta test backups

- backup: bypass exclusion checks when saving parent directories
  of a requested file

- backup: exit code was the number of errors, but should have been
  either 0 for no errors or 1 if there were errors
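
  One likely reason a raw error count makes a poor exit code: on Unix
  only the low 8 bits of an exit status reach the caller, so a count
  of 256 would look like success.  A Python sketch, with an
  illustrative function name:

```python
def exit_status(error_count):
    """Map an error count to an exit status of 0 or 1."""
    return 1 if error_count > 0 else 0

# What a calling script would see if 256 were used as the exit code.
wrapped = 256 & 0xFF
```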

- versions: align columns for neater output when userid varies

- backup: add number of files excluded to statistics

- get: trap exceptions and if --orig wasn't used (ie, we're restoring
  to a temp file), keep going.  If restoring with --orig, ask before
  continuing the restore

- get: if there were any errors during the restore, ask before
  replacing the original file or directory

- ls: display root path too

- get: print full pathnames instead of filenames

- recover: after recovering the database from a remote site, it was
  functional but was larger than the original

- open: the command to set the archive size limit isn't available yet
  (the limit is set to 1GB).  Backups to IMAP servers may need to
  lower this limit, and huge backups may want a higher limit

- ls: deleted files in earlier versions were not being displayed, even
  with -a

0.7 - July 25, 2009

- NOTE: database schema has changed - run hb clear if you have
  test backups from previous beta versions.  The database schema
  will be forward compatible beginning with the 1.0 release

- database uses up to 25% less space for very large files

- database uses much less space for virtual machine disk images

- get: verify file SHA hash matches after a restore

- mount: fixed error message when mount directory doesn't exist

- mount: fixed Bad address error when accessing non-current backups

- backup: after the first backup finishes, display a large notice
  about copying the key.conf and dest.conf files to safe locations

- changed some common error messages to prevent traceback displays

- backup: added the line number to exclude/include error messages

- backup/ls: fifos and devices were listed as partially backed up

- backup: a development assertion failed when hard-linked files had
  the "nodump" chattr attribute set or were not readable because of
  permission restrictions

- get: sparse files ending with zeros were not restored correctly

- get: verify with OS that a restored file is the correct size

- get: verify sparse file hash with a separate read pass after restore

- review and start to standardize error message displays

- selftest: didn't correctly handle a symbolic link that was also a
  hard link (yep, you can actually do this with Linux/ext3)

- mount: generated an error when accessing a symbolic link that was
  also a hard link

- mount: could return all zeroes when reading a hard linked file

- mount: now returns EIO if there is a problem reading a file

0.6 - July 18, 2009

- simplified retain -t and -x to only accept 1 time option, not NyNm

- new Freq keyword for destinations, like retain -t; defers copy
  until enough time has passed since the last copy to this destination
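
  The deferral test can be sketched in Python as below.  The function
  and parameter names are assumptions for illustration, and a
  destination with no previous copy is assumed to be due right away.

```python
from datetime import datetime, timedelta

def copy_is_due(last_copy, now, min_interval):
    """True if enough time has passed since the last copy.

    last_copy is None when nothing was ever copied to this destination.
    """
    if last_copy is None:
        return True
    return now - last_copy >= min_interval
```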

- bug fix in ssh destination when target directory did not exist

- ftp copy could leave a file open if an error occurred

- dest.db was being sent even if a destination was deferred

- dest.db is encrypted before sending offsite (other files already are)

- rm displayed "Removing all files from version x" when -r was used,
  but should have displayed "Removing requested files..." if paths
  were also listed

- rm and retain may defer archive uploads to save bandwidth

- rm was sometimes leaving a few blocks that should have been removed

- added block consistency tests to selftest

- get would stop on files with attributes that require root privilege
  to restore, for example, journal mode (j).  An error is now displayed

- get from a specific version (-r) would fail on directories

- get restores all directory attributes if parent directories have to
  be created with --orig (Ex: get --orig /a/b/c but only /a exists)

- get --orig failed with "Not a directory" when restoring /a/b/c, /a/b
  already exists, but b is a file and not a directory.  This still
  fails (it has to fail), but with a better explanation

- get will download an archive file from a destination if it is missing

- help command added

- get: if a and b were hard linked, only one was restored with --orig,
  and the other already existed, they weren't linked after the restore

- backup: added /tmp to the platform excluded directories

- get: clearer error message for pathnames ending with slash

- get: fixed existing file mtime check when multiple files restored

- get: instead of a warning, refuse to restore file over existing
  directory, or directory over existing file

- get: instead of a warning, refuse to restore a partially backed up
  file/directory over an existing file/directory, unless the existing
  directory is empty

- get: instead of a warning, refuse to restore / into a non-empty
  directory

- get: for safety, removed -f (force) option

- prevent tracebacks when expected error messages are displayed

- renamed to HashBackup

- added -a/-all option to mount, to allow all users access to the
  mounted backup filesystem.  Standard Unix permission checks are
  still performed on all accesses within the backup filesystem
  NOTE: by default, -a is only allowed by root, but it can be
  enabled for others with a /etc/fuse.conf setting

- backup: removed /mnt from internal exclusion list; /mnt is still
  skipped if a filesystem is mounted, unless it is listed on the
  backup command line

- backup: the backup directory contents were automatically excluded,
  but the directory itself was not.  This could cause the version to
  increment with one file changed, even if nothing else changed

- recover: if only 1 destination is setup, use it if none is specified

0.5 - July 7, 2009

- **NOTE: remove dest.db before using this rev

- backup/rm/retain: database transmit is 3-4x faster

- mount: new -f/--full option to show full backups in each version

- new destination type: ssh (see dest.conf.example)

- mount: if the backup directory was mounted, the backup, rm,
  and retain commands would fail with "Database is locked".

- mount: accessing a file that didn't exist caused a "Bad address"
  error instead of the correct "No such file or directory"

- rm/ls: wrong version displayed for the first file of a backup

- rm: didn't copy database to remotes if it didn't need to be compressed

- rm: don't display "Remove logid ..." - it's slow on very large removes

- selftest: display file counts instead of log ids, which were confusing

- all: prevent stack traceback when piping output into head

- backup: revert backup/memory reduction in 0.3 because of a database
  limitation: backup would fail after 5 minutes

- backup: remove empty archive if backup is interrupted

- backup: backups larger than 1GB fixed - nextarc typo

- recover: -f option fixed

- recover: no longer prompts for confirmation if no action would be taken

- backup: removed features to simplify code testing:
  removed -n option (dry run) from backup
  removed raw device backup
  removed --log option for separate log files (capture stderr instead)

- backup now prints a message when it skips a directory because the
  "no dump" attribute is set

0.4 - June 28, 2009

- mount command is available to view backups as a filesystem
  (requires Linux fuse kernel module, fusermount,

- s3 dest.conf accepts a new Location EU keyword to create European buckets

- selftest: verifying 11M files required 650MB of virtual memory,
  but now ~325M files can be verified in 650MB

- -c option failed - code typo

- using environment variable PALBACKUP_DIR failed - code typo

- readme: permissions should be 0700 on backup directory, not 0600

- readme: removed --force-full documentation; the option is
  still there, but it's mostly just confusing to new users

- readme: ls -a never required selection strings

- readme: retain Nn time means minutes, not seconds

0.3 - June 26, 2009

- backup: dumb error - extra comma caused immediate failure

- backup: decreased memory requirements of incremental backup for 1M
  file directory from 271MB to 110MB for better scalability

- ls: -r option was not showing any results

0.2 - June 26, 2009

- explain GET and RECOVER commands in README

- backup: fixed immediate abort with -n

- backup: increased speed of version 1+ backups 13x
  for large directories

- retain: --dry-run was incorrectly forced on

- selftest: fixed size display for zero length files

0.1 - June 25, 2009

- first beta release