Stats

Displays statistics about a backup.

$ hb stats [-c backupdir] [-v]

The -v option displays notes to explain each statistic in detail.

Example

This example is from the HB development server, an OSX system with 5 or 6 virtual machine images of 5GB each, 6 or 7 1GB virtual disk images, and the usual OSX system software. This system has been backed up every night since June 2011, and retain is run every night after the backup with -s30d12m (keep the last 30 days of backups plus 1 backup a month for 12 months), so there are 42 versions of each VM image.

$ hb stats -c /hbbackup -v
HashBackup #2527 Copyright 2009-2021 HashBackup, LLC
Backup directory: /hbbackup

               3,608 backups

The number of backups stored.  Removed versions are not counted.

              333 TB file bytes checked since initial backup

Total bytes that have been presented to the backup program, also
sometimes called "bytes in" in other backup programs.  Removed
versions are not counted.  For example, if a directory is saved that
contains a 100K file x, bytes checked would be 100K for version 0.  If
a 200K file y is added and the directory is backed up again, bytes
checked would be 300K for version 1.  Bytes checked since initial
backup is the sum of bytes checked for each version, or 400K in this
example.  Another way to look at this is the sum of file sizes for all
backups if each had been a full backup.  This is used to compute the
"industry standard dedup ratio" below.

               80 TB file bytes saved since initial backup

Total bytes saved by the backup program, beginning with the first
backup.  Removed versions are not counted.  Another way to look at
this is the total bytes read from the filesystem for all backups.  For
the previous example, if 100K file x was backed up in version 0, then
200K file y was added and backed up in version 1 (x was not modified),
file bytes saved since initial backup would be 300K.
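
To make the arithmetic concrete, here is a small Python sketch
(illustrative only, not HB code) that tallies both statistics for the
x/y example above:

# Illustrative sketch, not HashBackup code: tally "bytes checked" vs
# "bytes saved" for the x/y example.  Each version lists the files
# present and whether each file changed since the previous backup.
versions = [
    [("x", 100_000, True)],                         # version 0: x is new
    [("x", 100_000, False), ("y", 200_000, True)],  # version 1: y added
]

bytes_checked = sum(size for ver in versions for _, size, _ in ver)
bytes_saved = sum(size for ver in versions
                  for _, size, changed in ver if changed)

print(bytes_checked)  # 400000: 100K (v0) + 300K (v1), as if each were a full
print(bytes_saved)    # 300000: only new or changed data is read and saved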

                 908 total backup hours

Time spent on all backups.

               79 GB average file bytes checked per backup in last 5 backups

Average bytes presented to the backup program per backup in the last 5
backups.

               23 GB average file bytes saved per backup in last 5 backups

Average bytes saved by the backup program per backup in the last 5
backups.  Some of these bytes may no longer be stored if files have
been removed.  For the previous example, file bytes saved is 100K for
version 0, 200K for version 1, and the average would be 150K.

              29.76% average changed data percentage per backup in last 5 backups

Average percentage of data that was changed (and saved) by the backup
program per backup in the last 5 backups.

             13m 34s average backup time for last 5 backups

Average backup time for the most recent 5 backups.

                 632 archives

The number of arc files.  Arc files may all be local, all remote, or
some combination of local and remote.  Arc files are only counted
once, whether local, remote, or on several remotes.

               55 GB archive space

Bytes used by all committed arc files.  Some of these bytes may be
inactive (deleted) if files have been removed or retain has been run
and the archive has not been packed.  This is also the sum of ls -l
sizes for all committed archive files, both local and remote.  It
includes deleted bytes and encryption overhead required for each
block.  There may be archive files present in the backup directory
that have not been committed; these will be deleted at the beginning
of the next backup.

              83.74% archive space utilization 46 GB

Shows how tightly "packed" the archive files are: higher percentages
indicate better space utilization; 100% means there are no (or very
few) deleted blocks in the archives.  If the percentage is low,
indicating a lot of free / unused space in the archives, and you want
to recover it, you could lower the pack-percent-free config variable
so that archives are packed sooner.  However, packing archives more often
will take more time and packed archives have to be uploaded again.
To pack a remote-only archive, it must be downloaded, packed, and then
uploaded; so there is a trade-off between disk space savings and
download + upload bandwidth costs + processing cost.  The default for
pack-percent-free is 50: archives are packed when they have 50% or
more free space.
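
As a rough illustration of the packing rule, in Python (the variable
and function names here are assumptions, not HB internals):

# Illustrative sketch of the packing rule described above.
PACK_PERCENT_FREE = 50          # default value of pack-percent-free

def should_pack(arc_bytes, active_bytes):
    # Pack when the free (deleted) portion reaches the threshold.
    free_pct = 100 * (arc_bytes - active_bytes) / arc_bytes
    return free_pct >= PACK_PERCENT_FREE

# From the sample output: 55 GB of archive space, 46 GB of it active.
print(should_pack(55e9, 46e9))  # False: ~16% free, below the 50% threshold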

               808:1 industry standard dedup ratio

This "dedup" figure is displayed to allow comparing HB with other
backup programs that compute their dedup ratio as the sum(bytesin) /
sum(bytesout) for all backups.  Since compression and traditional
skipping of unmodified files are also counted here as dedup, this ratio
is not a good measure of actual dedup processes.  This ratio is
computed as if every backup was a full backup, so even traditional
backup programs that only skip unmodified files and have no
compression or dedup (plain incremental backups) could show a
reasonably high dedup ratio.  It is computed as the sum of file bytes
checked divided by the sum of backup bytes written over all versions.
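
A quick Python illustration of why this definition inflates the
ratio, reusing the x/y example (illustrative arithmetic only):

# Even a plain incremental backup with no compression or dedup shows a
# "dedup ratio" by this definition.  From the x/y example: 400K checked
# (as if both backups were full) vs 300K actually written.
print(f"{400_000 / 300_000:.2f}:1")   # 1.33:1

# The ratio grows quickly with many mostly-unchanged versions: 42
# nightly backups of a rarely-changing 5 GB VM image check 210 GB but
# write only about 5 GB.
print(f"{42 * 5e9 / 5e9:.0f}:1")      # 42:1 from skipping alone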

               12 MB average archive space per backup for last 5 backups

Average bytes per backup used by arc files in the last 5 backups.  It
includes active and inactive data and encryption overhead.  Another
way to look at this is the sum of the ls -l sizes of the archive files
created in the last 5 backups, divided by 5.  This can be used for
backup space planning.

              1887:1 reduction ratio of backed up files for last 5 backups

Shows the average space reduction for modified files in the last 5
backups.  It can be used for backup space planning and is computed as
avg file bytes saved / avg arc space.  If you are planning to add N
bytes of file system space, you can divide N by the ratio to estimate
how much backup space will be required.  This assumes the new data is
similar to existing backup data, ie, a similar number of bytes are
modified before each backup, the data compresses the same, and dedups
the same.
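
For example, the planning arithmetic with the numbers from the sample
output above (the small difference from the displayed 1887:1 ratio is
rounding in the displayed averages):

# Illustrative planning arithmetic based on the two averages above.
avg_saved_per_backup = 23e9     # average file bytes saved per backup
avg_arc_per_backup = 12e6       # average archive space per backup
print(f"{avg_saved_per_backup / avg_arc_per_backup:.0f}:1")  # ~1917:1

# Estimate backup space needed for 500 GB of new, similar data:
new_bytes = 500e9
print(f"{new_bytes / 1887 / 1e6:.0f} MB")   # ~265 MB of archive space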

              234 MB dedup table current size

The maximum size of the dedup table is set with either the config
variable dedup-mem or the -D backup option.  HB does not create a
maximum size dedup table until it is required.  The size shown here is
the current size of the dedup table, which is often less than the
maximum size.

           9,529,780 dedup table entries

The dedup table contains entries for every unique data block seen when
the backup program is used with the -D option or dedup-mem is set.
When -D0 is used, no entries are added to the dedup table.

                 48% dedup table utilization at current size

The dedup table has a current size and number of entries, displayed
above.  This percentage shows how full the dedup table is *at its
current size*.  When the dedup table is 100% full, it will double in
size, up to the limit of -D or dedup-mem.  Once it has reached this
limit, it will not increase further.  Instead, old dedup data is
removed as space is needed to record dedup data about new blocks.  So
even when the dedup table cannot be expanded, HB is still able to
dedup very well.  Dedup does not get disabled when the table is full,
though some blocks that might have been deduped with a larger table
would have to be saved again.  The backup program displays a better
statistic for the dedup table (% of maximum).  That cannot be
displayed here because the -D option can change on every backup.
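
A sketch of the growth rule just described (the names and sizes below
are assumptions for illustration, not HB internals):

# Illustrative sketch: the dedup table doubles when full, up to the
# limit set by -D or dedup-mem, then recycles old entries instead.
def next_table_size(current_bytes, is_full, max_bytes):
    if is_full and current_bytes < max_bytes:
        return min(current_bytes * 2, max_bytes)
    return current_bytes    # at the limit: old entries are evicted

size = 234 * 10**6          # current size from the sample output
cap = 10**9                 # e.g. a 1 GB limit via -D or dedup-mem
print(next_table_size(size, True, cap) // 10**6, "MB")   # 468 MB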

             724,139 files

Files and directories stored in the backup.

             671,701 paths

Pathnames stored in the backup.

          10,637,492 blocks

Blocks stored in the backup.  If a block is referenced by more than
one file because of dedup, it is only counted once.

         127,916,102 block references

Blocks referenced in the backup.  If a block is referenced by more
than one file because of dedup, each reference is counted.

         117,278,610 deduped blocks

Blocks not stored again because they were already in the backup and
found by the dedup feature.  Multiplying this number by the average
block size shows how much backup space was saved by dedup.

              12.0:1 block dedup ratio

Ratio of blocks referenced to unique blocks stored in the backup.
This gives a more reasonable dedup ratio than the "industry standard
dedup ratio", because it is based on data that is stored during
backups rather than all file data, including unmodified files.
Skipping an unmodified file is not traditionally considered dedup.

              5.2 KB average stored block size

Average size of a block as stored in archive files.  This is after
compression, so it is usually different from the backup block size,
which is the size of blocks as they are read.  Computed by dividing
the total archive space by
the number of unique blocks.

              615 GB backup space saved by dedup

Estimated backup space saved by dedup, computed as average stored
block size times deduped blocks.
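
The block statistics above are related by simple arithmetic; here is
a Python cross-check against the sample output (illustrative, with
rounding):

# Cross-checking the block statistics from the sample output.
blocks = 10_637_492         # unique blocks stored
refs = 127_916_102          # block references
deduped = 117_278_610       # equals refs - blocks
archive_space = 55e9        # bytes used by all arc files

print(refs - blocks == deduped)                # True
print(f"{refs / blocks:.1f}:1")                # 12.0:1 block dedup ratio
avg_block = archive_space / blocks
print(f"{avg_block / 1e3:.1f} KB")             # ~5.2 KB avg stored block
print(f"{avg_block * deduped / 1e9:.0f} GB")   # ~606 GB; the output shows
                                               # 615 GB, from unrounded values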

              49,647 average variable-block length

HB supports both variable and fixed-length blocks.  This is the
average length of all variable-length blocks in the backup before
compression.  Since it is data dependent, it may differ between
backups.