Stats

Displays statistics about a backup.

$ hb stats [-c backupdir] [-v] [--xmeta [size]]

Options

-v displays notes to explain each statistic in detail.

--xmeta [size] displays files with extended metadata larger than size; the default for size is 100K. This is useful if the regular stats output shows that a large percentage of database space is being used by extended metadata, which includes ACLs, symbolic links, and extended attributes. Some systems such as OSX can store large amounts of data in extended attributes, from 100K to several MB, and often these files are not important for backups. This option helps identify such files so they can be added to inex.conf and excluded.
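
For example, to list only files with more than 500K of extended metadata (this assumes the size argument accepts suffixed values like the 100K default; check the command help for the exact format):

$ hb stats -c /hbbackup --xmeta 500K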

Example

This is from the HB development server, an OSX system with 5 or 6 virtual machine images of 5GB each, 6 or 7 1GB virtual disk images, and the usual OSX system software. This system has been backed up every night since June 2011, and retain is run every night after the backup with -s30d12m (keep the last 30 days of backups plus 1 backup a month for 12 months), so there are 42 versions of all VM images.

$ hb stats -c /hbbackup
HashBackup #2677 Copyright 2009-2022 HashBackup, LLC
Backup directory: /hbbackup

               3,671 backups
              338 TB file bytes checked since initial backup
               81 TB file bytes saved since initial backup
                 923 total backup hours
               73 GB average file bytes checked per backup in last 5 backups
               23 GB average file bytes saved per backup in last 5 backups
              32.32% average changed data percentage per backup in last 5 backups (1)
             13m 38s average backup time for last 5 backups
                 654 archives
               56 GB archive space
              84.56% archive space utilization 47 GB
               814:1 industry standard dedup ratio
               79 MB average archive space per backup for last 5 backups
               301:1 reduction ratio of backed up files for last 5 backups
              234 MB dedup table current size
           9,873,443 dedup table entries
                 50% dedup table utilization at current size
             725,045 files
             672,124 paths
               7,867 extended metadata entries (2)
              1.2 GB database size
               70 MB space used by files (6%)
               36 MB space used by paths (3%)
              1.0 GB space used by blocks (87%)
              1.5 MB space used by extended metadata (0%)
              461 KB largest extended metadata entry
              757 MB database space utilization (64%)
          10,981,226 blocks
         128,204,884 block references
         117,223,658 deduped blocks
              11.7:1 block dedup ratio (3)
              5.1 KB average stored block size (4)
              599 GB backup space saved by dedup
               49 KB average variable-block length
(1) The change rate is high because if anything in the VM image changes, the whole file is counted as changed.
(2) --xmeta shows details for extended metadata.
(3) Without dedup this backup would be around 12x larger.
(4) For VM images to dedup well, a very small 4K block size must be used. Since this backup is mostly VM images, the average block size is 5.1K.

-v Example

$ hb stats -c /hbbackup -v
HashBackup #2677 Copyright 2009-2022 HashBackup, LLC
Backup directory: /hbbackup

               3,671 backups

The number of backups stored.  Removed versions are not counted.

              338 TB file bytes checked since initial backup

Total bytes that have been presented to the backup program, also
sometimes called "bytes in" in other backup programs.  Removed
versions are not counted.  For example, if a directory is saved that
contains a 100K file x, bytes checked would be 100K for version 0.  If
a 200K file y is added and the directory is backed up again, bytes
checked would be 300K for version 1.  Bytes checked since initial
backup is the sum of bytes checked for each version, or 400K in this
example.  Another way to look at this is the sum of file sizes for all
backups if each had been a full backup.  This is used to compute the
"industry standard dedup ratio" below.

               81 TB file bytes saved since initial backup

Total bytes saved by the backup program, beginning with the first
backup.  Removed versions are not counted.  Another way to look at
this is the total bytes read from the filesystem for all backups.  For
the previous example, if 100K file x was backed up in version 0, then
200K file y was added and backed up in version 1 (x was not modified),
file bytes saved since initial backup would be 300K.

                 923 total backup hours

Time spent on all backups.

               73 GB average file bytes checked per backup in last 5 backups

Average bytes presented to the backup program per backup in the last 5
backups.

               23 GB average file bytes saved per backup in last 5 backups

Average bytes saved by the backup program per backup in the last 5
backups.  Some of these bytes may no longer be stored if files have
been removed.  For the previous example, file bytes saved is 100K for
version 0, 200K for version 1, and the average would be 150K.

              32.32% average changed data percentage per backup in last 5 backups

Average percentage of data that was changed (and saved) by the backup
program per backup in the last 5 backups.

             13m 38s average backup time for last 5 backups

Average backup time for the most recent 5 backups.

                 654 archives

The number of arc files.  Arc files may all be local, all remote, or
some combination of local and remote.  Arc files are only counted
once, whether local, remote, or on several remotes.

               56 GB archive space

Bytes used by all committed arc files.  Some of these bytes may be
inactive (deleted) if files have been removed or retain has been run
and the archive has not been packed.  This is also the sum of ls -l
for all committed archive files, both local and remote.  It
includes deleted bytes and encryption overhead required for each
block.  There may be archive files present in the backup directory
that have not been committed; these will be deleted at the beginning
of the next backup.

              84.56% archive space utilization 47 GB

Shows how tightly "packed" the archive files are: higher percentages
indicate better space utilization. 100% means there are few/no deleted
blocks in the archive.  If the percentage is low, indicating there is
a lot of free / unused space in the archives, and you want to recover
it, you could lower the pack-percent-free config variable to cause
archives to be packed sooner.  However, packing archives more often
will take more time and packed archives have to be uploaded again.  To
pack a remote archive, it must be downloaded, packed, and then
uploaded; so there is a trade-off between disk space savings and
download + upload bandwidth costs + processing cost.  The default for
pack-percent-free is 50: archives are packed when they have 50% or
more free space.
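
For example, to pack archives when they have 25% or more free space
instead of the default 50%, the threshold could be lowered with the
config command (a sketch; see the config documentation for details):

$ hb config -c /hbbackup pack-percent-free 25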

               814:1 industry standard dedup ratio

This "dedup" figure is displayed to allow comparing HB with other
backup programs that compute their dedup ratio as the sum(bytesin) /
sum(bytesout) for all backups.  Since compression and traditional
skipping of unmodified files are also counted here as dedup, this ratio
is not a good measure of actual dedup processes.  This ratio is
computed as if every backup was a full backup, so even traditional
backup programs that only skip unmodified files and have neither
compression nor dedup (plain incremental backups) could show a
reasonably high dedup ratio.  It is computed as the total file bytes
checked across all versions divided by the total backup bytes written.
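
Using the example above: the 338 TB of file bytes checked is the
numerator, so the 814:1 ratio implies roughly 338 TB / 814, or about
415 GB, of backup bytes written in total (a figure that is not
displayed directly).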

               79 MB average archive space per backup for last 5 backups

Average bytes per backup used by arc files in the last 5 backups.  It
includes active and inactive data and encryption overhead.  Another
way to look at this is the sum of the ls -l size of all archive files
in recent backups.  This can be used for backup space planning.

               301:1 reduction ratio of backed up files for last 5 backups

Shows the average space reduction for modified files in the last 5
backups.  It can be used for backup space planning and is computed as
avg file bytes saved / avg arc space.  If you are planning to add N
bytes of file system space, you can divide N by the ratio to estimate
how much backup space will be required.  This assumes the new data is
similar to existing backup data, ie, a similar number of bytes are
modified before each backup, the data compresses the same, and dedups
the same.
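
As a rough planning example using the 301:1 ratio above: adding 1 TB
of similar data to the filesystem would need about 1 TB / 301, or
roughly 3.3 GB, of additional backup space.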

              234 MB dedup table current size

The maximum size of the dedup table is set with either the config
variable dedup-mem or the -D backup option.  HB does not create a
maximum size dedup table until it is required.  The size shown here is
the current size of the dedup table, which is often less than the
maximum size.
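
For example, a 1 GB maximum dedup table could be requested on the
backup command line (a sketch; see the backup documentation for the
exact -D and dedup-mem syntax):

$ hb backup -c /hbbackup -D1g /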

           9,873,443 dedup table entries

The dedup table contains entries for every unique data block seen when
the backup program is used with the -D option or dedup-mem is set.
When -D0 is used, no entries are added to the dedup table.

                 50% dedup table utilization at current size

The dedup table has a current size and number of entries, displayed
above.  This percentage shows how full the dedup table is *at its
current size*.  When the dedup table is 100% full, it will double in
size, up to the limit of -D or dedup-mem.  Once it has reached this
limit, it will not increase further.  Instead, old dedup data is
removed as space is needed to record dedup data about new blocks.  So
even when the dedup table cannot be expanded, HB is still able to
dedup very well.  Dedup does not get disabled when the table is full,
though some blocks that might have been deduped with a larger table
would have to be saved again.  The backup program displays a better
statistic for the dedup table (% of maximum).  That cannot be
displayed here because the -D option can change on every backup.

             725,045 files

Files and directories stored in the backup.

             672,124 paths

Pathnames stored in the backup.

               7,867 extended metadata entries

Symbolic links, ACLs, and extended attributes stored in the backup.

              1.2 GB database size

Size of the hb.db database.

               70 MB space used by files (6%)

Amount of database space used by files and the percentage of total
database size.

               36 MB space used by paths (3%)

Amount of database space used by pathnames and the percentage of total
database size.

              1.0 GB space used by blocks (87%)

Amount of database space used by blocks and the percentage of total
database size.

              1.5 MB space used by extended metadata (0%)

Amount of database space used by symbolic links, ACLs, and extended
attributes and the percentage of total database size.

              461 KB largest extended metadata entry

Size of the largest symbolic link, ACL, or extended attribute.

              757 MB database space utilization (64%)

Amount of database space used by backup data (the rest is free space)
and the percentage of total database size.

          10,981,226 blocks

Blocks stored in the backup.  If a block is referenced by more than
one file because of dedup, it is only counted once.

         128,204,884 block references

Blocks referenced in the backup.  If a block is referenced by more
than one file because of dedup, each reference is counted.

         117,223,658 deduped blocks

Blocks not stored again because they were already in the backup and
found by the dedup feature.  Multiplying this number by the average
block size shows how much backup space was saved by dedup.

              11.7:1 block dedup ratio

Ratio of blocks referenced to unique blocks stored in the backup.
This gives a more reasonable dedup ratio than the "industry standard
dedup ratio", because it is based on data that is stored during
backups rather than all file data, including unmodified files.
Skipping an unmodified file is not traditionally considered dedup.
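
Using the figures above: 128,204,884 block references divided by
10,981,226 unique blocks is about 11.7, which gives the 11.7:1 ratio
shown.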

              5.1 KB average stored block size

Average size of a block as stored in archive files.  This is after
compression, so it is usually different from the backup block size,
which is the size of blocks as they are read from files.  Computed by
dividing the total archive space by
the number of unique blocks.
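
Using the figures above: 56 GB of archive space divided by 10,981,226
unique blocks is about 5.1 KB per stored block.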

              599 GB backup space saved by dedup

Estimated backup space saved by dedup, computed as average stored
block size times deduped blocks.
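
Using the figures above: 117,223,658 deduped blocks times the 5.1 KB
average stored block size is roughly 598 GB; the small difference
from the displayed 599 GB is due to rounding of the average block
size.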

               49 KB average variable-block length

HB supports both variable and fixed-length blocks.  This is the
average length of all variable-length blocks in the backup before
compression.  Since it is data dependent, it may differ between
backups.

--xmeta Example

$ hb stats -c /hbbackup --xmeta
HashBackup #2677 Copyright 2009-2022 HashBackup, LLC
Backup directory: /hbbackup

Showing files with more than 100 KB extended metadata
Metadata is compressed so size may differ from filesystem
Columns are: cumulative space %, space for this path, pathname

 35%  461 KB  /Developer/Applications/Performance Tools/ObjectAlloc.tracetemplate
 59%  313 KB  /Developer/Applications/Performance Tools/Sampler.tracetemplate

The remaining extended metadata space is used by files with smaller metadata.
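
If files like these are not needed in the backup, they can be
excluded by adding lines to inex.conf in the backup directory, for
example (this assumes the usual ex pathname syntax described in the
inex.conf documentation):

ex /Developer/Applications/Performance Tools/ObjectAlloc.tracetemplate
ex /Developer/Applications/Performance Tools/Sampler.tracetemplate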