Stats
Displays statistics about a backup.
$ hb stats [-c backupdir] [-v] [--xmeta [size]]
Options
-v
displays notes to explain each statistic in detail.
--xmeta [size]
displays files with extended metadata larger than size; the default
for size is 100K. This is useful if the regular stats output shows
that a large percentage of database space is being used by extended
metadata, which includes ACLs, symbolic links, and extended
attributes. Some systems such as OSX can store large amounts of data
in extended attributes, from 100K to several MB, and often these
files are not important for backups. This option helps identify these
files so they can be added to inex.conf and excluded.
Example
This is from the HB development server, an OSX system with 5
or 6 virtual machine images of 5GB each, 6 or 7 1GB virtual disk
images, and the usual OSX system software. This system has been
backed up every night since June of 2011, and retain is run every
night after the backup with -s30d12m (keep the last 30 days of
backups plus 1 backup each month), so there are 42 versions of all
VM images.
$ hb stats -c /hbbackup
HashBackup #2677 Copyright 2009-2022 HashBackup, LLC
Backup directory: /hbbackup
3,671 backups
338 TB file bytes checked since initial backup
81 TB file bytes saved since initial backup
923 total backup hours
73 GB average file bytes checked per backup in last 5 backups
23 GB average file bytes saved per backup in last 5 backups
32.32% average changed data percentage per backup in last 5 backups (1)
13m 38s average backup time for last 5 backups
654 archives
56 GB archive space
84.56% archive space utilization 47 GB
814:1 industry standard dedup ratio
79 MB average archive space per backup for last 5 backups
301:1 reduction ratio of backed up files for last 5 backups
234 MB dedup table current size
9,873,443 dedup table entries
50% dedup table utilization at current size
725,045 files
672,124 paths
7,867 extended metadata entries (2)
1.2 GB database size
70 MB space used by files (6%)
36 MB space used by paths (3%)
1.0 GB space used by blocks (87%)
1.5 MB space used by extended metadata (0%)
461 KB largest extended metadata entry
757 MB database space utilization (64%)
10,981,226 blocks
128,204,884 block references
117,223,658 deduped blocks
11.7:1 block dedup ratio (3)
5.1 KB average stored block size (4)
599 GB backup space saved by dedup
49 KB average variable-block length
(1) The change rate is high because if anything in a VM image changes, the whole file is counted as changed.
(2) --xmeta shows details for extended metadata.
(3) Without dedup, this backup would be around 12x larger.
(4) For VM images to dedup well, a very small 4K block size must be used. Since this backup is mostly VM images, the average block size is 5.1K.
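Footnote 3 can be checked against the block counts in the output above. The Python sketch below only redoes that arithmetic; the variable names are made up for illustration and are not part of HB. In this example, the deduped block count equals block references minus unique blocks.

```python
# Figures copied from the example output above.
blocks = 10_981_226             # unique blocks stored
block_references = 128_204_884  # every reference to a block, shared or not

# A reference that did not need a newly stored block was satisfied by dedup.
deduped_blocks = block_references - blocks

# Block dedup ratio: references per unique stored block.
block_dedup_ratio = block_references / blocks

print(f"{deduped_blocks:,} deduped blocks")
print(f"{block_dedup_ratio:.1f}:1 block dedup ratio")
```

The two printed values match the "deduped blocks" and "block dedup ratio" lines of the example.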
-v Example
$ hb stats -c /hbbackup -v
HashBackup #2677 Copyright 2009-2022 HashBackup, LLC
Backup directory: /hbbackup
3,671 backups
The number of backups stored. Removed versions are not counted.
338 TB file bytes checked since initial backup
Total bytes that have been presented to the backup program, also
sometimes called "bytes in" in other backup programs. Removed
versions are not counted. For example, if a directory is saved that
contains a 100K file x, bytes checked would be 100K for version 0. If
a 200K file y is added and the directory is backed up again, bytes
checked would be 300K for version 1. Bytes checked since initial
backup is the sum of bytes checked for each version, or 400K in this
example. Another way to look at this is the sum of file sizes for all
backups if each had been a full backup. This is used to compute the
"industry standard dedup ratio" below.
81 TB file bytes saved since initial backup
Total bytes saved by the backup program, beginning with the first
backup. Removed versions are not counted. Another way to look at
this is the total bytes read from the filesystem for all backups. For
the previous example, if 100K file x was backed up in version 0, then
200K file y was added and backed up in version 1 (x was not modified),
file bytes saved since initial backup would be 300K.
923 total backup hours
Time spent on all backups.
73 GB average file bytes checked per backup in last 5 backups
Average bytes presented to the backup program per backup in the last 5
backups.
23 GB average file bytes saved per backup in last 5 backups
Average bytes saved by the backup program per backup in the last 5
backups. Some of these bytes may no longer be stored if files have
been removed. For the previous example, file bytes saved is 100K for
version 0, 200K for version 1, and the average would be 150K.
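The file x / file y example used in the notes above can be replayed in a few lines of Python. This is a simplified model, not how HB tracks files: each version is a hypothetical dict of file name to (size, modification stamp), "100K" is taken as 100,000 bytes, and a file counts as saved when it is new or differs from the previous version.

```python
K = 1000  # assumption: treat "100K" as 100,000 bytes for illustration

# Each version maps a file name to (size in bytes, modification stamp).
versions = [
    {"x": (100 * K, 1)},                     # version 0: only x exists
    {"x": (100 * K, 1), "y": (200 * K, 2)},  # version 1: y added, x unchanged
]

checked_per_version = []
saved_per_version = []
previous = {}
for files in versions:
    # Bytes checked: every file presented to the backup, changed or not.
    checked_per_version.append(sum(size for size, _ in files.values()))
    # Bytes saved: only files that are new or differ from the previous version.
    saved_per_version.append(
        sum(size for name, (size, stamp) in files.items()
            if previous.get(name) != (size, stamp))
    )
    previous = files

total_checked = sum(checked_per_version)      # 100K + 300K
total_saved = sum(saved_per_version)          # 100K + 200K
average_saved = total_saved / len(versions)   # average per backup

print(total_checked, total_saved, average_saved)
```

The totals come out to 400K checked, 300K saved, and an average of 150K saved per backup, the same figures given in the notes.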
32.32% average changed data percentage per backup in last 5 backups
Average percentage of data that was changed (and saved) by the backup
program per backup in the last 5 backups.
13m 38s average backup time for last 5 backups
Average backup time for the most recent 5 backups.
654 archives
The number of arc files. Arc files may all be local, all remote, or
some combination of local and remote. Arc files are only counted
once, whether local, remote, or on several remotes.
56 GB archive space
Bytes used by all committed arc files. Some of these bytes may be
inactive (deleted) if files have been removed or retain has been run
and the archive has not been packed. This is also the sum of ls -l
for all committed archive files, whether local or remote. It
includes deleted bytes and encryption overhead required for each
block. There may be archive files present in the backup directory
that have not been committed; these will be deleted at the beginning
of the next backup.
84.56% archive space utilization 47 GB
Shows how tightly "packed" the archive files are: higher percentages
indicate better space utilization. 100% means there are few/no deleted
blocks in the archive. If the percentage is low, indicating there is
a lot of free / unused space in the archives, and you want to recover
it, you could increase the pack-percent-free config variable to cause
archives to be packed sooner. However, packing archives more often
will take more time and packed archives have to be uploaded again. To
pack a remote archive, it must be downloaded, packed, and then
uploaded; so there is a trade-off between disk space savings and
download + upload bandwidth costs + processing cost. The default for
pack-percent-free is 50: archives are packed when they have 50% or
more free space.
814:1 industry standard dedup ratio
This "dedup" figure is displayed to allow comparing HB with other
backup programs that compute their dedup ratio as the sum(bytesin) /
sum(bytesout) for all backups. Since compression and traditional
skipping of unmodified files are also counted here as dedup, this ratio
is not a good measure of actual dedup processes. This ratio is
computed as if every backup were a full backup, so even traditional
backup programs that only skip unmodified files and have neither
compression nor dedup (plain incremental backups) could show a
reasonably high dedup ratio. It is computed as the sum of file bytes
checked / backup bytes written for every version.
79 MB average archive space per backup for last 5 backups
Average bytes per backup used by arc files in the last 5 backups. It
includes active and inactive data and encryption overhead. Another
way to look at this is the sum of the ls -l size of all archive files
in recent backups. This can be used for backup space planning.
301:1 reduction ratio of backed up files for last 5 backups
Shows the average space reduction for modified files in the last 5
backups. It can be used for backup space planning and is computed as
avg file bytes stored / avg arc space. If you are planning to add N
bytes of file system space, you can divide N by the ratio to estimate
how much backup space will be required. This assumes the new data is
similar to existing backup data, i.e., a similar number of bytes are
modified before each backup, and the data compresses and dedups the
same.
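The space-planning rule in the note above is a single division. The sketch below is only an illustration: the function names are hypothetical, decimal units (1 GB = 1000^3 bytes) are assumed, "avg file bytes stored" is assumed to be the "average file bytes saved" statistic, and 500 GB is an arbitrary amount of new data.

```python
GB = 1000**3  # assumption: decimal units
MB = 1000**2

def reduction_ratio(avg_file_bytes_saved, avg_archive_bytes):
    """Average file bytes saved per backup divided by average archive space."""
    return avg_file_bytes_saved / avg_archive_bytes

def estimate_backup_space(new_bytes, ratio):
    """Divide the planned new file system bytes by the reduction ratio."""
    return new_bytes / ratio

# Recomputing the ratio from the rounded figures in the example (23 GB, 79 MB)
# only approximates the displayed 301:1, since both inputs are rounded.
approx_ratio = reduction_ratio(23 * GB, 79 * MB)

# Planning estimate for a hypothetical 500 GB of similar new data,
# using the displayed ratio.
estimate = estimate_backup_space(500 * GB, 301)

print(f"{approx_ratio:.0f}:1 from rounded figures")
print(f"{estimate / GB:.2f} GB estimated backup space")
```

The estimate is only as good as the assumption stated in the note: the new data must change, compress, and dedup like the data already in the backup.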
234 MB dedup table current size
The maximum size of the dedup table is set with either the config
variable dedup-mem or the -D backup option. HB does not create a
maximum size dedup table until it is required. The size shown here is
the current size of the dedup table, which is often less than the
maximum size.
9,873,443 dedup table entries
The dedup table contains entries for every unique data block seen when
the backup program is used with the -D option or dedup-mem is set.
When -D0 is used, no entries are added to the dedup table.
50% dedup table utilization at current size
The dedup table has a current size and number of entries, displayed
above. This percentage shows how full the dedup table is *at its
current size*. When the dedup table is 100% full, it will double in
size, up to the limit of -D or dedup-mem. Once it has reached this
limit, it will not increase further. Instead, old dedup data is
removed as space is needed to record dedup data about new blocks. So
even when the dedup table cannot be expanded, HB is still able to
dedup very well. Dedup does not get disabled when the table is full,
though some blocks that might have been deduped with a larger table
would have to be saved again. The backup program displays a better
statistic for the dedup table (% of maximum). That cannot be
displayed here because the -D option can change on every backup.
725,045 files
Files and directories stored in the backup.
672,124 paths
Pathnames stored in the backup.
7,867 extended metadata entries
Symbolic links, ACLs, and extended attributes stored in the backup.
1.2 GB database size
Size of the hb.db database.
70 MB space used by files (6%)
Amount of database space used by files and the percentage of total
database size.
36 MB space used by paths (3%)
Amount of database space used by pathnames and the percentage of total
database size.
1.0 GB space used by blocks (87%)
Amount of database space used by blocks and the percentage of total
database size.
1.5 MB space used by extended metadata (0%)
Amount of database space used by symbolic links, ACLs, and extended
attributes and the percentage of total database size.
461 KB largest extended metadata entry
Size of the largest symbolic link, ACL, or extended attribute.
757 MB database space utilization (64%)
Amount of database space used by backup data (the rest is free space)
and the percentage of total database size.
10,981,226 blocks
Blocks stored in the backup. If a block is referenced by more than
one file because of dedup, it is only counted once.
128,204,884 block references
Blocks referenced in the backup. If a block is referenced by more
than one file because of dedup, each reference is counted.
117,223,658 deduped blocks
Blocks not stored again because they were already in the backup and
found by the dedup feature. Multiplying this number by the average
block size shows how much backup space was saved by dedup.
11.7:1 block dedup ratio
Ratio of blocks referenced to unique blocks stored in the backup.
This gives a more reasonable dedup ratio than the "industry standard
dedup ratio", because it is based on data that is stored during
backups rather than all file data, including unmodified files.
Skipping an unmodified file is not traditionally considered dedup.
5.1 KB average stored block size
Average size of a block as stored in archive files. This is after
compression, so it is usually different from the backup block size,
which is a read block size. Computed by dividing the total archive
space by the number of unique blocks.
599 GB backup space saved by dedup
Estimated backup space saved by dedup, computed as average stored
block size times deduped blocks.
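The two notes above describe simple arithmetic that can be rechecked from the example figures. The sketch below assumes decimal units (1 KB = 1000 bytes, 1 GB = 1000^3 bytes) and starts from the rounded "56 GB archive space" figure, so the result only approximates the 599 GB that HB computed from exact values.

```python
KB = 1000      # assumption: decimal units
GB = 1000**3

# Figures copied from the example output.
archive_bytes = 56 * GB        # "56 GB archive space" (rounded)
unique_blocks = 10_981_226     # "blocks"
deduped_blocks = 117_223_658   # "deduped blocks"

# Average stored block size: total archive space / unique blocks.
avg_stored_block = archive_bytes / unique_blocks

# Space saved by dedup: average stored block size * deduped blocks.
space_saved = avg_stored_block * deduped_blocks

print(f"{avg_stored_block / KB:.1f} KB average stored block size")
print(f"{space_saved / GB:.0f} GB saved by dedup (approximate)")
```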
49 KB average variable-block length
HB supports both variable and fixed-length blocks. This is the
average length of all variable-length blocks in the backup before
compression. Since it is data dependent, it may differ between
backups.
--xmeta Example
$ hb stats -c /hbbackup --xmeta
HashBackup #2677 Copyright 2009-2022 HashBackup, LLC
Backup directory: /hbbackup
Showing files with more than 100 KB extended metadata
Metadata is compressed so size may differ from filesystem
Columns are: cummulative space %, space for this path, pathname
35% 461 KB /Developer/Applications/Performance Tools/ObjectAlloc.tracetemplate
59% 313 KB /Developer/Applications/Performance Tools/Sampler.tracetemplate
The remaining extended metadata space is used by files with smaller metadata.
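To review the reported files before deciding what to exclude, the --xmeta listing can be parsed with a small script. This is a sketch that assumes each data line has the form shown in the example (cumulative percent, size with a unit, pathname); the regular expression and the function name parse_xmeta are illustrative and not part of HB.

```python
import re

# One data line: "<percent>% <size> <unit> <pathname>"; pathnames may contain spaces.
LINE = re.compile(r"^\s*(\d+)%\s+([\d.]+)\s+(KB|MB|GB)\s+(.+?)\s*$")

def parse_xmeta(output):
    """Return (cumulative_percent, size, unit, path) for each data line."""
    rows = []
    for line in output.splitlines():
        match = LINE.match(line)
        if match:  # header and trailer lines do not match and are skipped
            pct, size, unit, path = match.groups()
            rows.append((int(pct), float(size), unit, path))
    return rows

# Sample text copied from the example above.
sample = """\
Showing files with more than 100 KB extended metadata
35% 461 KB /Developer/Applications/Performance Tools/ObjectAlloc.tracetemplate
59% 313 KB /Developer/Applications/Performance Tools/Sampler.tracetemplate
The remaining extended metadata space is used by files with smaller metadata.
"""

for pct, size, unit, path in parse_xmeta(sample):
    print(path)
```

The resulting pathnames are candidates to review before adding exclusions to inex.conf.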