Count

Scans one or more directory trees and shows statistics: file counts and space usage by file type, the top N files and directories by size, how the files would be divided into shards, and how file sizes are distributed.

# hb count pathname ... [-d] [-n]
                        [--random P] [--sample P]
                        [--shards N or M-N] [-t N] [-X]

At least one pathname is required. Without options, Count shows pathname statistics with counts and space usage by file type.

Options

-d shows a file size distribution table

-n disables reading file sizes and is 2-3x faster

--random P randomly samples a percentage P of files (by file count, not by size)

--sample P samples a percentage P of files the same way backup does (by file count, not by size)

--shards N shows how files would be divided into N shards, by count and size

--shards M-N shows how files would be sharded for all values M through N

-t N shows the top N files and directories by size

-X crosses filesystem boundaries while scanning

The --sample option is not a true random sample and gives the same results every time. It is designed to be compatible with backup’s --sample option, which samples in this fixed way to support incremental backup testing. Sampling is based on a hash of the filename, so the sampled percentage is not always accurate, especially if many files have the same name. For true random sampling, use the --random option.
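The difference between the two sampling modes can be sketched in Python. This is only an illustration under stated assumptions: the text above says --sample uses a hash of the filename but does not name the hash or how it maps to a percentage, so the SHA-1 bucket scheme and the helper names `hash_sample` and `random_sample` are hypothetical, not HashBackup's implementation.

```python
import hashlib
import random
from pathlib import PurePosixPath


def hash_sample(paths, percent):
    """Deterministic sample: keep a path when a hash of its file name
    lands in one of the first `percent` of 100 buckets.
    (Assumed scheme for illustration; the real hash is not documented here.)"""
    kept = []
    for path in paths:
        name = PurePosixPath(path).name
        digest = hashlib.sha1(name.encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % 100
        if bucket < percent:
            kept.append(path)
    return kept


def random_sample(paths, percent, rng=None):
    """True random sample: each path is kept with probability percent/100."""
    rng = rng or random.Random()
    return [path for path in paths if rng.random() * 100 < percent]


# Hypothetical paths: 1000 files that share one name, 1000 with unique names.
paths = [f"/data/dir{i}/report.txt" for i in range(1000)]
paths += [f"/data/file{i}.bin" for i in range(1000)]

first = hash_sample(paths, 10)
second = hash_sample(paths, 10)
print("hash sample repeatable:", first == second)
print("report.txt files kept:", sum(p.endswith("report.txt") for p in first))
print("random sample size:", len(random_sample(paths, 10)))
```

Because every `report.txt` hashes to the same bucket, the hash-based sample keeps either all 1000 of them or none, which is why the sampled percentage can be off when many files share a name. The random sample varies from run to run instead.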

Example

Count output for an OS X (Mac) /Applications directory. With no options, only the first two sections would be displayed:

$ hb count /Applications -d -t10 --shard 2-4
HashBackup #2619 Copyright 2009-2021 HashBackup, LLC

Paths: 68012
Avg path len: 93
Max path len: 224
Max pathname: /Applications/Carbon Copy Cloner.app/Contents/Library/LoginItems/CCC User Agent.app/Contents/XPCServices/CCC Stats Service.xpc/Contents/Frameworks/Paddle.framework/Versions/A/Resources/PADActivateWindowControllerYosemite.nib

Types:
  dir      |  7054 |
  file     | 60840 |     1 GiB
  symlink  |   118 |     2 KiB

Shard counts:
  2: 30677 50% | 30281 49%
  3: 20490 33% | 20199 33% | 20269 33%
  4: 15380 25% | 15080 24% | 15226 24% | 15272 25%

Shard sizes:
  2: 1.1 GB 58% | 806 MB 41%
  3: 581 MB 29% | 579 MB 29% | 791 MB 40%
  4: 444 MB 22% | 420 MB 21% | 681 MB 34% | 405 MB 20%

Top 10 files:
   270 MB /Applications/Firefox.app/Contents/MacOS/XUL
    44 MB /Applications/Firefox.app/Contents/Resources/browser/omni.ja
    40 MB /Applications/Install Pianoteq 7.app/Contents/Resources/Install Pianoteq 7.pkg.lzma
    34 MB /Applications/iTunes.app/Contents/MacOS/iTunes
    29 MB /Applications/Firefox.app/Contents/Resources/omni.ja
    27 MB /Applications/iTunes.app/Contents/Resources/Assets.car
    25 MB /Applications/Pianoteq 7/Pianoteq 7.app/Contents/Resources/presources.dat
    23 MB /Applications/PDFScanner.app/Contents/Frameworks/libopencv_imgproc.dylib
    21 MB /Applications/PDFScanner.app/Contents/Resources/tesseract/fin.traineddata
    20 MB /Applications/Photos.app/Contents/MacOS/Photos

Top 10 dirs:
   291 MB /Applications/Firefox.app/Contents/MacOS
   136 MB /Applications/Kofax Power PDF for Mac.app/Contents/PlugIns/PDFpenOCR.app/Contents/Frameworks/Nuance-OmniPage-CSDK-RunTime.framework/Versions/A/Resources
    92 MB /Applications/PDFScanner.app/Contents/Frameworks
    70 MB /Applications/PDFScanner.app/Contents/Resources/tesseract
    61 MB /Applications/iTunes.app/Contents/Resources
    50 MB /Applications/Kofax Power PDF for Mac.app/Contents/PlugIns/PDFpenOCR.app/Contents/Frameworks/Nuance-OmniPage-CSDK-RunTime.framework/Versions/A/Libraries
    44 MB /Applications/Firefox.app/Contents/Resources/browser
    40 MB /Applications/Install Pianoteq 7.app/Contents/Resources
    34 MB /Applications/iTunes.app/Contents/MacOS
    31 MB /Applications/Firefox.app/Contents/Resources

File size distribution:
Cumm Count |  Pct | FILE COUNT |~Cumm Count | ~Pct | FILE SIZE | Cumm Space |  Pct | SPACE USED |~Cumm Space | ~Pct
-----------+------+------------+------------+------+-----------+------------+------+------------+------------+-----
[1]  35725 |  58% |      35725 |      60958 | 100% |    <1 KiB |     11 MiB |   0% |     11 MiB |      1 GiB | 100%
     42029 |  68% |       6304 |      25233 |  41% |     1 KiB |     20 MiB |   1% |      8 MiB |      1 GiB |  99%
     46778 |  76% |       4749 |      18929 |  31% |     2 KiB |     33 MiB |   1% |     13 MiB |      1 GiB |  98%
[3]  50426 |  82% |       3648 |      14180 |  23% |     4 KiB |     53 MiB |   2% |     20 MiB |      1 GiB |  98%
     54346 |  89% |       3920 |      10532 |  17% |     8 KiB |     96 MiB |   5% |     43 MiB |      1 GiB |  97%
     57155 |  93% |       2809 |       6612 |  10% |    16 KiB |    157 MiB |   8% |     61 MiB |      1 GiB |  94%
     58964 |  96% |       1809 |       3803 |   6% |    32 KiB |    236 MiB |  12% |     78 MiB |      1 GiB |  91%
     59547 |  97% |        583 |       1994 |   3% |    64 KiB |    285 MiB |  15% |     48 MiB |      1 GiB |  87%
     60055 |  98% |        508 |       1411 |   2% |   128 KiB |    372 MiB |  20% |     87 MiB |      1 GiB |  84%
     60463 |  99% |        408 |        903 |   1% |   256 KiB |    528 MiB |  28% |    155 MiB |      1 GiB |  79%
     60724 |  99% |        261 |        495 |   0% |   512 KiB |    713 MiB |  38% |    184 MiB |      1 GiB |  71%
     60861 |  99% |        137 |        234 |   0% |     1 MiB |    894 MiB |  48% |    181 MiB |      1 GiB |  61%
     60902 |  99% |         41 |         97 |   0% |     2 MiB |   1017 MiB |  54% |    123 MiB |    967 MiB |  51%
[2]  60936 |  99% |         34 |         56 |   0% |     4 MiB |      1 GiB |  64% |    186 MiB |    843 MiB |  45%
     60945 |  99% |          9 |         22 |   0% |     8 MiB |      1 GiB |  69% |     90 MiB |    657 MiB |  35%
     60954 |  99% |          9 |         13 |   0% |    16 MiB |      1 GiB |  80% |    195 MiB |    567 MiB |  30%
     60957 |  99% |          3 |          4 |   0% |    32 MiB |      1 GiB |  86% |    113 MiB |    371 MiB |  19%
     60957 |  99% |            |          1 |   0% |    64 MiB |      1 GiB |  86% |            |    257 MiB |  13%
     60957 |  99% |            |          1 |   0% |   128 MiB |      1 GiB |  86% |            |    257 MiB |  13%
[4]  60958 | 100% |          1 |          1 |   0% |   256 MiB |      1 GiB | 100% |    257 MiB |    257 MiB |  13%

68012 names in 3s, 17414 names/s

The file size distribution table is easier to understand by focusing on the three main columns:

  • FILE SIZE (col 6) is a range of file sizes, from this size to the next size

  • FILE COUNT (col 3) is the number of files that are in this size range

  • SPACE USED (col 9) is the total bytes used by all files in this size range
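As a rough sketch of how files could be grouped into these size ranges, the Python below assigns each file to a power-of-two bucket and tallies FILE COUNT and SPACE USED per bucket. The bucket boundaries are inferred from the sample table (each row starts at the size shown and runs up to the next row's size); the function name `bucket_label` and the file sizes are hypothetical.

```python
from collections import Counter

KIB = 1024


def bucket_label(size_bytes):
    """Return the FILE SIZE label of the bucket holding a file of this size."""
    if size_bytes < KIB:
        return "<1 KiB"
    # Largest power of two (in KiB) that does not exceed the file size.
    kib = 1 << ((size_bytes // KIB).bit_length() - 1)
    if kib >= KIB:
        return f"{kib // KIB} MiB"
    return f"{kib} KiB"


# Hypothetical file sizes in bytes, for illustration only.
sizes = [0, 700, 1024, 2047, 5000, 300 * 1024 * 1024]

file_count = Counter()   # FILE COUNT per bucket
space_used = Counter()   # SPACE USED per bucket, in bytes
for size in sizes:
    label = bucket_label(size)
    file_count[label] += 1
    space_used[label] += size

for label in file_count:
    print(f"{label:>8} | {file_count[label]:>3} files | {space_used[label]:>10} bytes")
```

A 5000-byte file lands in the 4 KiB row here because it is at least 4 KiB but less than 8 KiB, matching the "from this size to the next size" reading of the FILE SIZE column.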

To the left and right of FILE COUNT and SPACE USED are cumulative counts and percentages. The left set of counts and percentages accumulates from the top down, while the right set, marked with a leading ~ in the column heading, accumulates from the bottom up.
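The two directions of accumulation can be reproduced from the FILE COUNT column alone. The sketch below uses the FILE COUNT values from the sample table above (blank cells taken as 0) and rebuilds Cumm Count, Pct, ~Cumm Count, and ~Pct. Truncating percentages with integer division is an assumption, chosen because it matches the sample output (for example, 35725 of 60958 is shown as 58%).

```python
from itertools import accumulate

# FILE COUNT column from the sample table, top to bottom (blank cells = 0).
file_counts = [35725, 6304, 4749, 3648, 3920, 2809, 1809, 583, 508, 408,
               261, 137, 41, 34, 9, 9, 3, 0, 0, 1]

total = sum(file_counts)

# Left-hand columns: running total from the top row down.
cumm = list(accumulate(file_counts))

# Right-hand (~) columns: running total from the bottom row up.
rev_cumm = list(accumulate(reversed(file_counts)))[::-1]

print("Cumm Count |  Pct | FILE COUNT |~Cumm Count | ~Pct")
for count, down, up in zip(file_counts, cumm, rev_cumm):
    pct_down = down * 100 // total   # assumed: percentages are truncated
    pct_up = up * 100 // total
    print(f"{down:>10} | {pct_down:>3}% | {count:>10} | {up:>10} | {pct_up:>3}%")
```

Reading across one row: the left total counts files in this row and all rows above it, while the ~ total counts files in this row and all rows below it, so the two always overlap by the row's own FILE COUNT.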

Some observations about /Applications from this table:

[1] from line 1 col 2 (Pct), 58% of files are smaller than 1 KiB; the Cumm Space column shows that altogether they occupy around 11 MiB

[2] from line 14 col 4 (~Cumm Count), only 56 files are 4 MiB or larger; from the last column (~Pct), they occupy 45% of the space in /Applications

[3] in line 4, the 1st column (Cumm Count) shows there are 50,426 files smaller than 8 KiB (the FILE SIZE in the next line); from col 2 (Pct), this is 82% of the total file count, but only 2% (the Pct column after Cumm Space) of the space, at 53 MiB (Cumm Space)

[4] the last line shows there is 1 file (FILE COUNT) of 256 MiB or more (FILE SIZE); it uses 257 MiB (SPACE USED), which is 13% (~Pct in the last column) of the space in /Applications