Finds files and directories with large amounts of unshared data and either generates a report or starts an interactive session to delete files and reduce the backup size.
$ hb trim [-c backupdir] [-i] [-p] [-n toppercent] [-s skipfilesize]
Without options, trim generates a report with 3 sections:
versions using more than average backup space
files using more than average backup space
directories using more than average backup space
-i starts an interactive session to remove files and/or exclude
them from the backup. Without -i, trim generates a report ordered by
files and directories using the most backup space.
-n toppercent percentage of paths to check, default is 50 for the top 50%
by file size. Trim starts faster with lower percentages.
-p sorts by pathname instead of backup space
-s skipfilesize ignores files that are smaller than skipfilesize or have
less unique data than that. -s10g skips files < 10 GB; -s1 is the same as
-s1m and skips all files < 1 MB. The default is to skip
files below the average file size (or 1 MB, whichever is larger).
Trim starts faster when -s is larger.
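The -s rules can be sketched in a few lines of Python. This only illustrates the behavior described above and is not HashBackup's code: the function names are hypothetical, only the m and g suffixes mentioned in the text are handled, a bare number is assumed to mean megabytes, and units are assumed to be decimal (1 MB = 1,000,000 bytes).

```python
def parse_skip_size(arg: str) -> int:
    """Convert an -s value such as '10g', '1m', or '1' to bytes.

    Assumptions for this sketch: a bare number is treated as
    megabytes (so '1' equals '1m') and units are decimal.
    """
    units = {"m": 10**6, "g": 10**9}
    text = arg.strip().lower()
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text) * units["m"]


def default_skip_size(average_file_size: int) -> int:
    """Default threshold: the average file size or 1 MB, whichever is larger."""
    return max(average_file_size, 10**6)


print(parse_skip_size("10g"))
print(parse_skip_size("1") == parse_skip_size("1m"))
print(default_skip_size(31_000))
print(default_skip_size(8_300_000))
```

With an average file size of 31 KB the default threshold is 1 MB, and with an average of 8.3 MB it is 8.3 MB, which agrees with the "files above" lines in the sample reports on this page.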
Trim first scans the entire backup database to find pathnames that have the most unshared data. This is tricky in a deduplicated backup, because different pathnames may reference common data. Deleting just one of those pathnames does not free the shared data, because the other pathnames still use it.
Trim shows versions, files, and directories using the most unshared
backup space. This is a hard problem even for trim! Instead of
showing filesystem file sizes seen in a Unix ls list, trim shows how
much backup space files are using after deduplication and compression.
Trim only reports the unshared space used by individual pathnames. If space is shared between different pathnames the shared space is not counted because freeing it would require deleting multiple files.
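One way to picture unshared space is with reference counts: a stored block counts toward a pathname only if no other pathname references it. The sketch below is a simplified model, not HashBackup's implementation. The block IDs, sizes, and pathnames are hypothetical, and each size stands for the block's space after compression.

```python
from __future__ import annotations

from collections import Counter


def unshared_bytes(paths: dict[str, list[str]],
                   block_sizes: dict[str, int]) -> dict[str, int]:
    """Return, for each pathname, the stored bytes no other pathname references."""
    # Count how many distinct pathnames reference each block.
    refs = Counter()
    for blocks in paths.values():
        refs.update(set(blocks))
    # A block counts toward a path only if that path is its sole user.
    return {
        path: sum(block_sizes[b] for b in set(blocks) if refs[b] == 1)
        for path, blocks in paths.items()
    }


# Hypothetical backup: block "a" is shared, "b" and "c" are not.
block_sizes = {"a": 4_000_000, "b": 2_000_000, "c": 1_000_000}
paths = {
    "/data/report.doc": ["a", "b"],
    "/data/report-copy.doc": ["a"],
    "/data/notes.txt": ["c"],
}
result = unshared_bytes(paths, block_sizes)
for path, size in result.items():
    print(f"{size:>9} {path}")
```

In this model `/data/report.doc` is credited with only the 2 MB block it alone uses. The 4 MB shared block is credited to nobody, since deleting either file on its own would leave that block in use.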
The directory section shows how much unshared backup space each directory is using based on the sum of unshared space used by large files in the directory. Small files are not counted, so the actual space used by the directory is larger than what trim shows.
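The directory totals can be modeled as a roll-up of per-file unshared space into every ancestor directory. This is a sketch under stated assumptions rather than trim's actual algorithm: the function name and sample paths are hypothetical, small files are filtered by unshared bytes only, and the root directory is left out.

```python
from __future__ import annotations

from collections import defaultdict
from pathlib import PurePosixPath


def directory_totals(file_unshared: dict[str, int],
                     skip_size: int) -> dict[str, int]:
    """Sum the unshared bytes of large files into each ancestor directory."""
    root = PurePosixPath("/")
    totals: dict[str, int] = defaultdict(int)
    for path, unshared in file_unshared.items():
        if unshared < skip_size:
            continue  # small files are not counted
        for parent in PurePosixPath(path).parents:
            if parent == root:
                continue  # assumption: the root itself is not reported
            totals[str(parent)] += unshared
    return dict(totals)


# Hypothetical per-file unshared sizes in bytes.
files = {
    "/srv/video/a.mov": 30_000_000,
    "/srv/video/raw/b.mov": 20_000_000,
    "/srv/docs/readme.txt": 2_000,  # below the threshold, ignored
}
totals = directory_totals(files, skip_size=1_000_000)
for directory, size in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{size:>9} {directory}")
```

Because `readme.txt` is skipped, `/srv/docs` does not appear at all and `/srv` is under-reported by 2,000 bytes. This is the same effect the paragraph above describes for real directories.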
Sites or individual users with more control over their backup data can
use the -i option to start an interactive trim session. This
calculates space usage as described in the Trim Reporting section,
then allows entering single-keystroke commands to manipulate the
largest files in the backup.
With interactive trim, the list of files and directories is easily navigated and files and directories can be marked with several states:
k = keep file
K = keep everything in directory
d = delete from backup
D = delete from backup and the live filesystem
x = delete from backup, add to inex.conf so file isn’t saved again
X = delete from backup and the live filesystem, then add to inex.conf
Files and directories marked "keep" have this recorded in the backup database so it is remembered the next time interactive trim is used. This can be overridden of course.
To make trimming faster, an autoskip feature can be enabled and
disabled with the a keystroke. This will show and skip files
already marked as keep or delete.
After finishing the interactive trim session with the f key, trim
will ask for confirmation to delete items, then ask again before
committing deletes to the backup database. After the second
confirmation, deletes are permanent.
First a test directory is created with a few files.
# file1 is a 10MB file of random data (use bs=10M, capital M, on Linux)
$ dd if=/dev/random of=test/file1 bs=10m count=1
1+0 records in
1+0 records out
10485760 bytes transferred in 0.320128 secs (32754902 bytes/sec)

# file2 is a 5MB file of random data
$ dd if=/dev/random of=test/file2 bs=5m count=1
1+0 records in
1+0 records out
5242880 bytes transferred in 0.157587 secs (33269789 bytes/sec)

# file12 is file1 + file2
$ cat test/file1 test/file2 >test/file12

# file1copy is a copy of file1
$ cp test/file1 test/file1copy
Let’s see the test directory:
$ ls -l test
total 81920
-rw-r--r-- 1 jim staff 10485760 Apr 2 17:20 file1
-rw-r--r-- 1 jim staff 15728640 Apr 2 17:21 file12
-rw-r--r-- 1 jim staff 10485760 Apr 2 17:21 file1copy
-rw-r--r-- 1 jim staff 5242880 Apr 2 17:20 file2
Now create a backup of the test directory:
$ hb init -c hb
HashBackup #2876 Copyright 2009-2022 HashBackup, LLC
Backup directory: /hb
Permissions set for owner access only
Created key file /hb/key.conf
Key file set to read-only
Setting include/exclude defaults: /hb/inex.conf

VERY IMPORTANT: your backup is encrypted and can only be accessed with
the encryption key, stored in the file:
/hb/key.conf
You MUST make copies of this file and store them in secure locations,
separate from your computer and backup data. If your hard drive fails,
you will need this key to restore your files. If you have setup remote
destinations in dest.conf, that file should be copied too.

Backup directory initialized

$ hb backup -c hb test
HashBackup #2876 Copyright 2009-2022 HashBackup, LLC
Backup directory: /hb
Backup start: 2022-04-02 17:21:43
Copied HB program to /hb/hb#2876
This is backup version: 0
Dedup enabled, 0% of current size, 0% of max size
/
/hb
/hb/inex.conf
/test
/test/file1
/test/file12
/test/file1copy
/test/file2

Time: 0.9s
CPU: 0.5s, 49%
Mem: 78 MB
Checked: 8 paths, 41943562 bytes, 41 MB
Saved: 8 paths, 41943562 bytes, 41 MB
Excluded: 0
Dupbytes: 25990241, 25 MB, 61%
Compression: 61%, 2.6:1
Efficiency: 53.33 MB reduced/cpusec
Space: +15 MB, 16 MB total
New files using the most space:
10 MB /test/file1
5.3 MB /test/file12
No errors
This shows that even though we have 40MB of data in the test
directory, only 15MB is using backup space. Here’s how
trim shows this in a report:
$ hb trim -c hb
HashBackup #2882 Copyright 2009-2022 HashBackup, LLC
Backup directory: /hb
Most recent backup version: 0
Dedup loaded, 0% of current size
Skipping files below average or 1MB
Versions:
15 MB 0
Scanning files I
5 files, average file size 8.3 MB, total 41 MB
3 files above 8.3 MB total 36 MB
Scanning files II
0 files total 0 bytes unique bytes
$
Trim has figured out that these files are all sharing data, so removing any one of them will not reduce backup space.
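The same result can be checked by simulating deletions on an idealized model of the test directory. The layout below is an assumption made for illustration: each random file is treated as a single incompressible block (`r1` for file1's data, `r2` for file2's data), whereas a real backup splits files into many blocks. The helper names are hypothetical.

```python
def stored_bytes(paths, block_sizes):
    """Backup space used by the distinct blocks that the paths reference."""
    used = set()
    for blocks in paths.values():
        used.update(blocks)
    return sum(block_sizes[b] for b in used)


def freed_by_deleting(paths, block_sizes, doomed):
    """Backup space released if every pathname in `doomed` is removed."""
    remaining = {p: b for p, b in paths.items() if p not in doomed}
    return stored_bytes(paths, block_sizes) - stored_bytes(remaining, block_sizes)


# Idealized layout (an assumption): one block per random file.
block_sizes = {"r1": 10_485_760, "r2": 5_242_880}
paths = {
    "/test/file1": ["r1"],
    "/test/file2": ["r2"],
    "/test/file12": ["r1", "r2"],
    "/test/file1copy": ["r1"],
}

print("stored:", stored_bytes(paths, block_sizes))
for path in paths:
    print(path, "frees", freed_by_deleting(paths, block_sizes, {path}))

# Space is only released when every user of a block is deleted together.
group = {"/test/file1", "/test/file1copy", "/test/file12"}
print("group frees", freed_by_deleting(paths, block_sizes, group))
```

In this model about 15 MB is stored, no single deletion frees anything, and the 10 MB block is released only when file1, file1copy, and file12 are all removed.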
The next example is a backup of /Applications on Mac OSX.
$ hb backup -c hb /Applications -v1
HashBackup #2882 Copyright 2009-2022 HashBackup, LLC
Backup directory: /hb
Backup start: 2022-04-03 21:14:06
Copied HB program to /hb/hb#2882
This is backup version: 0
Dedup enabled, 0% of current size, 0% of max size
Backing up: /Applications
Backing up: /hb/inex.conf00 checked 1.7 GB 1325/sec

Time: 51.7s
CPU: 59.0s, 114%
Mem: 107 MB
Checked: 68014 paths, 1939557240 bytes, 1.9 GB
Saved: 68014 paths, 1939555008 bytes, 1.9 GB
Excluded: 0
Dupbytes: 197630484, 197 MB, 10%
Compression: 53%, 2.2:1
Efficiency: 16.77 MB reduced/cpusec
Space: +901 MB, 901 MB total
New files using the most space:
101 MB /Applications/Firefox.app/Contents/MacOS/XUL
40 MB /Applications/Install Pianoteq 7.app/Contents/Resources/Install Pianoteq 7.pkg.lzma
24 MB /Applications/iTunes.app/Contents/Resources/Assets.car
18 MB /Applications/Pianoteq 7/Pianoteq 7.app/Contents/Resources/presources.dat
14 MB /Applications/iTunes.app/Contents/MacOS/iTunes
12 MB /Applications/Firefox.app/Contents/Resources/browser/omni.ja
11 MB /Applications/PDFScanner.app/Contents/Frameworks/libopencv_imgproc.dylib
9.7 MB /Applications/Firefox.app/Contents/Resources/omni.ja
9.2 MB /Applications/Pianoteq 7/Pianoteq 7.app/Contents/MacOS/Pianoteq 7
9.1 MB /Applications/PDFScanner.app/Contents/Resources/tesseract/fin.traineddata
No errors
$
The trim report is:
HashBackup #2882 Copyright 2009-2022 HashBackup, LLC
Backup directory: /hb
Most recent backup version: 0
Dedup loaded, 13% of current size
Skipping files below average or 1MB
Versions:
900 MB 0
Scanning files I
60773 files, average file size 31 KB, total 1.9 GB
243 files above 1.0 MB total 1.2 GB
Scanning files II
71 files total 260 MB unique bytes
Files
43 MB /Applications/Firefox.app/Contents/MacOS/XUL
40 MB /Applications/Install Pianoteq 7.app/Contents/Resources/Install Pianoteq 7.pkg.lzma
21 MB /Applications/iTunes.app/Contents/Resources/Assets.car
15 MB /Applications/Pianoteq 7/Pianoteq 7.app/Contents/Resources/presources.dat
6.4 MB /Applications/iTunes.app/Contents/MacOS/iTunes
6.1 MB /Applications/Firefox.app/Contents/Resources/browser/omni.ja
5.2 MB /Applications/Safari.app/Contents/Resources/Background Images/Safari-Background_Emoji_Safari-Animals.699.png
4.8 MB /Applications/Firefox.app/Contents/Resources/omni.ja
4.5 MB /Applications/PDFScanner.app/Contents/Frameworks/libopencv_imgproc.dylib
4.3 MB /Applications/Safari.app/Contents/Resources/Background Images/Safari-Background_Emoji_Vacation.609.png
4.1 MB /Applications/PDFScanner.app/Contents/Resources/tesseract/fin.traineddata
3.9 MB /Applications/PDFScanner.app/Contents/Frameworks/libopencv_core.dylib
...
1.0 MB /Applications/Books.app/Contents/PlugIns/iBAReaderKit.bundle/Contents/MacOS/iBAReaderKit
1.0 MB /Applications/Safari.app/Contents/Resources/Background Images/Safari-Background_California-Dogface-Butterfly.661.png
Directories
260 MB /Applications
56 MB /Applications/Firefox.app
56 MB /Applications/Firefox.app/Contents
45 MB /Applications/Firefox.app/Contents/MacOS
40 MB /Applications/Install Pianoteq 7.app
40 MB /Applications/Install Pianoteq 7.app/Contents
40 MB /Applications/Install Pianoteq 7.app/Contents/Resources
...
This report (usually much longer!) can be analyzed offline or given to
storage users to figure out which files can be easily removed to save
backup space. The -p option may be useful for reports because it sorts
by pathname, grouping files alphabetically into the normal directory
structure.
Here’s a short example of interactive trimming. The keystroke pressed at each prompt has been added in parentheses to help show what’s happening.
$ hb trim -c /hb -i
HashBackup #2883 Copyright 2009-2022 HashBackup, LLC
Backup directory: /hb
Most recent backup version: 0
Dedup loaded, 13% of current size
Skipping files below average or 1MB
Versions:
899 MB 0
Scanning files I
60773 files, average file size 31 KB, total 1.9 GB
243 files above 1.0 MB total 1.2 GB
Scanning files II
116 files total 498 MB unique bytes
----------------------------------------------------------------
Navigation
  n = no change, next (Enter)
  b = no change, back (Backspace)
  < = up a directory
  > = down a directory
  123 Enter = goto line 123
  a = autoskip toggle
  Space = next or prev 25
Changes
  k = keep file
  K = keep everything in directory
  d = delete backup
  D = delete backup & live file
  x = delete backup, exclude
  X = delete backup & live, exclude
  - = remove keep or delete
  / = toggle directory slash
Other
  l = list directory
  s = show deletes
  h = help
  f = finished
  q, ctrl-c, ctrl-d = quit without doing any deletes.
IMPORTANT: nothing is changed without confirmation.
----------------------------------------------------------------
Files
1. 101 MB /Applications/Firefox.app/Contents/MacOS/XUL (n) next
2. 40 MB /Applications/Install Pianoteq 7.app/Contents/Resources/Install Pianoteq 7.pkg.lzma (n) next
3. 25 MB /Applications/iTunes.app/Contents/Resources/Assets.car (b) back
2. 40 MB /Applications/Install Pianoteq 7.app/Contents/Resources/Install Pianoteq 7.pkg.lzma (<) up
Directories
CAREFUL! Deleting live directories will delete all contents, including files not in the backup.
135. 40 MB /Applications/Install Pianoteq 7.app/Contents/Resources (<) up
134. 40 MB /Applications/Install Pianoteq 7.app/Contents (<) up
133. 40 MB /Applications/Install Pianoteq 7.app (d) delete backup
Files
2. 40 MB /Applications/Install Pianoteq 7.app/Contents/Resources/Install Pianoteq 7.pkg.lzma
   parent Install Pianoteq 7.app deleted (n) next
3. 25 MB /Applications/iTunes.app/Contents/Resources/Assets.car (n) next
4. 18 MB /Applications/Pianoteq 7/Pianoteq 7.app/Contents/Resources/presources.dat (f) finish
Directories
117. 498 MB /Applications (n) next
118. 126 MB /Applications/Firefox.app (n) next
119. 126 MB /Applications/Firefox.app/Contents (n) next
120. 104 MB /Applications/Firefox.app/Contents/MacOS (n) next
121. 71 MB /Applications/Kofax Power PDF for Mac.app (d) delete backup
122. 71 MB /Applications/Kofax Power PDF for Mac.app/Contents
   parent Kofax Power PDF for Mac.app deleted (f) finish
Files to be deleted:
121. 71 MB /Applications/Kofax Power PDF for Mac.app delete backup
133. 40 MB /Applications/Install Pianoteq 7.app delete backup
-> 111 MB backup space to delete
Delete these? y
Remove backup: /Applications/Kofax Power PDF for Mac.app
Remove backup: /Applications/Install Pianoteq 7.app
Commit deletes? y
Packing archives
Packing arc.0.2 into arc.0.9
Packing arc.0.3 into arc.0.9
Packing arc.0.4 into arc.0.9
Mem: 75 MB
Removed: 289 MB, 1668 files, 2 arc files
Space: -167 MB, 748 MB total
$
The backup space removed is higher than reported because
trim only considers large files for interactive trimming. When a
directory is removed, all of its files are removed, including the small
files that trim did not count.
HashBackup manages backup space by periodically packing arc files to
remove empty space. Depending on how the pack- config keywords are
set up, backup space may not be removed immediately following a trim,
but will be removed the next time arc files are packed.