NAME
hashl - Create database with partial file hashes, check if other files are in it
SYNOPSIS
hashl [-d dbfile] [-s read-size] action [args]
VERSION
This manual documents hashl version 0.2
DESCRIPTION
Actions:
- copy newdir
-
Copy all files in the current directory which are not in the database to newdir.
- find-known [directory]
-
List all files which are already in the database. Scans either the current directory or directory.
- find-new [directory]
-
List all files which are not in the database. Scans either the current directory or directory.
- ignore [directory]
-
Add all files in directory (or the current directory) as "ignored" to the database. This means that hashl will save the file's hash and skip matching files for copy or find-new.
- info [file]
-
Show information on file (or the database, if file is not specified).
- list
-
List all files and their hashes. The list format is "hash size file".
- list-files
-
List all filenames, one file per line.
- list-ignored
-
List ignored hashes.
- update
-
Update or create hash database. Iterates over all files below the current directory.
OPTIONS
- -d|--database dbfile
-
Use dbfile instead of .hashl.db
- -f|--force
-
For use with hashl add: if there are ignored files in the directory, unignore and add them.
- -n|--no-progress
-
Do not show progress information. Most useful with hashl find-new.
- -s|--read-size kibibytes
-
Change the size of the part of each file that is hashed. By default, hashl hashes the first 4 MiB. Note that this option only makes sense when using hashl update to create a new database.
- -V|--version
-
Print version information.
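To make the effect of -s more concrete, the following shell function approximates a partial file hash by hand. It is only a sketch: the function name partial_hash is made up for this example, and SHA-1 over the raw leading bytes is an assumption (Digest::SHA is a dependency, but this manual does not say which SHA variant hashl uses or whether anything else goes into the digest). It relies on head -c and sha1sum as found in GNU coreutils.

```shell
#!/bin/sh
# partial_hash FILE [KIB]
# Print a SHA-1 digest of the first KIB kibibytes of FILE.
# KIB defaults to 4096 (4 MiB), mirroring the default read size described above.
# Assumption: plain SHA-1 of the leading bytes; hashl's real digest may differ.
partial_hash() {
    file=$1
    kib=${2:-4096}
    # head -c stops after the requested number of bytes (or at end of file).
    head -c "$((kib * 1024))" -- "$file" | sha1sum | cut -d' ' -f1
}
```

Two files that share their first KIB kibibytes get the same digest here, no matter how they differ afterwards. That is why a smaller read size is faster but more likely to report a false match.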
EXIT STATUS
Unless an error occurred, hashl always returns zero.
CONFIGURATION
None, so far
DEPENDENCIES
Digest::SHA
Time::Progress
BUGS AND LIMITATIONS
Unknown. This is beta software.
EXAMPLES
LEECHING
First, create a database of your local files:
cd /media/videos; hashl update
Now, assume you have a (possibly slow) external share mounted at /tmp/mnt/ext. You do not want to copy all files to your disk and then use fdupes or similar to weed out the duplicates. Since you just used hashl to create a database with the hashes of the first 4 MiB of all your files, you can now use it to check whether you (very probably) already have any given remote file. For that, you only need to leech the first 4 MiB of every file on the share, not the whole file. For example:
cd /tmp/mnt/ext; hashl copy /media/videos/incoming
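The idea behind this workflow can be sketched without hashl. In the sketch below a sorted text file of digests stands in for the database, and the function names (partial_hash, build_index, copy_new) are invented for illustration. It assumes SHA-1 of the first 4 MiB, which may not match what hashl actually stores. Unlike hashl, it also flattens the directory structure and does not cope with newlines in filenames.

```shell
#!/bin/sh
# Rough stand-in for "hashl update" followed by "hashl copy".
# Assumption: SHA-1 of the first 4 MiB identifies a file well enough.

# partial_hash FILE: digest of the first 4 MiB of FILE.
partial_hash() {
    head -c "$((4096 * 1024))" -- "$1" | sha1sum | cut -d' ' -f1
}

# build_index DIR INDEX: write one partial hash per regular file below DIR.
# INDEX should live outside DIR so that it is not hashed itself.
build_index() {
    find "$1" -type f | while IFS= read -r f; do
        partial_hash "$f"
    done | sort -u > "$2"
}

# copy_new SRC INDEX DEST: copy files below SRC whose hash is not in INDEX.
# Only the first 4 MiB of a known file is read; unknown files are read in full by cp.
copy_new() {
    mkdir -p "$3"
    find "$1" -type f | while IFS= read -r f; do
        h=$(partial_hash "$f")
        if ! grep -qxF -- "$h" "$2"; then
            cp -- "$f" "$3/"
        fi
    done
}

# Hypothetical usage, mirroring the example above:
#   build_index /media/videos /tmp/videos.idx
#   copy_new /tmp/mnt/ext /tmp/videos.idx /media/videos/incoming
```

The saving comes from the lookup step: for every remote file that is already known, only its first 4 MiB ever crosses the slow link.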
EXTERNAL HARD DISK
Personally, I have all my videos on an external hard disk, which I usually do not carry with me. So, when I get new videos, I put them into ~/lib/videos on my netbook, and then later copy them to the external disk. Of course, it can always happen that I get a movie I already have, or forget to move something from ~/lib/videos to the external disk, especially since I also always have some stuff from the disk in ~/lib/videos.
However, I can use hashl to conveniently solve this issue. Run periodically:
cd /media/argon; hashl -d ~/lib/video/.argon update
Now, I always have a list of files on the external disk with me. When I get a new file:
hashl -d ~/lib/video/.argon new-file $file
And to find out which files are not on the external disk:
cd ~/lib/video; print -l **/*(.) | hashl -d .argon new-file
AUTHOR
Copyright (C) 2010 by Daniel Friesel <derf@finalrewind.org>
LICENSE
0. You just DO WHAT THE FUCK YOU WANT TO.