NAME
App::UniqFiles - Report or omit duplicate file contents
VERSION
version 0.01
SYNOPSIS
# See uniq-files script
DESCRIPTION
FUNCTIONS
None are exported, but they are exportable.
uniq_files(%args) -> [STATUS_CODE, ERR_MSG, RESULT]
Report or omit duplicate file contents.
Given a list of filenames, will check each file size and content for duplicate content. Interface is a bit like the `uniq` Unix command-line program.
Returns a 3-element arrayref. STATUS_CODE is 200 on success, or an error code between 3xx-5xx (just like in HTTP). ERR_MSG is a string containing error message, RESULT is the actual result.
Arguments (*
denotes required arguments):
files* => array
count => bool (default
0
)Return each file content's number of occurence.
1 means the file content is only encountered once (unique), 2 means there is one duplicate, and so on.
report_duplicate => bool (default
0
)Aliases: d (Alias for --noreport-unique --report-duplicate)
Return duplicate items.
report_unique => bool (default
1
)Aliases: u (Alias for --report-unique --noreport-duplicate)
Return unique items.
TODO
Handle symlinks
Provide options on how to handle symlinks: ignore them? Follow?
Handle special files (socket, pipe, device)
Ignore them.
Check hardlinks/inodes first
For fast checking.
Arguments hash_skip_bytes & hash_bytes
For only checking uniqueness against parts of contents.
Arguments hash_module/hash_method/hash_sub
For doing custom hashing instead of Digest::MD5.
AUTHOR
Steven Haryanto <stevenharyanto@gmail.com>
COPYRIGHT AND LICENSE
This software is copyright (c) 2011 by Steven Haryanto.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.