NAME
File::Info - Store file information persistently for fast lookup
SYNOPSIS
use File::Info qw( $PACKAGE $VERSION );
my $info = File::Info->new($dir);
# $fn is "basename"; contains no directory portion
my $hex = $info->md5_hex($fn); # Reads cached data if possible
DESCRIPTION
This package stores per-file information for speedy lookup later. It is intended for file information that takes significant time to determine (e.g., the MD5 sum of a large file), so that it need not be recalculated unnecessarily. This may be particularly helpful when searching across many files for some specific property.
File statistics are recalculated on demand. If a file's size or modification time has changed since the calculations were last made, the cached values are purged and recalculated.
File information is stored on a per-directory basis. Each file info file is stored in a directory; the files to which it refers are in the same directory, and are referred to by name, without paths.
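The recalculation rule described above can be illustrated with a minimal sketch. This is not the module's own code: cached_lookup and the in-memory %cache are hypothetical names, and the real module stores its data persistently rather than in a hash.

```perl
use strict;
use warnings;

# Hypothetical in-memory cache: path => { size, mtime, value }.
my %cache;

# Return the cached value for $path, recomputing it via $code when the
# file's size or modification time differs from what was recorded.
sub cached_lookup {
    my ($path, $code) = @_;
    my @st = stat $path
        or die "Cannot stat $path: $!";
    my ($size, $mtime) = @st[7, 9];

    my $entry = $cache{$path};
    if ($entry && $entry->{size} == $size && $entry->{mtime} == $mtime) {
        return $entry->{value};    # still fresh
    }

    # Stale or missing: purge and recalculate.
    my $value = $code->($path);
    $cache{$path} = { size => $size, mtime => $mtime, value => $value };
    return $value;
}
```

Comparing only size and modification time keeps the freshness check cheap, at the cost of missing a change that preserves both.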
CLASS CONSTANTS
TYPE_CONSTANTS
As returned by the type method. These constants are exported by request, either individually, or together with the ':types' tag.
- TYPE_UNKNOWN
File type not identified.
- TYPE_JPEG
A 'JPEG' image file.
- TYPE_PAR
A 'par' (parity archive) file.
CLASS COMPONENTS
CLASS HIGHER-LEVEL FUNCTIONS
CLASS HIGHER-LEVEL PROCEDURES
add_global_lookup
Add a lookup function to the class, available to all instances. A method with the same name will be created, to provide the cached lookup.
- ARGUMENTS
- name
The name may consist only of letters, digits, and underscore characters. The first character must be a letter, and at least one digit or lower-case letter must be present.
Built-in names will always be lower-case. If you stick to lower-case, then you will need to make no change if your identifier should get absorbed into the core. On the other hand, if you use some upper-case letters (e.g., StudlyCaps), then you are assured that you will never clash with internal names.
These other names are reserved:
add_local_lookup add_global_lookup isa import new dirname
- code
The code to call to calculate the value. The code will be passed the absolute name of the file to look up, and is expected to return a suitable value. The value will be cached.
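The naming rules above can be expressed as a short check. This is a sketch only: is_valid_lookup_name is a hypothetical helper, and how the module itself validates names (or reports a bad one) is not stated here.

```perl
use strict;
use warnings;

# Names the documentation lists as reserved.
my %RESERVED = map { $_ => 1 }
    qw( add_local_lookup add_global_lookup isa import new dirname );

# Hypothetical helper: true if $name satisfies the documented rules.
sub is_valid_lookup_name {
    my ($name) = @_;
    return 0 unless defined $name;
    # Letters, digits and underscores only; first character a letter.
    return 0 unless $name =~ /\A[A-Za-z][A-Za-z0-9_]*\z/;
    # At least one digit or lower-case letter.
    return 0 unless $name =~ /[a-z0-9]/;
    return 0 if $RESERVED{$name};
    return 1;
}
```

Under these rules a name such as sha1_hex or StudlyCaps would pass, while ALLCAPS, _private and new would not.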
INSTANCE CONSTRUCTION
new
Create and return a new File::Info instance for the given directory.
INSTANCE COMPONENTS
INSTANCE HIGHER-LEVEL FUNCTIONS
dirname
The name of the directory to which this instance refers.
STANDARD LOOKUPS
Each of the following functions takes a filename (without path, relative to the directory of the instance), and returns the relevant value for the file.
Alternatively, they may be called as class methods, in which case the filename value must be absolute. This mode will never invoke a local method (see add_local_lookup), and is less efficient if multiple lookups are made on files in the same directory.
md5_hex
The MD5 signature of the file, as 16 pairs of hex characters. The Digest::MD5 module (version 2 or above) is required to be present.
md5
The MD5 signature of the file, as a 16-byte binary value. The Digest::MD5 module (version 2 or above) is required to be present.
md5_16khex
The MD5 signature of the first 16k of the file, as 16 pairs of hex characters. The Digest::MD5 module (version 2 or above) is required to be present.
md5_16k
The MD5 signature of the first 16k of the file, as a 16-byte binary value. The Digest::MD5 module (version 2 or above) is required to be present.
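For illustration, a partial-file digest of the kind md5_16khex returns might be computed along the following lines. This is a sketch under assumptions: md5_prefix_hex is a hypothetical helper, and "16k" is taken to mean 16384 bytes, which the documentation does not spell out.

```perl
use strict;
use warnings;
use Digest::MD5;

# Hypothetical helper: hex MD5 of at most the first $limit bytes of $path.
sub md5_prefix_hex {
    my ($path, $limit) = @_;
    $limit = 16 * 1024 unless defined $limit;    # assumed meaning of "16k"
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;
    my $buf = '';
    # read() may return fewer bytes than requested, so loop until the
    # limit is reached or the file ends.
    while (length($buf) < $limit) {
        my $n = read($fh, $buf, $limit - length($buf), length($buf));
        die "Read error on $path: $!" unless defined $n;
        last if $n == 0;
    }
    close $fh;
    return Digest::MD5::md5_hex($buf);
}
```

Hashing only a prefix is much cheaper on large files and serves as a quick first filter; two files with matching prefix digests can still differ later on.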
line_count
The number of lines in the file. More accurately, the number of "\n" characters in the file (as for wc). No attempt is made to guess the line terminator of the running system, for that would lead to inconsistent results on the same file on a (say) Samba-mounted drive accessed from both Windoze and UN*X.
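The counting rule above (count "\n" bytes, whatever the platform) can be sketched as follows. count_newlines is a hypothetical helper, not the module's own routine.

```perl
use strict;
use warnings;

# Hypothetical helper: count "\n" bytes in $path, reading in chunks.
sub count_newlines {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;    # no CRLF translation, so results match on every platform
    my ($count, $buf) = (0, '');
    while (read($fh, $buf, 65536)) {
        $count += ($buf =~ tr/\n//);    # tr counts without modifying $buf
    }
    close $fh;
    return $count;
}
```

As with wc, a final line that lacks a trailing "\n" is not counted.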
type
The file type, as determined by reading the file itself. This is similar in intent to the file command under UN*X, with the following distinctions:
- The means of identification is consistent across all systems, rather than relying on a system-specific magic file.
- The type is returned as a constant (which happens to be a simple string), so there is no need to parse the output of file.
- This method only returns the basic type, not any details about versions, bitrates, sizes, etc. This is a feature. Other details may be queried elsewhere with the same module.
- The file database is considerably smaller. Of course, if you submit some additions, it will grow 8*).
The returned value is a TYPE_x constant.
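Identification by content, as described above, usually comes down to inspecting a file's leading bytes. The sketch below is illustrative only: sniff_type is a hypothetical helper, the constant values are stand-ins (the module's actual strings are not given here), and the PAR signatures are an assumption about the PAR 1.0 and PAR 2.0 formats rather than the module's own detection rules.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the module's TYPE_x constants.
use constant {
    TYPE_UNKNOWN => 'unknown',
    TYPE_JPEG    => 'jpeg',
    TYPE_PAR     => 'par',
};

# Hypothetical helper: classify a file by its first few bytes.
sub sniff_type {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;
    my $head = '';
    defined read($fh, $head, 8) or die "Read error on $path: $!";
    close $fh;
    # JPEG files begin with the start-of-image marker FF D8, then FF.
    return TYPE_JPEG if $head =~ /\A\xFF\xD8\xFF/;
    # Assumed signatures: "PAR" plus five NULs, or "PAR2\0PKT".
    return TYPE_PAR if $head =~ /\APAR(?:\0{5}|2\0PKT)/;
    return TYPE_UNKNOWN;
}
```

Because the signatures live in the code rather than in a system magic file, the answer is the same on every platform.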
par_set_hash
Behaviour is defined only for files whose type is TYPE_PAR.
This is the hash used to identify par files that belong to a single set. It is a 16-byte binary value.
par_set_hash_hex
Behaviour is defined only for files whose type is TYPE_PAR.
As for par_set_hash, but as 16 pairs of hex characters representing the 16 bytes.
INSTANCE HIGHER-LEVEL PROCEDURES
add_local_lookup
Add a lookup function to this instance only. A method with the same name will be created, to provide the cached lookup.
This method will only work on this instance. Any other instances with their own local methods will be respected. The local method will override any global method of the same name. However, using the class interface (e.g., File::Info->local($absname)) will always invoke the global method, if any (and fail, if not).
- ARGUMENTS
- name
The name may consist only of letters, digits, and underscore characters. The first character must be a letter, and at least one digit or lower-case letter must be present.
Built-in names will always be lower-case. If you stick to lower-case, then you will need to make no change if your identifier should get absorbed into the core. On the other hand, if you use some upper-case letters (e.g., StudlyCaps), then you are assured that you will never clash with internal names.
These other names are reserved:
add_local_lookup add_global_lookup isa import new dirname
- code
The code to call to calculate the value. The code will be passed the absolute name of the file to look up, and is expected to return a suitable value. The value will be cached.
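The precedence between local and global lookups described above can be modelled with a small stand-alone class. My::LookupSketch and its methods are hypothetical and do not reproduce the File::Info interface; caching is left out to keep the dispatch rule visible.

```perl
use strict;
use warnings;

package My::LookupSketch;    # hypothetical class, not File::Info itself

my %GLOBAL;                  # name => code, shared by all instances

sub new { my ($class) = @_; return bless { local => {} }, $class }

sub add_global { my ($class, $name, $code) = @_; $GLOBAL{$name} = $code }
sub add_local  { my ($self,  $name, $code) = @_; $self->{local}{$name} = $code }

# Instance calls prefer the instance's local table; class calls see
# only the global table, and die if the name is unknown there.
sub lookup {
    my ($invocant, $name, @args) = @_;
    my $code = (ref $invocant && $invocant->{local}{$name}) || $GLOBAL{$name}
        or die "No lookup named '$name'\n";
    return $code->(@args);
}

package main;
```

With this arrangement, a local lookup added to one instance is invisible both to other instances and to calls made through the class.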
EXAMPLES
BUGS
REPORTING BUGS
Email the author.
AUTHOR
Martyn J. Pearce fluffy@cpan.org
COPYRIGHT
Copyright (c) 2002, 2003 Martyn J. Pearce. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
SEE ALSO