NAME

File::CachingFind - find files within cached search paths (e.g. include files)

SYNOPSIS

    use File::CachingFind;

    $includes = File::CachingFind->new(Path => ['/usr/local/include',
						'/usr/include']);
    $stdio = $includes->findFirstInPath('stdio.h');

DESCRIPTION

File::CachingFind is useful for repeated file searches within a path of directories. It caches the contents of the directories it searches and supports two kinds of fuzzy search: a normalize function and regular expressions. See METHODS for details.

METHODS

new - create a new File::CachingFind object
    $obj = File::CachingFind->new(Path =>
				      $reference_to_list_of_directories,
				  Normalize => $reference_to_function,
				  Filter => $regular_expression,
				  NoSoftlinks => $true_or_false);

Example:

    $win32_includes =
	File::CachingFind->new
	(Path =>
	     ['.!', '/cygdrive/C/Programme/DevStudio/VC/include'],
	 Normalize => sub{lc $_[0]},
	 Filter => '\.h$');

This is the constructor for a cache of the filenames in one or more directories. It has one mandatory and three optional parameters. The cache it builds is a hash keyed by the normalized filename with all directory parts removed. Each key can point to one or more real, full filenames.
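The cache layout described above can be sketched with core modules only. This is a minimal illustration, not the module's actual code: build_cache is a hypothetical helper, and the temporary directory tree exists only to make the example runnable.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Find qw(find);
use File::Path qw(make_path);
use File::Temp qw(tempdir);

# Hypothetical helper, not part of File::CachingFind: build a hash that
# maps each normalized basename to the list of full filenames sharing it.
sub build_cache {
    my ($normalize, @dirs) = @_;
    my %cache;
    find({
        no_chdir => 1,
        wanted   => sub {
            return unless -f $File::Find::name;    # skip directories
            my $key = $normalize->(basename($File::Find::name));
            return if $key eq '';                  # empty key: ignore file
            push @{ $cache{$key} }, $File::Find::name;
        },
    }, @dirs);
    return \%cache;
}

# Throwaway directory tree with two files whose names differ only in case.
my $root = tempdir(CLEANUP => 1);
make_path("$root/sys");
for my $file ("$root/Time.H", "$root/sys/time.h") {
    open my $fh, '>', $file or die "cannot create $file: $!";
    close $fh;
}

my $cache = build_cache(sub { lc $_[0] }, $root);
my @hits  = sort @{ $cache->{'time.h'} || [] };
print "$_\n" for @hits;    # both files share the key 'time.h'
```

Because the key is the normalized basename, one key can collect files from several directories, which is why the find methods may return more than one full filename.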

Path

is the mandatory parameter. It must contain a reference to a list of directories. Both relative and absolute paths are possible. Normally the directory itself and all its subdirectories are cached. If the directory name ends with an exclamation mark, the subdirectories are ignored.
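As a rough sketch of the exclamation mark convention (parse_path_entry is a hypothetical helper, not the module's own code), a Path entry can be split into a directory and a recursion flag like this:

```perl
use strict;
use warnings;

# Hypothetical helper: split a Path entry into the directory name and a
# flag telling whether its subdirectories should be searched as well.
sub parse_path_entry {
    my ($entry) = @_;
    my $recurse = 1;
    $recurse = 0 if $entry =~ s/!$//;    # a trailing "!" disables recursion
    return ($entry, $recurse);
}

for my $entry ('/usr/include', '/usr/include/sys!') {
    my ($dir, $recurse) = parse_path_entry($entry);
    printf "%-20s %s\n", $dir,
        $recurse ? 'with subdirectories' : 'this directory only';
}
```

A scheme like this cannot tell a marker from a real directory whose name ends in "!", which matches the limitation listed under KNOWN BUGS.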

Normalize

is an optional code reference. The referenced function must take exactly one string parameter (the filename without its directory parts) and return that string in normalized form. If the result is not the empty string, it is used as the key for the cache; otherwise the file is ignored. If no code reference is given, the unmodified filename is used as the key.
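A normalize function following this contract might look like the sketch below. The rule for backup files is an assumption made up for illustration; it only shows how returning the empty string excludes a file.

```perl
use strict;
use warnings;

# Hypothetical normalize function: keys become case-insensitive, and
# editor backup files (names ending in "~") are dropped by returning ''.
my $normalize = sub {
    my ($name) = @_;
    return '' if $name =~ /~$/;
    return lc $name;
};

print $normalize->('StdIO.H'), "\n";    # prints "stdio.h"
```

Note that lc takes a single scalar argument, so the function should use lc $_[0] (or a named variable as above). Writing lc @_ evaluates @_ in scalar context and lowercases the argument count instead of the filename.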

Filter

is an optional regular expression used for caching only certain files of the directories (those matching the regular expression). If no filter is given, every file is cached.

NoSoftlinks

is an optional flag that inhibits the caching of softlinks. Normally the names of softlinks are cached along with the names of ordinary files. Set the flag to true if this is not wanted.

findInPath - locate all files with a given (normalized) name
    @list = $obj->findInPath($a_file_name);

Example:

    @time_h = $includes->findInPath('time.h');

This method returns the full filenames (including the directory parts) of all files in the object's cache that have the same normalized filename as the parameter passed to this method. The parameter itself is normalized as well before the comparison.

On a standard Unix system the list in the example above should contain at least /usr/include/time.h and /usr/include/sys/time.h, provided $includes is similar to the one defined at the very beginning of this documentation.

If no file is found, an empty list is returned.

findFirstInPath - locate first file with a given (normalized) name
    $file = $obj->findFirstInPath($a_file_name);

Example:

    $includes2 =
	File::CachingFind->new(Path => ['/usr/include!',
					'/usr/include/sys!']);
    $time_h = $includes2->findFirstInPath('time.h');

This method returns the first full filename (including the directory parts) among the matching files in the object's cache. The search works like the one in findInPath. The cache is searched in the order of the paths given to the constructor (new).

On a standard Unix system the example above returns /usr/include/time.h. A call to $includes->findFirstInPath('time.h') (see findInPath) would return either /usr/include/time.h or /usr/include/sys/time.h, because both files were found via the same path entry and the choice between them is nondeterministic.

If no file is found, undef is returned.

findBestInPath - locate best file with a given (normalized) name
    $file = $obj->findBestInPath($a_file_name,
				 $reference_to_comparison_function);

Example:

    $time_h =
	$includes2->findBestInPath
	    ('time.h',
	     sub{ length($_[1]) <=> length($_[0]) });

This method returns the best full filename (including the directory parts) among the matching files in the object's cache. The search works like the one in findInPath. All files found are compared using the given comparison function, which is similar to the comparison functions given to sort except that it receives its two operands as real parameters ($_[0] and $_[1]) instead of $a and $b. If more than one file remains, the order of the paths given to the constructor (new) is considered as well (as in findFirstInPath).

On a standard Unix system the example above returns /usr/include/sys/time.h, as it has a longer full filename than /usr/include/time.h.

If no file is found, undef is returned.
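The selection rule can be illustrated with a small stand-alone sketch. pick_best is a hypothetical helper, not the module's implementation; it assumes the candidates arrive in search-path order so that ties go to the earlier entry.

```perl
use strict;
use warnings;

# Hypothetical helper: return the candidate that the comparison function
# ranks first. The function is called with two real parameters, as the
# documentation describes, and follows sort's sign convention.
sub pick_best {
    my ($compare, @candidates) = @_;
    return undef unless @candidates;
    my $best = shift @candidates;
    for my $file (@candidates) {
        # A positive result means $file sorts before the current $best.
        $best = $file if $compare->($best, $file) > 0;
    }
    return $best;
}

my @found    = ('/usr/include/time.h', '/usr/include/sys/time.h');
my $longest  = pick_best(sub { length($_[1]) <=> length($_[0]) }, @found);
my $shortest = pick_best(sub { length($_[0]) <=> length($_[1]) }, @found);
print "$longest\n";     # /usr/include/sys/time.h
print "$shortest\n";    # /usr/include/time.h
```

Swapping the operands inside the comparison function reverses the ranking, just as it does with sort.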

findMatch - locate all files matching a regular expression
    @list = $obj->findMatch($regular_expression);

Example:

    @std_h = $includes2->findMatch('^(?i:std)');

This method returns the full filenames (including the directory parts) of all files in the object's cache that match the given regular expression. Note that the regular expression is not normalized; you have to make sure that it matches the normalized filenames.

On a standard Unix system the list in the example above should contain at least /usr/include/stdio.h and /usr/include/stdlib.h, provided $includes2 is similar to the one used in the prior examples. Your mileage may vary, especially on different systems. Note that the example uses a case-insensitive match.

If no file is found, an empty list is returned.
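The sketch below shows why the regular expression must be written against the normalized names. The hash is a hand-made stand-in for a cache whose keys were lowercased by a Normalize function, and match_cache is a hypothetical helper rather than the module's code.

```perl
use strict;
use warnings;

# Hypothetical cache in the layout described for the constructor:
# normalized name => list of full filenames. Keys were lowercased.
my %cache = (
    'stdio.h'  => ['/usr/include/stdio.h'],
    'stdlib.h' => ['/usr/include/stdlib.h'],
    'time.h'   => ['/usr/include/time.h', '/usr/include/sys/time.h'],
);

# Hypothetical helper: collect the full filenames of every key that
# matches the regular expression.
sub match_cache {
    my ($cache, $regex) = @_;
    return map { @{ $cache->{$_} } } grep { /$regex/ } sort keys %$cache;
}

my @std   = match_cache(\%cache, '^std');    # two hits
my @upper = match_cache(\%cache, '^STD');    # no hits: keys are lowercase
print scalar(@std), " ", scalar(@upper), "\n";    # prints "2 0"
```

A pattern such as '^(?i:std)' sidesteps the problem by matching regardless of how the keys were normalized for case.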

findFirstMatch - locate first file matching a regular expression
    $file = $obj->findFirstMatch($regular_expression);

Example:

    $std_h = $includes2->findFirstMatch('^std');

This method returns the first full filename (including the directory parts) among the files in the object's cache that match the given regular expression. It works like findFirstInPath and searches the cache in the order of the paths given to the constructor (new). It may be of limited use, because the choice between several matching files found via the same path entry is nondeterministic. findBestMatch is the better choice in most circumstances, though it is usually a bit slower.

On a standard Unix system the example above returns /usr/include/stdio.h, /usr/include/stdlib.h or another matching file (nondeterministic).

If no file is found, undef is returned.

findBestMatch - locate best file matching a regular expression
    $file = $obj->findBestMatch($regular_expression,
				$reference_to_comparison_function);

Example:

    $std_h =
	$includes2->findBestMatch
	    ('^std',
	     sub{ length($_[0]) <=> length($_[1]) });

This method returns the best full filename (including the directory parts) among the files in the object's cache that match the given regular expression. As in findBestInPath, all files found are compared using the given comparison function, followed by the order of the paths given to the constructor (new).

On a standard Unix system the example above returns /usr/include/stdio.h unless there is another include file beginning with /usr/include/std that has an even shorter name.

If no file is found, undef is returned.

KNOWN BUGS

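Directories whose real names end with an exclamation mark can't be handled yet, because a trailing exclamation mark is always interpreted as the marker for ignoring subdirectories.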

Softlinks creating a cyclic directory structure will cause an infinite loop.

If the same file is found more than once using different paths in the constructor (new), it will be cached more than once! This is considered a feature, not a bug.

AUTHOR

Thomas Dorner <Thomas.Dorner@start.de> or <Thomas.Dorner@gmx.de>

SEE ALSO

perl(1).