NAME
File::Finder - nice wrapper for File::Find ala find(1)
SYNOPSIS
use File::Finder;
## simulate "-type f"
my $all_files = File::Finder->type('f');
## any rule can be extended:
my $all_files_printer = $all_files->print;
## traditional use: generating "wanted" subroutines:
use File::Find;
find($all_files_printer, @starting_points);
## or, we can gather up the results immediately:
my @results = $all_files->in(@starting_points);
## -depth and -follow are noted, but need a bit of help for find:
my $deep_dirs = File::Finder->depth->type('d')->ls->exec('rmdir','{}');
find($deep_dirs->as_options, @places);
DESCRIPTION
File::Find is great, but constructing the wanted routine can sometimes be a pain. This module provides a wanted-writer, using syntax that is directly mappable to the find command's syntax.
Also, I find myself (heh) frequently just wanting the list of names that match. With File::Find, I have to write a little accumulator, and then access that from a closure. But with File::Finder, I can turn the problem inside out.
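For comparison, the "little accumulator" version with plain File::Find might look something like this sketch. The temporary tree and the @found name are made up for the illustration:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Build a small throwaway tree so the example is self-contained.
my $dir = tempdir(CLEANUP => 1);
mkdir "$dir/sub" or die "mkdir: $!";
for my $path ("$dir/a.txt", "$dir/sub/b.txt") {
    open my $fh, '>', $path or die "open $path: $!";
    close $fh;
}

# The hand-written accumulator: @found lives outside the wanted
# routine, and the closure pushes onto it for every plain file.
my @found;
find(sub { push @found, $File::Find::name if -f $_ }, $dir);

print "$_\n" for sort @found;
```

With File::Finder, the same list should come from File::Finder->type('f')->in($dir), with no accumulator in sight.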
A File::Finder object contains a hash of File::Find options, and a series of steps that mimic find's predicates. Initially, a File::Finder object has no steps. Each step method clones the previous object's options and steps, and then adds the new step, returning the new object. In this manner, an object can be grown, step by step, by chaining method calls. Furthermore, a partial sequence can be created and held, and used as the head of many different sequences.
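The clone-then-extend idea can be sketched with a toy class. ToyChain and its step method are invented for this illustration and are not File::Finder's real internals; the point is only that adding a step never changes the object it was called on:

```perl
use strict;
use warnings;

# Hypothetical toy class, not File::Finder's actual implementation.
package ToyChain;

sub new {
    my $class = shift;
    return bless { options => {}, steps => [] }, $class;
}

# Add a step: copy the invocant's options and steps, append the new
# step to the copy, and return the copy.
sub step {
    my ($self, $name) = @_;
    $self = $self->new unless ref $self;   # class-method call acts as constructor
    return bless {
        options => { %{ $self->{options} } },
        steps   => [ @{ $self->{steps} }, $name ],
    }, ref $self;
}

sub steps { return @{ $_[0]{steps} } }

package main;

my $head   = ToyChain->step('type f');     # held as the head of two sequences
my $big    = $head->step('size +50');
my $recent = $head->step('mtime -7');

print join(', ', $_->steps), "\n" for $head, $big, $recent;
```

Because $head is never modified, it can safely be reused as the start of as many different chains as needed.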
For example, a step sequence that finds only files looks like:
my $files = File::Finder->type('f');
Here, type is acting as a class method and thus a constructor. An instance of File::Finder is returned, containing the one step to verify that only files are selected. We could use this immediately as a File::Find::find wanted routine, although it'd be uninteresting:
use File::Find;
find($files, "/tmp");
Calling a step method on an existing object adds the step, returning the new object:
my $files_print = $files->print;
And now if we use this with find, we get a nice display:
find($files_print, "/tmp");
Of course, we didn't really need that second object: we could have generated it on the fly:
find($files->print, "/tmp");
File::Find supports options to modify behavior, such as depth-first searching. The depth step flags this in the options as well:
my $files_depth_print = $files->depth->print;
However, the File::Finder object needs to be told explicitly to generate an options hash for File::Find::find to pass this information along:
find($files_depth_print->as_options, "/tmp");
A File::Finder
object, like the find command, supports AND, OR, NOT, and parenthesized sub-expressions. AND binds tighter than OR, and is also implied everywhere that it makes sense. Like find, the predicates are computed in a "short-circuit" fashion, so that a false to the left of the (implied) AND keeps the right side from being evaluated, including entire parenthesized subexpressions. Similarly, if the left side of an OR is false, the right side is evaluated, and if the left side of the OR is true, the right side is skipped. Nested parens are handled properly. Parens are indicated with the rather ugly left and right methods:
my $big_or_old_files = $files->left->size("+50")->or->atime("+30")->right;
The parens here correspond directly to the parens in:
find somewhere -type f '(' -size +50 -o -atime +30 ')'
and are needed so that the OR and the implied ANDs have the right nesting.
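Perl's own && and || follow the same precedence and short-circuit rules, which makes for an easy way to see the effect of the parens. In this sketch, test() is a made-up helper that records which predicates actually ran; it has nothing to do with File::Finder itself:

```perl
use strict;
use warnings;

my @trace;                        # records which tests actually ran
sub test {
    my ($label, $result) = @_;
    push @trace, $label;
    return $result;
}

# With parens: type AND (size OR atime).
# The false type test short-circuits the whole group.
@trace = ();
my $grouped = test(type => 0) && ( test(size => 1) || test(atime => 1) );
my @grouped_trace = @trace;

# Without parens: (type AND size) OR atime.
# The atime test still runs, and can make the whole thing true.
@trace = ();
my $ungrouped = test(type => 0) && test(size => 1) || test(atime => 1);
my @ungrouped_trace = @trace;

printf "grouped:   %s (ran: %s)\n", ($grouped   ? 'true' : 'false'), "@grouped_trace";
printf "ungrouped: %s (ran: %s)\n", ($ungrouped ? 'true' : 'false'), "@ungrouped_trace";
```

The ungrouped form accepts an old entry even when it is not a plain file, which is rarely what was meant.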
Besides passing the constructed File::Finder object to File::Find::find directly as a wanted routine or an options hash, you can also call find implicitly, with in. in takes a list of starting points, and returns all filenames that match the criteria.
For example, a list of all names in /tmp can be generated simply with:
my @names = File::Finder->in("/tmp");
For more flexibility, use collect to execute an arbitrary block in a list context, concatenating all the results (similar to map):
my %sizes = File::Finder
->collect(sub { $File::Find::name => -s _ }, "/tmp");
That's all I can think of for now. The rest is in the detailed reference below.
META METHODS
All of these methods can be used as class or instance methods, except new, which is usually not needed and is class only.
- new
-
Not strictly needed, because any instance method called on a class will create a new object anyway.
- as_wanted
-
Returns a subroutine suitable for passing to File::Find::find or File::Find::finddepth as the wanted routine. If the object is used in a place that wants a coderef, this happens automatically through overloading.
- as_options
-
Returns a hashref suitable for passing to File::Find::find or File::Find::finddepth as the options hash. This is necessary if you want the meta-information to carry forward properly.
- in(@starting_points)
-
Calls File::Find::find($self->as_options, @starting_points), gathers the results, and returns them as a list. At the moment, it also returns the count of those items in a scalar context. If that's useful, I'll maintain that.
- collect($coderef, @starting_points)
-
Calls $coderef in a list context for each of the matching items, gathering and concatenating the results, and returning the results as a list.
my $f = File::Finder->type('f');
my %sizes = $f->collect(sub { $File::Find::name, -s _ }, "/tmp");
In fact, in is implemented by calling collect with a coderef of just sub { $File::Find::name }.
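The relationship between the two can be sketched on top of plain File::Find. The helpers my_collect and my_in are hypothetical stand-ins written for this example, not the module's own code, and they match every entry rather than applying any steps:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Hypothetical helper: call $code in list context for every entry and
# concatenate whatever it returns, much like map.
sub my_collect {
    my ($code, @starting_points) = @_;
    my @results;
    find(sub { push @results, $code->() }, @starting_points);
    return @results;
}

# Hypothetical helper: "in" is then just a collect of the full names.
sub my_in { return my_collect(sub { $File::Find::name }, @_) }

# A throwaway directory holding one 10-byte file.
my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/data.bin" or die "open: $!";
print {$fh} 'x' x 10;
close $fh or die "close: $!";

my @names = my_in($dir);
my %sizes = my_collect(
    sub { (-f $_) ? ($File::Find::name => -s _) : () },   # skip non-files
    $dir,
);

print "$_ => $sizes{$_}\n" for sort keys %sizes;
```

Returning an empty list from the coderef drops the entry, and returning a pair builds a hash, which is the same trick map allows.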
STEPS METHODS
These methods are called on a class or instance to add a "step". Each step adds itself to a list of steps, returning the new object. This allows you to chain steps together to form a formula.
As in find, the default operator is "and", and short-circuiting is performed.
- or
-
Like find's or.
- left
-
Like a left parenthesis. Used in nesting pairs with right.
- right
-
Like a right parenthesis. Used in nesting pairs with left. For example:
my $big_or_old = File::Finder
  ->type('f')
  ->left
  ->size("+100")->or->mtime("+90")
  ->right;
find($big_or_old->ls, "/tmp");
You need parens because the "or" operator is lower precedence than the implied "and", for the same reason you need them here:
find /tmp -type f '(' -size +100 -o -mtime +90 ')' -print
Without the parens, the -type would bind to -size, and not to the choice of -size or -mtime.
Mismatched parens will not be found until the formula is used, causing a fatal error.
- not
-
Like find's !. Prefix operator, can be placed in front of individual terms or open parens. Can be nested, but what's the point?
# list all non-files in /tmp
File::Finder->not->type('f')->ls->in("/tmp");
- true
-
Always returns true. Useful when a subexpression might fail, but you don't want the overall code to fail:
... ->left-> ...[might return false]... ->or->true->right-> ...
Of course, this is the find command's idiom of:
find .... '(' .... -o -true ')' ...
- false
-
Always returns false.
- comma
-
Like GNU find's ",". The result of the expression (or subexpression if in parens) up to this point is discarded, and execution continues afresh. Useful when a part of the expression is needed for its side effects, but shouldn't affect the rest of the "and"-ed chain.
# list all files and dirs, but don't descend into CVS dir contents:
File::Finder->type('d')->name('CVS')->prune->comma->ls->in('.');
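Spelled out by hand with plain File::Find, the same "prune for the side effect, then carry on regardless" pattern might look like this sketch. The directory layout is invented for the demonstration:

```perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);

# Throwaway tree: one ordinary directory and one CVS directory.
my $dir = tempdir(CLEANUP => 1);
for my $sub (qw(src CVS)) {
    mkdir "$dir/$sub" or die "mkdir $sub: $!";
}
for my $path ("$dir/src/main.c", "$dir/CVS/Entries") {
    open my $fh, '>', $path or die "open $path: $!";
    close $fh;
}

my @seen;
find(sub {
    # First expression, run only for its side effect: prune CVS dirs.
    $File::Find::prune = 1 if $_ eq 'CVS' && -d $_;
    # The "comma": whatever happened above, record the entry anyway.
    push @seen, $File::Find::name;
}, $dir);

print "$_\n" for sort @seen;
```

The CVS directory itself is still listed; only its contents are skipped. Note that pruning has no effect in a depth-first (bydepth) traversal, since the contents have already been visited by then.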
- follow
-
Enables symlink following, and returns true.
- name(NAME)
-
True if basename matches NAME, which can be given as a glob pattern or a regular expression object:
my @pm_files = File::Finder->name('*.pm')->in('.');
my @pm_files_too = File::Finder->name(qr/\.pm$/)->in('.');
- perm(PERMISSION)
-
Like find's -perm. Leading "-" means "all of these bits". Leading "+" means "any of these bits". Value is de-octalized if a leading 0 is present, which is likely only if it's being passed as a string.
my $files = File::Finder->type('f');
# find files that are exactly mode 644
my $files_644 = $files->perm(0644);
# find files that are at least world executable:
my $files_world_exec = $files->perm("-1");
# find files that have some executable bit set:
my $files_exec = $files->perm("+0111");
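The three matching modes boil down to bit arithmetic. Here is a rough sketch of the rules as described above; perm_matches is a hypothetical helper written for this example, not the module's own code:

```perl
use strict;
use warnings;

# Hypothetical helper: $mode is a file's mode (as from stat), $spec is
# what would be handed to ->perm.
sub perm_matches {
    my ($mode, $spec) = @_;
    my ($sign, $value) = $spec =~ /^([-+]?)(\d+)$/
        or die "bad permission spec: $spec";
    $value = oct $value if $value =~ /^0/;    # de-octalize strings like "0111"
    $mode &= 07777;                           # ignore the file-type bits
    return ($mode & $value) == $value if $sign eq '-';   # all of these bits
    return ($mode & $value) != 0      if $sign eq '+';   # any of these bits
    return $mode == $value;                              # exact match
}

for my $case ([0644, 0644], [0755, '-1'], [0644, '-1'], [0700, '+0111']) {
    my ($mode, $spec) = @$case;
    printf "mode %04o against %s: %s\n", $mode, $spec,
        perm_matches($mode, $spec) ? 'match' : 'no match';
}
```

A numeric literal such as 0644 is already the number 420 by the time it arrives, so it carries no leading zero and is left alone; only string arguments need the oct treatment.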
- type(TYPE)
-
Like find's -type. All native Perl types are supported. Note that s is a socket, mapping to Perl's -S, to be consistent with find. Returns true or false, as appropriate.
- print
-
Prints the fullname to STDOUT, followed by a newline. Returns true.
- print0
-
Prints the fullname to STDOUT, followed by a NUL. Returns true.
- fstype
-
Not implemented yet.
- user(USERNAME|UID)
-
True if the owner is USERNAME or UID.
- group(GROUPNAME|GID)
-
True if the group is GROUPNAME or GID.
- nouser
-
True if the entry doesn't belong to any known user.
- nogroup
-
True if the entry doesn't belong to any known group.
- links( +/- N )
-
Like find's -links N. Leading plus means "more than", minus means "less than".
- inum( +/- N )
-
True if the inode number meets the qualification.
- size( +/- N [c/k])
-
True if the file size meets the qualification. By default, N is in half-K blocks. Append a trailing "k" to the number to indicate 1K blocks, or "c" to indicate characters (bytes).
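The unit arithmetic can be sketched as follows. Both helpers are hypothetical, and I'm assuming sizes are rounded up to whole units before comparing, the way find(1) does; check the source if the exact rounding matters to you:

```perl
use strict;
use warnings;

# Hypothetical helper. Assumption: round up to whole units.
sub size_in_units {
    my ($bytes, $unit) = @_;
    my $per_unit = $unit eq 'c' ? 1 : $unit eq 'k' ? 1024 : 512;
    return int(($bytes + $per_unit - 1) / $per_unit);
}

# Hypothetical helper: apply a "+N", "-N", or "N" spec with optional c/k.
sub size_matches {
    my ($bytes, $spec) = @_;
    my ($sign, $n, $unit) = $spec =~ /^([-+]?)(\d+)([ck]?)$/
        or die "bad size spec: $spec";
    my $have = size_in_units($bytes, $unit);
    return $have > $n if $sign eq '+';    # more than N units
    return $have < $n if $sign eq '-';    # less than N units
    return $have == $n;                   # exactly N units
}

for my $spec ('+50', '+50k', '-40000c') {
    printf "30000 bytes against %s: %s\n", $spec,
        size_matches(30_000, $spec) ? 'match' : 'no match';
}
```

The same 30000-byte file is "bigger than 50" in half-K blocks but not in 1K blocks, so it pays to be explicit about the suffix.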
- atime( +/- N )
-
True if access time (in days) meets the qualification.
- mtime( +/- N )
-
True if modification time (in days) meets the qualification.
- ctime( +/- N )
-
True if inode change time (in days) meets the qualification.
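Perl's -A, -M, and -C file tests report these three ages in fractional days, so a day-based comparison can be sketched in a few lines. I'm assuming whole-day truncation here, which is a guess at the convention rather than something the text above spells out:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($fh, $file) = tempfile(UNLINK => 1);
close $fh;

# Backdate the access and modification times by 40 days.
my $forty_days_ago = time - 40 * 24 * 60 * 60;
utime $forty_days_ago, $forty_days_ago, $file or die "utime: $!";

# -M is the modification age in days, measured from when the script
# started; int() truncates it to whole days (an assumption, see above).
my $age_in_days   = int(-M $file);
my $older_than_30 = $age_in_days > 30;    # the "+30" case
my $newer_than_7  = $age_in_days < 7;     # the "-7" case

printf "older than 30 days: %s\n", $older_than_30 ? 'yes' : 'no';
printf "newer than 7 days:  %s\n", $newer_than_7  ? 'yes' : 'no';
```

Swap -M for -A or -C to get the access-time or inode-change-time flavor.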
- exec(@COMMAND)
-
Forks a child process via system(). Any appearance of {} in any argument is replaced by the current filename. Returns true if the child exit status is 0. The list is passed directly to system, so if it's a single arg, it can contain /bin/sh syntax. Otherwise, it's a pre-parsed command that must be found on the PATH.
Note that I couldn't figure out how to horse around with the current directory very well, so I'm using $_ here instead of the more traditional $File::Find::name. It still works, because we're still chdir'ed down into the directory, but it looks weird on a trace. Trigger no_chdir in find if you want a traditional find full path.
my $f = File::Finder->exec('ls', '-ldg', '{}');
find({ no_chdir => 1, wanted => $f }, @starting_dirs);
Yeah, it'd be trivial for me to add a no_chdir method. Soon.
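The substitution and the exit-status check are simple enough to sketch. run_with_placeholder is a hypothetical helper written for this example, not the module's own code, and it uses the running perl binary ($^X) as the command only so that nothing else needs to be installed:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: replace every {} in every argument, run the
# command with system(), and report success when the exit status is 0.
sub run_with_placeholder {
    my ($filename, @command) = @_;
    my @argv = map { (my $arg = $_) =~ s/\{\}/$filename/g; $arg } @command;
    return system(@argv) == 0;
}

my ($fh, $existing) = tempfile(UNLINK => 1);
close $fh;

# A multi-element list goes straight to the program, with no shell involved.
my @check_exists = ($^X, '-e', 'exit(-e $ARGV[0] ? 0 : 1)', '{}');

my $ok_for_existing = run_with_placeholder($existing, @check_exists);
my $ok_for_missing  = run_with_placeholder("$existing.missing", @check_exists);

printf "existing file: %s\n", $ok_for_existing ? 'true' : 'false';
printf "missing file:  %s\n", $ok_for_missing  ? 'true' : 'false';
```

Because the result is an ordinary true or false, a failing command stops the rest of an "and"-ed chain just like any other predicate.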
- ok(@COMMAND)
-
Like exec, but displays the command line first, and waits for a response. If the response begins with y or Y, runs the command. If the command fails, or the response wasn't yes, returns false, otherwise true.
- prune
-
Sets $File::Find::prune, and returns true.
- xdev
-
Not yet implemented.
- newer
-
Not yet implemented.
- eval(CODEREF)
-
Ah yes, the master escape, with extra benefits. Give it a coderef, and it evaluates that code at the proper time. The return value is noted for true/false and used accordingly.
my $blaster = File::Finder->atime("+30")->eval(sub { unlink });
But wait, there's more. If the parameter is an object that responds to as_wanted, that method is automatically called, hoping for a coderef return. This neat feature allows subroutines to be created and nested:
my $old = File::Finder->atime("+30");
my $big = File::Finder->size("+100");
my $old_or_big = File::Finder->eval($old)->or->eval($big);
my $killer = File::Finder->eval(sub { unlink });
my $kill_old_or_big = File::Finder->eval($old_or_big)->ls->eval($killer);
$kill_old_or_big->in('/tmp');
Almost too cool for words.
- depth
-
Like find's -depth. Sets a flag for as_options, and returns true.
- ls
-
Like find's -ls. Performs a ls -dils on the entry to STDOUT (without forking), and returns true.
- tar
-
Not yet implemented.
- [n]cpio
-
Not yet implemented.
- ffr($ffr_object)
-
Incorporate a File::Find::Rule object as a step. Note that this must be a rule object, and not a result, so don't call or pass in. For example, using File::Find::Rule::ImageSize to define a predicate for image files that are bigger than a megapixel in my friends folder, I get:
require File::Finder;
require File::Find::Rule;
require File::Find::Rule::ImageSize;
my $ffr = File::Find::Rule->file->image_x('>1000')->image_y('>1000');
my @big_friends = File::Finder->ffr($ffr)
  ->in("/Users/merlyn/Pictures/Sorted/Friends");
EXTENDING
The steps methods are actually in the File::Finder::Steps class. You can add more subroutines to that package directly, or subclass that class. If you subclass that class, you should subclass File::Finder and override the _steps_class method to return your new subclass name.
The exact protocol of a step generator is in the source code, and won't be repeated here. If you're smart enough to want to extend this, you're smart enough to find the source. {grin}
SPEED
All the steps can have a compile-time and run-time component. As much work is done during compile-time as possible. Runtime consists of a simple linear pass executing a series of closures representing the individual steps (not method calls). It is hoped that this will produce a speed that is within a factor of 2 or 3 of a handcrafted monolithic wanted routine.
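The general shape of "compile once, then run a flat list of closures" can be sketched like this. The step names, the %compile table, and the hash layout of an entry are all invented for the illustration; the real protocol is in the source:

```perl
use strict;
use warnings;

# "Compile time": each step description becomes a closure, once.
# Arguments such as the size limit are captured up front.
my %compile = (
    type_f  => sub { return sub { $_[0]{type} eq 'f' } },
    size_gt => sub { my $n = shift; return sub { $_[0]{blocks} > $n } },
);

sub compile_steps {
    my @closures = map { my ($name, @args) = @$_; $compile{$name}->(@args) } @_;
    # "Run time": one linear pass, stopping at the first false step.
    return sub {
        my $entry = shift;
        for my $step (@closures) {
            return 0 unless $step->($entry);
        }
        return 1;
    };
}

my $wanted = compile_steps([ 'type_f' ], [ size_gt => 50 ]);

for my $entry ({ type => 'f', blocks => 80 }, { type => 'd', blocks => 80 }) {
    printf "%s with %d blocks: %s\n", $entry->{type}, $entry->{blocks},
        $wanted->($entry) ? 'wanted' : 'skipped';
}
```

No method lookup or argument parsing happens per entry; each one costs only a handful of plain subroutine calls.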
SEE ALSO
File::Find, find2perl, File::Find::Rule
BUGS
None known yet.
AUTHOR
Randal L. Schwartz, <merlyn@stonehenge.com>, with a tip of the hat to Richard Clamp for File::Find::Rule.
COPYRIGHT AND LICENSE
Copyright (C) 2003 by Randal L. Schwartz, Stonehenge Consulting Services, Inc.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.2 or, at your option, any later version of Perl 5 you may have available.