NAME
Directory::Diff - recursively find differences between similar directories
SYNOPSIS
use Directory::Diff 'directory_diff';
use FindBin '$Bin';
# Do a "diff" between "old_dir" and "new_dir"
directory_diff ("$Bin/old_dir", "$Bin/new_dir",
{diff => \& diff,
dir1_only => \& old_only});
# User-supplied callback for differing files
sub diff
{
my ($data, $dir1, $dir2, $file) = @_;
print "$dir1/$file is different from $dir2/$file.\n";
}
# User-supplied callback for files only in one of the directories
sub old_only
{
my ($data, $dir1, $file) = @_;
print "$dir1/$file is only in the old directory.\n";
}
produces output
/usr/home/ben/projects/directory-diff/examples/old_dir/old-file is only in the old directory.
/usr/home/ben/projects/directory-diff/examples/old_dir/diff-file is different from /usr/home/ben/projects/directory-diff/examples/new_dir/diff-file.
(This example is included as synopsis.pl in the distribution.)
VERSION
This documents Directory-Diff version 0.08 corresponding to git commit 80910e5ff3eaeb27ebbe432883b1798ab6dc6ebe released on Sat Nov 17 06:38:54 2018 +0900.
DESCRIPTION
Directory::Diff finds differences between two directories and all their subdirectories, recursively. If it finds a file with the same name in both directories, it uses File::Compare to find out whether they are different. It is callback-based and takes actions only if required.
FUNCTIONS
The main function of this module is "directory_diff". The other functions listed here are helper functions, but these can be exported on request.
directory_diff
directory_diff ("dir1", "dir2",
{dir1_only => \&dir1_only,
diff => \& diff});
Given two directories dir1 and dir2, this calls back a user-supplied routine for each of three cases:
- A file is only in the first directory
-
In this case a callback specified by
dir1_only
is called once&{$third_arg->{dir1_only}} ($third_arg->{data}, "dir1", $file);
for each file
$file
which is in dir1 but not in dir2, including files in subdirectories. - A file is only in the second directory
-
In this case a callback specified by
dir2_only
is called once&{$third_arg->{dir2_only}} ($third_arg->{data}, "dir2", $file);
for each file
$file
which is in dir2 but not in dir1, including files in subdirectories. - A file with the same name but different contents is in both directories
-
In this case a callback specified by
diff
is called once&{$third_arg->{diff}} ($third_arg->{data}, "dir1", "dir2", $file);
for each file name
$file
which is in both dir1 and in dir2, including files in subdirectories.
The first argument to each of the callback functions is specified by data
. The second argument to dir1_only
and dir2_only
is the directory's name. The third argument is the file name, which includes the subdirectory part. The second and third arguments to diff
are the two directories, and the fourth argument is the file name including the subdirectory part.
If the user does not supply a callback, no action is taken, even if a file is found.
The routine does not return a meaningful value. It does not check the return values of the callbacks. Therefore if it is necessary to stop midway, the user must use something like eval { }
and die
.
A fourth argument, if set to any true value, causes directory_diff to print messages about what it finds and what it does.
ls_dir
my %ls = ls_dir ("dir");
ls_dir
makes a hash containing a true value for each file and directory which is found under the directory given as the first argument.
If a second argument with a true value is set, it prints debugging messages. For example
my %ls = ls_dir ("dir", 1);
get_only
my %only = get_only (\%dir1, \%dir2);
Given two hashes containing true values for each file or directory under two directories, return a hash containing true values for the files and directories which are in the first directory hash but not in the second directory hash.
For example, if
%dir1 = ("file" => 1, "dir/" => 1, "dir/file" => 1);
and
%dir2 = ("dir/" => 1, "dir2/" => 1);
get_only
returns
%only = ("file" => 1, "dir/file" => 1);
A third parameter for debugging makes the module print messages on what is found if set to a true value, for example,
my %only = get_only (\%dir1, \%dir2, 1);
get_diff
my %diff = get_diff ("dir1", \%dir1_ls, "dir2", \%dir2_ls);
Get a list of files which are in both dir1
and dir2
, but which are different. This uses File::Compare to test the files for differences. It searches subdirectories. Usually the hashes %dir1_ls
and %dir2_ls
are those output by "ls_dir".
SEE ALSO
CPAN modules
- File::DirCompare
-
This is similar to Directory::Diff. Unlike Directory::Diff, it does not descend into subdirectories which exist in one directory but not the other.
- File::Dircmp
-
This mimics the output of the Unix
diff
command. Unlike Directory::Diff, it does not descend into subdirectories which exist in one directory but not the other. - Test::Dirs
- Compare::Directory
DEPENDENCIES
This section lists Perl modules which this depends on, with a rationale for why they are used.
- Carp
-
croak
andcarp
are used to report errors. - File::Compare
-
File::Compare is used to check whether two identically-named files are different or not.
- "getcwd" in Cwd
-
This is used to get the working directory of the module, since it changes directory to the directory where the diff is performed.
- File::Copy
- File::Path
CONTRIBUTORS
This section lists people who have contributed to the module.
Mohammad S. Anwar (MANWAR) contributed fixes for broken tests.
MOTIVATION
This section discusses why I wrote the module and what I use it for.
The reason I wrote this module is because `diff --recursive`
stops when it finds a subdirectory which is in one directory and not the other, without descending into the subdirectory. For example, if one has a file like dir1/subdir/file
,
diff -r dir1 dir2
will tell you "Only in dir1: subdir" but it won't tell you anything about the files under "subdir". The two Perl modules on CPAN, "File::Dircmp" and "File::DirCompare" both also stop processing when subdirectories differ.
For my task, I needed to go down into the subdirectory and find all the files which were in all the subdirectories, so I wrote this.
I've been using this module for updating web sites with a lot of pages since 2009, to avoid repeatedly having to upload the entire site's-worth of pages for each small change. The way I use this is as follows. I keep a local copy of the uploaded web site in a directory like old-site, and then rebuild all the pages in another directory like new-site, then I use Directory::Diff::Copy to put the changed files into yet another directory, like changed-site-files. Once the changed files are copied, then I tar, gzip, and upload the directory of changed files, and untar it at the web host, thus replacing only files which have changed. I also delete the old-site directory and rename new-site to old-site at this point in preparation for the next upload.
I'm currently using this for almost all the static content for the following web sites: http://www.sljfaq.org, http://kanji.sljfaq.org, and http://www.lemoda.net. I put this module on github in about 2012 and on CPAN in 2016.
AUTHOR
Ben Bullock, <bkb@cpan.org>
COPYRIGHT & LICENCE
This package and associated files are copyright (C) 2009-2018 Ben Bullock.
You can use, copy, modify and redistribute this package and associated files under the Perl Artistic Licence or the GNU General Public Licence.