=head1 NAME
B<Test::Files> - A L<Test::Builder|https://metacpan.org/pod/Test::Builder>
based module to ease testing with files and dirs.
In general, the following can be tested:
=over 2
=item
If the contents of the file being tested match the expected pattern.
=item
If the file being tested is identical to the expected file in regard to contents, or size, or existence.
If necessary, some parts of the contents can be excluded from the comparison.
=item
If the directory being tested contains all expected files.
=item
If the files in the directory being tested are identical to the files in the reference directory
in regard to contents, or size, or existence.
If necessary, some files as well as some parts of contents can be excluded from the comparison.
=item
If all files in the directory being tested fulfill certain requirements.
=item
If the archive (container) being tested is logically identical to the reference archive (container).
If necessary, some members of archives, as well as some parts of their contents, as well as some metadata
can be excluded from the comparison.
=back
=head1 SYNOPSIS
All examples listed below can be found and executed using B<xt/synopsis.t>
use Test2::V0;
use Archive::Zip qw( :ERROR_CODES );
use Path::Tiny qw( path );
use Test::Files;
my $got_file = path( 'path' )->child( qw( got file ) );
my $reference_file = path( 'path' )->child( qw( reference file ) );
my $got_dir = path( 'path' )->child( qw( got dir ) );
my $reference_dir = path( 'path' )->child( qw( reference dir with some stuff ) );
my @file_list = qw( expected file );
my ( $content_check, $expected, $filter, $options );
plan( 24 );
# Simply compares file contents to a string:
$expected = "contents\nof file";
file_ok( $got_file, $expected, 'got file has expected contents' );
# Two identical variants comparing file contents
# to a string ignoring differences in time stamps:
$expected = "filtered contents\nof file\ncreated at 00:00:00";
$filter = sub {
shift =~ s{ \b (?: [01] \d | 2 [0-3] ) : (?: [0-5] \d ) : (?: [0-5] \d ) \b }
{00:00:00}grx
};
$options = { FILTER => $filter };
file_ok (
$got_file, $expected, $options,
"'$got_file' has contents expected after filtering"
);
file_filter_ok(
$got_file, $expected, $filter,
"'$got_file' has contents expected after filtering"
);
# Simply compares two file contents:
compare_ok( $got_file, $reference_file, 'files are the same' );
# Two identical variants comparing contents of two files
# ignoring differences in time stamps:
$filter = sub {
shift =~ s{ \b (?: [01] \d | 2 [0-3] ) : (?: [0-5] \d ) : (?: [0-5] \d ) \b }
{00:00:00}grx
};
$options = { FILTER => $filter };
compare_ok (
$got_file, $reference_file, $options, 'files are almost the same'
);
compare_filter_ok(
$got_file, $reference_file, $filter, 'files are almost the same'
);
# Verifies if both got file and reference file exist:
$options = { EXISTENCE_ONLY => 1 };
compare_ok( $got_file, $reference_file, $options, 'both files exist' );
# Verifies if got file and reference file have identical size:
$options = { SIZE_ONLY => 1 };
compare_ok(
$got_file, $reference_file, $options, 'both files have identical size'
);
# Verifies if the directory has all expected files (not recursively!):
$expected = [ qw( files got_dir must contain ) ];
dir_contains_ok( $got_dir, $expected, 'directory has all files in list' );
# Two identical variants doing the same verification as before,
# but additionally verifying if the directory has nothing
# but the expected files (not recursively!):
$options = { SYMMETRIC => 1 };
dir_contains_ok (
$got_dir, $expected, $options, 'directory has exactly the files in the list'
);
dir_only_contains_ok(
$got_dir, $expected, 'directory has exactly the files in the list'
);
# The same as before, but recursive:
$options = { RECURSIVE => 1, SYMMETRIC => 1 };
dir_contains_ok(
$got_dir, $expected, $options,
'directory and its subdirectories have exactly the files in the list'
);
# The same as before, but ignoring files
# whose names do not match the required pattern (file "must" will be skipped):
$options = { NAME_PATTERN => '^[cfg]', RECURSIVE => 1, SYMMETRIC => 1 };
dir_contains_ok(
$got_dir, $expected, $options,
'directory and its subdirectories ' .
"have exactly the files in the list except for file 'must'"
);
# Compares two directories by comparing file contents (not recursively!):
compare_dirs_ok(
$got_dir, $reference_dir,
"all files from '$got_dir' are the same in '$reference_dir' " .
'(same names, same contents), subdirs are skipped'
);
# The same as before, but subdirectories are considered, too:
$options = { RECURSIVE => 1 };
compare_dirs_ok(
$got_dir, $reference_dir, $options,
"all files from '$got_dir' and its subdirs are the same in '$reference_dir'"
);
# The same as before, but only file sizes are compared:
$options = { RECURSIVE => 1, SIZE_ONLY => 1 };
compare_dirs_ok(
$got_dir, $reference_dir, $options,
"all files from '$got_dir' and its subdirs have same sizes in '$reference_dir'"
);
# The same as before, but only file existence is verified:
$options = { EXISTENCE_ONLY => 1, RECURSIVE => 1 };
compare_dirs_ok(
$got_dir, $reference_dir, $options,
"all files from '$got_dir' and its subdirs exist in '$reference_dir'"
);
# The same as before, but only files with base names starting with 'A' are considered:
$options = { EXISTENCE_ONLY => 1, NAME_PATTERN => '^A', RECURSIVE => 1 };
compare_dirs_ok(
$got_dir, $reference_dir, $options,
"all files from '$got_dir' and its subdirs " .
"with base names starting with 'A' exist in '$reference_dir'"
);
# The same as before, but the symmetric verification is requested:
$options = {
EXISTENCE_ONLY => 1,
NAME_PATTERN => '^A',
RECURSIVE => 1,
SYMMETRIC => 1,
};
compare_dirs_ok(
$got_dir, $reference_dir, $options,
"all files from '$got_dir' and its subdirs with base names " .
"starting with 'A' exist in '$reference_dir' and vice versa"
);
# Two identical variants of comparison of two directories by file contents,
# whereas these contents are first filtered
# so that time stamps in form of 'HH:MM:SS' are replaced by '00:00:00'
# like in examples for file_filter_ok and compare_filter_ok:
$filter = sub {
shift =~ s{ \b (?: [01] \d | 2 [0-3] ) : (?: [0-5] \d ) : (?: [0-5] \d ) \b }
{00:00:00}grx
};
$options = { FILTER => $filter };
compare_dirs_ok(
$got_dir, $reference_dir, $options,
"all files from '$got_dir' are the same in '$reference_dir', " .
'subdirs are skipped, differences of time stamps ignored'
);
compare_dirs_filter_ok(
$got_dir, $reference_dir, $filter,
"all files from '$got_dir' are the same in '$reference_dir', " .
'subdirs are skipped, differences of time stamps ignored'
);
# Verifies if all plain files in directory and its subdirectories
# contain the word 'good' (take into consideration the -f test below
# excluding special files from comparison!):
$content_check = sub {
my ( $file ) = @_;
! -f $file or path( $file )->slurp =~ / \b good \b /x;
};
$options = { RECURSIVE => 1 };
find_ok(
$got_dir, $content_check, $options,
"all files from '$got_dir' and subdirectories contain the word 'good'"
);
# Compares PKZIP archives considering both global and file comments.
# Both archives contain the same members in different order:
my $extract = sub {
my ( $file ) = @_;
my $zip = Archive::Zip->new();
die( "Cannot read '$file'" ) if $zip->read( $file ) != AZ_OK;
die( "Cannot extract from '$file'" ) if $zip->extractTree != AZ_OK;
};
my $meta_data = sub {
my ( $file ) = @_;
my $zip = Archive::Zip->new();
die( "Cannot read '$file'" ) if $zip->read( $file ) != AZ_OK;
my %meta_data = ( '' => $zip->zipfileComment );
$meta_data{ $_->fileName } = $_->fileComment foreach $zip->members;
return \%meta_data;
};
my $got_compressed_content = path( "$got_file.zip" )->slurp;
my $reference_compressed_content = path( "$reference_file.zip" )->slurp;
ok(
$got_compressed_content ne $reference_compressed_content,
"'$got_file.zip' and '$reference_file.zip' are physically different, but"
);
compare_archives_ok(
"$got_file.zip", "$reference_file.zip", { EXTRACT => $extract, META_DATA => $meta_data },
"'$got_file.zip' and '$reference_file.zip' are logically identical"
);
=head1 DESCRIPTION
This module is like L<Test2::V0|https://metacpan.org/pod/Test2::V0>;
in fact, you should load that first, as shown above.
It supports comparison of files and directories in different ways.
Any file or directory passed to functions of this module can be either a string or an object of
L<Path::Tiny|https://metacpan.org/pod/Path::Tiny>.
Though the test name, i.e. the last parameter of every function, is optional,
you should name each test for better maintainability.
You should follow the lead of the L</SYNOPSIS> examples and use L<Path::Tiny|https://metacpan.org/pod/Path::Tiny> or,
if you prefer, L<File::Spec|https://metacpan.org/pod/File::Spec>.
This makes it much more likely that your tests will pass on a different operating system.
All of the contents comparison routines provide diff diagnostic output when they report failure.
The diff output style can be changed using the option B<STYLE> (see below).
The filter function receives each line of each file.
It may perform any necessary transformations (like excising dates),
then it must return the line in (possibly) transformed state.
For example, the first filter of L<Phil Crow|https://metacpan.org/author/PHILCROW>, the creator of this module, was:

    sub chop_dates {
        my $line = shift;
        $line =~ s/\d{4}(.\d\d){5}//g;
        return $line;
    }

This removes all strings like B<2003.10.14.14.17.37>.
Everything else is left unchanged, so tests that had failed only because of dates started passing.
If you want to exclude a line from consideration, return an empty string or B<undef>.
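As a further illustration of these rules, here is a minimal sketch of a filter that transforms some lines and drops others. The name B<normalize_line> and both of its rules are assumptions made for this example only; they are not part of the module.

```perl
use strict;
use warnings;

# Hypothetical filter: the name and the rules are illustrative assumptions.
sub normalize_line {
    my ( $line ) = @_;

    # Exclude comment lines from the comparison altogether.
    return undef if $line =~ /^\s*#/;

    # Replace any HH:MM:SS time stamp with a fixed value.
    $line =~ s/\b\d\d:\d\d:\d\d\b/00:00:00/g;

    # Return the (possibly) transformed line, line break included.
    return $line;
}
```

A reference to such a function could be supplied wherever a filter is expected, for example as the value of the B<FILTER> option described below.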
=head2 FUNCTIONS
=head3 file_ok
There are two forms of calls:
=over 2
=item The generic form.
B<file_ok( $got_file, $expected_string, \%options, $test_name )>
=item The short form, which is also backward compatible.
B<file_ok( $got_file, $expected_string, $test_name )>
=back
Compares the contents of a file B<$got_file> to a string B<$expected_string>.
In the generic form, if the parameter B<\%options> is passed and contains the key B<FILTER>,
B<file_ok> provides the same functionality as B<file_filter_ok>.
Supported options:
=over 2
=item B<FILTER>
Code reference providing filtering of file contents before comparison.
The only expected parameter is the current line from the file contents, the return value replaces this line.
In addition, the special variable B<$.> representing the number of the current line in the file can be used.
If the return value is undefined, an empty string is used instead.
Line breaks are neither removed nor added after the execution.
Defaults to B<undef> i.e. no filtering is provided.
=item All options supported by L<Text::Diff|https://metacpan.org/pod/Text::Diff>
except for B<FILENAME_A> and B<FILENAME_B>.
The most useful of them seems to be B<STYLE> defining the style of output for content differences.
Defaults to B<Unified>.
=back
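The use of B<$.> inside a filter can be sketched as follows. The filter B<$skip_header> is hypothetical, and the reading loop is only a stand-in for whatever the module does internally; it reads from an in-memory file so that Perl itself maintains B<$.>.

```perl
use strict;
use warnings;

# Hypothetical filter: blanks out the first line (say, a generated header)
# and leaves all other lines untouched. It relies on $. being set by the caller.
my $skip_header = sub {
    my ( $line ) = @_;
    return $. == 1 ? '' : $line;
};

# Stand-in for the reading loop, for illustration only.
my $content = "generated 2024-01-01\nalpha\nbeta\n";
open( my $handle, '<', \$content ) or die "Cannot open in-memory file: $!";
my $filtered = '';
while ( my $line = <$handle> ) {
    $filtered .= $skip_header->( $line );
}
close( $handle );
```

With this filter in the B<FILTER> option, the expected string would have to omit the header line as well.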
=head3 file_filter_ok
There is only one form of call, namely B<file_filter_ok( $got_file, $expected_string, \&filter_func, $test_name )>.
Works like B<file_ok> with the option B<FILTER>, i.e. compares the contents of a file to a string,
but filters the file first using B<&filter_func>. The expected string itself is not filtered; filter it beforehand if necessary.
This function is deprecated and stays for backward compatibility reasons only.
=head3 compare_ok
There are two forms of calls:
=over 2
=item The generic form.
B<compare_ok( $got_file, $reference_file, \%options, $test_name )>
=item The short form, which is also backward compatible.
B<compare_ok( $got_file, $reference_file, $test_name )>
=back
Compares two files.
In the generic form, if the parameter B<\%options> is passed and contains the key B<FILTER>,
B<compare_ok> provides the same functionality as B<compare_filter_ok>.
Supported options:
=over 2
=item B<EXISTENCE_ONLY>
Boolean. If set to B<true>, only the existence of both B<$got_file> and B<$reference_file> is checked.
Defaults to B<false>.
=item B<FILTER>
Code reference providing filtering of file contents before comparison and
being applied to both B<$got_file> and B<$reference_file>.
The only expected parameter is the current line from the file contents, the return value replaces this line.
In addition, the special variable B<$.> representing the number of the current line in the file can be used.
If the return value is undefined, an empty string is used instead.
Line breaks are neither removed nor added after the execution.
Ignored if either B<EXISTENCE_ONLY> or B<SIZE_ONLY> is set to B<true>.
Defaults to B<undef> i.e. no filtering is provided.
=item B<SIZE_ONLY>
Boolean. If set to B<true> and the option B<EXISTENCE_ONLY> is not set to B<true>,
B<$got_file> and B<$reference_file> are compared by size only.
Defaults to B<false>.
=item All options supported by L<Text::Diff|https://metacpan.org/pod/Text::Diff>
except for B<FILENAME_A> and B<FILENAME_B>.
The most useful of them seems to be B<STYLE> defining the style of output for content differences.
Defaults to B<Unified>.
=back
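The precedence among B<EXISTENCE_ONLY>, B<SIZE_ONLY> and B<FILTER> described above can be summarized in a small sketch. The helper B<comparison_mode> is hypothetical and not part of B<Test::Files>; it only names the kind of comparison that the documented rules imply for a given set of options.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Test::Files: names the kind of comparison
# implied by the documented option precedence.
sub comparison_mode {
    my ( $options ) = @_;

    # EXISTENCE_ONLY wins over everything else.
    return 'existence' if $options->{ EXISTENCE_ONLY };

    # SIZE_ONLY applies only if EXISTENCE_ONLY is not set.
    return 'size' if $options->{ SIZE_ONLY };

    # FILTER is ignored if either of the two options above is set.
    return 'filtered contents' if $options->{ FILTER };

    return 'contents';
}
```

For instance, passing both B<SIZE_ONLY> and B<FILTER> would, by these rules, result in a pure size comparison.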
=head3 compare_filter_ok
There is only one form of call, namely B<compare_filter_ok( $got_file, $reference_file, \&filter_func, $test_name )>.
Works like B<compare_ok> with option B<FILTER> i.e. compares the contents of two files,
but sends each line through the filter B<&filter_func> so things that shouldn't count against success can be stripped.
This function is deprecated and stays for backward compatibility reasons only.
=head3 dir_contains_ok
There are two forms of calls:
=over 2
=item The generic form.
B<dir_contains_ok( $got_dir, \@file_list, \%options, $test_name )>
=item The short form, which is also backward compatible.
B<dir_contains_ok( $got_dir, \@file_list, $test_name )>
=back
Verifies that the directory B<$got_dir> contains all files listed in B<@file_list>.
If B<$got_dir> is a symlink, this will be accepted, but symlinks therein are not followed.
Subdirectories are not involved in the verification, but files located therein are considered
if a recursive approach is requested (see the option B<RECURSIVE> below).
Special files like named pipes are involved in the verification only if mere file existence is checked
(see the option B<EXISTENCE_ONLY> below), otherwise they are skipped and reported as an error.
In the generic form, if the parameter B<\%options> is passed and
contains the key B<SYMMETRIC> set to B<true>, B<dir_contains_ok> provides the same functionality
as B<dir_only_contains_ok>.
Supported options:
=over 2
=item B<NAME_PATTERN>
String containing RegEx. Files with base names not matching this RegEx will be skipped.
Defaults to the dot sign (B<.>) i.e. no file will be skipped.
=item B<RECURSIVE>
Boolean. If set to B<true>, subdirectories of B<$got_dir> will be checked, too.
Defaults to B<false>.
=item B<SYMMETRIC>
Boolean. If set to B<true>, additionally verifies if all files from B<$got_dir> are listed in B<@file_list>.
Defaults to B<false>.
=back
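How B<NAME_PATTERN> and B<SYMMETRIC> interact can be sketched with plain lists. The helper B<listing_diff> is hypothetical and works on file names only, without touching the file system. It assumes that the pattern is applied to the base names in both lists, which matches the L</SYNOPSIS> example where the expected file C<must> is skipped.

```perl
use strict;
use warnings;
use File::Basename qw( basename );

# Hypothetical helper, not part of Test::Files: reports which expected files
# are missing and, in symmetric mode, which found files are unexpected.
sub listing_diff {
    my ( $found, $expected, $options ) = @_;
    my $pattern = $options->{ NAME_PATTERN } // '.';

    # Skip every file whose base name does not match the pattern.
    my %found    = map { $_ => 1 } grep { basename( $_ ) =~ /$pattern/ } @$found;
    my %expected = map { $_ => 1 } grep { basename( $_ ) =~ /$pattern/ } @$expected;

    # Expected files that were not found.
    my @missing = sort grep { !$found{ $_ } } keys %expected;

    # Found files that were not expected (checked in symmetric mode only).
    my @unexpected;
    if ( $options->{ SYMMETRIC } ) {
        @unexpected = sort grep { !$expected{ $_ } } keys %found;
    }

    return ( \@missing, \@unexpected );
}
```

Under these assumptions a test would pass if both returned lists are empty.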
=head3 dir_only_contains_ok
There is only one form of call, namely B<dir_only_contains_ok( $got_dir, \@file_list, $test_name )>.
Works like B<dir_contains_ok> with option B<SYMMETRIC> set to B<true> i.e.
checks directory without following symlinks therein to ensure
that the listed files are present and that they are the only ones present.
This function is deprecated and stays for backward compatibility reasons only.
=head3 compare_dirs_ok
There are two forms of calls:
=over 2
=item The generic form.
B<compare_dirs_ok( $got_dir, $reference_dir, \%options, $test_name )>
=item The short form, which is also backward compatible.
B<compare_dirs_ok( $got_dir, $reference_dir, $test_name )>
=back
Compares all files in the directories B<$got_dir> and B<$reference_dir> reporting differences.
If B<$got_dir> or B<$reference_dir> is a symlink, this will be accepted, but symlinks therein are not followed.
In the generic form, if the parameter B<\%options> is passed and contains the key B<FILTER>,
B<compare_dirs_ok> provides the same functionality as B<compare_dirs_filter_ok>.
Supported options:
=over 2
=item B<EXISTENCE_ONLY>
Boolean. If set to B<true>, only checks if every file from B<$reference_dir> is found in B<$got_dir>.
Defaults to B<false>.
=item B<FILTER>
Code reference providing filtering of file contents before comparison and
applied to files from both B<$got_dir> and B<$reference_dir>.
The only expected parameter is the current line from the file contents, the return value replaces this line.
In addition, the special variable B<$.> representing the number of the current line in the file can be used.
If the return value is undefined, an empty string is used instead.
Line breaks are neither removed nor added after the execution.
Ignored if either B<EXISTENCE_ONLY> or B<SIZE_ONLY> is set to B<true>.
Defaults to B<undef> i.e. no filtering is provided.
=item B<NAME_PATTERN>
String containing RegEx.
Files with base names not matching this RegEx will be skipped both in B<$got_dir> and B<$reference_dir>.
Defaults to the dot sign (B<.>) i.e. no file will be skipped.
=item B<RECURSIVE>
Boolean. If set to B<true>, subdirectories of both B<$got_dir> and B<$reference_dir> will be checked, too.
Defaults to B<false>.
=item B<SIZE_ONLY>
Boolean. If set to B<true> and the option B<EXISTENCE_ONLY> is not set to B<true>,
files from B<$got_dir> and B<$reference_dir> are compared by size only.
Defaults to B<false>.
=item B<SYMMETRIC>
Boolean. If set to B<true>, additionally verifies if all files from B<$got_dir> exist in B<$reference_dir>, too.
Defaults to B<false>.
=item All options supported by L<Text::Diff|https://metacpan.org/pod/Text::Diff>
except for B<FILENAME_A> and B<FILENAME_B>.
The most useful of them seems to be B<STYLE> defining the style of output for content differences.
Defaults to B<Unified>.
=back
=head3 compare_dirs_filter_ok
There is only one form of call, namely B<compare_dirs_filter_ok( $got_dir, $reference_dir, \&filter_func, $test_name )>.
Works like B<compare_dirs_ok> with option B<FILTER> i.e. calls the filter function B<&filter_func> on each line
of every file allowing you to exclude or alter some text to avoid spurious failures (like timestamp disagreements).
This function is deprecated and stays for backward compatibility reasons only.
=head3 find_ok
The signature is B<find_ok( $got_dir, \&content_check_func, \%options, $test_name )>.
Verifies if the condition B<&content_check_func> is true for all files in directory B<$got_dir>.
The code reference B<&content_check_func> must return a boolean. It is called for any type of file except directories,
i.e. also for symlinks, devices, etc., and its only parameter is the fully qualified file name.
If you want to consider plain files only, you must apply the test operator B<-f> to the parameter
as shown in L</SYNOPSIS>.
Supported options:
=over 2
=item B<RECURSIVE>
Boolean. If set to B<true>, subdirectories of B<$got_dir> will be checked, too.
Defaults to B<false>.
=back
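A content check that does not need any additional module might look like the following sketch. The check B<$ends_with_newline> is a hypothetical example: it accepts everything that is not a plain file and requires plain files to end with a line break.

```perl
use strict;
use warnings;

# Hypothetical check: every plain file must end with a line break.
# Anything for which -f is false passes unconditionally.
my $ends_with_newline = sub {
    my ( $file ) = @_;
    return 1 unless -f $file;

    # An unreadable file counts as a failure.
    open( my $handle, '<:raw', $file ) or return 0;
    local $/;                         # slurp mode: read the whole file at once
    my $content = <$handle>;
    close( $handle );

    return defined $content && $content =~ /\n\z/ ? 1 : 0;
};
```

Such a code reference could be passed as the second argument of B<find_ok>, optionally together with B<RECURSIVE>.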
=head3 compare_archives_ok
The signature is B<compare_archives_ok( $got_archive, $reference_archive, \%options, $test_name )>.
Verifies if the archives (containers) B<$got_archive> and B<$reference_archive> are logically identical.
The term "logically identical" means that these files might be physically different e.g. because their members are
stored in different order, or because some members are marked as deleted, but the metadata relevant for the current
test case and the members are identical.
Which metadata and which members must be compared can be controlled using B<\%options>.
The comparison itself begins with the extraction and comparison of metadata;
if they are not identical, no further comparison is performed and the test fails.
If the metadata comparison succeeds, members of B<$got_archive> and B<$reference_archive> are extracted into
temporary directories and compared in the same manner as B<compare_dirs_ok> does.
Supported options:
=over 2
=item All options supported by B<compare_dirs_ok>.
=item B<EXTRACT>
Code reference. Extracts members from the archive into the current directory.
The only expected parameter is the archive file name.
The current directory at the time of extraction is a temporary directory that is removed after the test.
The return value is ignored.
Defaults to the empty function B<sub {}>.
=item B<META_DATA>
Code reference. Returns metadata e.g. comments from a PKZIP archive.
The only expected parameter is the archive file name.
Defaults to the empty function B<sub {}>.
=back
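The L</SYNOPSIS> shows both callbacks for PKZIP archives. For comparison, here is a minimal sketch of the same pair for tar archives, based on L<Archive::Tar|https://metacpan.org/pod/Archive::Tar>. Using the member permission modes as metadata is an assumption made for this example; which metadata matter depends on the test case.

```perl
use strict;
use warnings;
use Archive::Tar;

# Extracts all members of the tar archive into the current directory.
my $extract = sub {
    my ( $file ) = @_;
    my $tar = Archive::Tar->new( $file ) or die( "Cannot read '$file'" );
    $tar->extract or die( "Cannot extract from '$file'" );
};

# Returns the metadata to compare; here (as an assumption) a hash
# mapping each member path to its permission mode.
my $meta_data = sub {
    my ( $file ) = @_;
    my $tar = Archive::Tar->new( $file ) or die( "Cannot read '$file'" );
    my %meta_data = map { ( $_->full_path, $_->mode ) } $tar->get_files;
    return \%meta_data;
};
```

Both code references could then be passed as B<EXTRACT> and B<META_DATA> in the same way as in the L</SYNOPSIS>.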
=head1 SEE ALSO
See L<Test2::V0|https://metacpan.org/pod/Test2::V0> and L<Test::Builder|https://metacpan.org/pod/Test::Builder> for more testing help.
This module really just adds functions to what L<Test2::V0|https://metacpan.org/pod/Test2::V0> does.
As recommended by the author of L<Test::More|https://metacpan.org/pod/Test::More>
and L<Test2::V0|https://metacpan.org/pod/Test2::V0>, the latter module should be preferred,
that's why L<Test::More|https://metacpan.org/pod/Test::More> is not listed in L</SYNOPSIS>.
=head1 BUGS
Please report any bugs or feature requests through the web interface at
=head1 CAVEATS
Although this module can cope with binary files and confirm their equality,
a proper representation of the comparison results is not guaranteed if they differ.
=head1 AUTHOR
Phil Crow, E<lt>philcrow2000@yahoo.comE<gt>
Jurij Fajnberg, E<lt>fajnbergj@gmail.comE<gt>
=head1 COPYRIGHT AND LICENSE
Copyright 2003-2007 by Phil Crow
Copyright 2020-2024 by Jurij Fajnberg
This module is free software; you can redistribute it and/or modify
it under the same terms as the Perl 5 programming language system itself.
=cut