NAME
Devel::Git::MultiBisect - Study test output over a range of git commits
SYNOPSIS
You will typically construct an object of a class which is a child of Devel::Git::MultiBisect, such as Devel::Git::MultiBisect::AllCommits or Devel::Git::MultiBisect::Transitions. All methods documented in this parent package may be called from either child class.
use Devel::Git::MultiBisect::AllCommits;
$self = Devel::Git::MultiBisect::AllCommits->new(\%parameters);
... or
use Devel::Git::MultiBisect::Transitions;
$self = Devel::Git::MultiBisect::Transitions->new(\%parameters);
... and then:
$commit_range = $self->get_commits_range();
$full_targets = $self->set_targets(\@target_args);
$outputs = $self->run_test_files_on_one_commit($commit_range->[0]);
... followed by methods specific to the child class.
... and then perhaps also:
$timings = $self->get_timings();
DESCRIPTION
Given a Perl library or application kept in git for version control, it is often useful to be able to compare the output collected from running one or several test files over a range of git commits. If that range is sufficiently large, a test may fail in more than one way over that range.
If that is the case, then simply asking, "When did this file start to fail?" is insufficient. We may want to (a) capture the test output for each commit; or, (b) capture the test output only at those commits where the output changed. (The output of a run of a test file may change for a variety of reasons: test failures, segfaults, changes in the number or content of tests, etc.)
Devel::Git::MultiBisect provides methods to achieve that objective. Its child classes, Devel::Git::MultiBisect::AllCommits and Devel::Git::MultiBisect::Transitions, provide different flavors of that functionality for objectives (a) and (b), respectively. Please refer to their documentation for further discussion.
GLOSSARY
commit
An individual commit to a git repository as denoted by a SHA. When a commit is called for as the argument to a function, you can also use a git tag.
commit range
The range of sequential commits (determined by git log) requested for analysis.
target
A test file from the test suite of the application or library under study.
test output
What is sent to STDOUT or STDERR as a result of calling a test program such as prove or t/harness on an individual target file.
transitional commit
A commit at which the test output for a given target changes from that of the commit immediately preceding.
digest
A string holding the output of a cryptographic process run on test output which uniquely identifies that output. (Currently, we use the Digest::SHA::md5_hex algorithm.) We assume that if the test output does not change from one commit to the next, then the later commit is not a transitional commit.
Note: Before taking a digest on a particular test output, we exclude text such as timings which are highly likely to change from one run to the next and which would introduce spurious variability into the digest calculations.
multisection or multibisection
A series of configure-build-test process sequences at those commits within the commit range which are selected by a bisection algorithm.
Normally, when we bisect (via git bisect, Porting/bisect.pl or otherwise), we are seeking a single point where a Boolean result -- yes/no, true/false, pass/fail -- is returned. What the test run outputs to STDOUT or STDERR is a lesser concern.
In multisection we bisect repeatedly to determine all points where the output of the test command changes -- regardless of whether that change is a PASS, FAIL or whatever. We capture the output for later human examination.
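The idea of bisecting repeatedly until every change in output has been located can be sketched as follows. This is a minimal illustration only, not the algorithm implemented in Devel::Git::MultiBisect::Transitions: the digests are hard-coded rather than computed from real test runs, and the name transitions_between() is hypothetical. It relies on the usual bisection assumption that if two commits yield the same digest, nothing changed between them.

```perl
use strict;
use warnings;

# Hypothetical digests, one per commit in the commit range, oldest first.
# In real use each value would come from a configure-build-test run.
my @digests = qw(aaa aaa aaa bbb bbb ccc ccc ccc);

# Return the indices of the transitional commits between $lo and $hi.
# Assumes that identical digests at both ends of an interval mean
# that no change occurred inside that interval.
sub transitions_between {
    my ($lo, $hi) = @_;
    return ()    if $digests[$lo] eq $digests[$hi];
    return ($hi) if $hi - $lo == 1;
    my $mid = int(($lo + $hi) / 2);
    return (
        transitions_between($lo, $mid),
        transitions_between($mid, $hi),
    );
}

my @transitions = transitions_between(0, $#digests);
print "Transitional commits at indices: @transitions\n";
```

Here the commits at indices 3 and 5 are reported, since those are the points at which the digest differs from that of the commit immediately preceding.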
METHODS
new()
Purpose
Constructor.
Arguments
$self = Devel::Git::MultiBisect::AllCommits->new(\%params);
or
$self = Devel::Git::MultiBisect::Transitions->new(\%params);
Reference to a hash, typically the return value of Devel::Git::MultiBisect::Opts::process_options().
The hashref passed as argument must contain key-value pairs for gitdir, workdir and outputdir. new() tests for the existence of each of these directories.
Return Value
Object of Devel::Git::MultiBisect child class.
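As a rough sketch of the kind of check described above, the following code verifies that each of the three required directories exists. The helper missing_dirs() is hypothetical and is not part of the module; the constructor's actual checks and error messages may differ.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: return the names of required keys whose values
# are missing or do not refer to an existing directory.
sub missing_dirs {
    my $params = shift;
    return grep { !defined $params->{$_} || !-d $params->{$_} }
        qw(gitdir workdir outputdir);
}

# Illustration: one temporary directory stands in for gitdir and
# workdir; outputdir deliberately points at a non-existent path.
my $tmp = tempdir(CLEANUP => 1);
my %params = (
    gitdir    => $tmp,
    workdir   => $tmp,
    outputdir => "$tmp/does_not_exist",
);

my @missing = missing_dirs(\%params);
print "Missing or invalid: @missing\n" if @missing;
```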
get_commits_range()
Purpose
Identify the SHAs of each git commit identified by new().
Arguments
$commit_range = $self->get_commits_range();
None; all data needed is already in the object.
Return Value
Array reference, each element of which is a SHA.
set_targets()
Purpose
Identify the test files which will be run at different points in the commits range. We shall assume that the test file has existed with its name unchanged over the entire commit range.
Arguments
$target_args = [
    't/44_func_hashes_mult_unsorted.t',
    't/45_func_hashes_alt_dual_sorted.t',
];
$full_targets = $self->set_targets($target_args);
Reference to an array holding the relative paths beneath the gitdir to the test files selected for examination.
Return Value
Reference to an array holding hash references with these elements:
path
Absolute path to the test file selected for examination. The test file is tested for its existence.
stub
String composed by taking an element in the array ref passed as argument and substituting underscores (_) for forward slash (/) and dot (.) characters. So,
t/44_func_hashes_mult_unsorted.t
... becomes:
t_44_func_hashes_mult_unsorted_t
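The substitution that produces the stub can be sketched in a single line of Perl. This is an illustration of the rule as described here, not necessarily the code the module itself uses.

```perl
use strict;
use warnings;

my $target = 't/44_func_hashes_mult_unsorted.t';

# Copy the relative path, then replace every forward slash and dot
# with an underscore.
(my $stub = $target) =~ s{[./]}{_}g;

print "$stub\n";    # t_44_func_hashes_mult_unsorted_t
```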
run_test_files_on_one_commit()
Purpose
Capture the output from running the selected test files at one specific git checkout.
Arguments
$outputs = $self->run_test_files_on_one_commit("2a2e54a");
or
$excluded_targets = [
    't/45_func_hashes_alt_dual_sorted.t',
];
$outputs = $self->run_test_files_on_one_commit("2a2e54a", $excluded_targets);
String holding the SHA from a single commit in the repository. This string would typically be one of the elements in the array reference returned by $self->get_commits_range(). If no argument is provided, the method will default to using the first element in the array reference returned by $self->get_commits_range().
Reference to array of target test files to be excluded from a particular invocation of this method. Optional, but will die if argument is not an array reference.
Return Value
Reference to an array, each element of which is a hash reference with the following elements:
commit
String holding the SHA from the commit passed as argument to this method (or the default described above).
commit_short
String holding the value of commit (above) truncated to the number of characters specified in the short element passed to the constructor; defaults to 7.
file_stub
String holding a rewritten version of the relative path beneath gitdir of the test file being run. In this relative path forward slash (/) and dot (.) characters are changed to underscores (_). So,
t/44_func_hashes_mult_unsorted.t
... becomes:
t_44_func_hashes_mult_unsorted_t
file
String holding the full path to the file holding the TAP output collected while running one test file at the given commit. The following example shows how that path is calculated. Given:
output directory (outputdir)    => '/tmp/DQBuT_SRAY/'
SHA (commit)                    => '2a2e54af709f17cc6186b42840549c46478b6467'
shortened SHA (commit_short)    => '2a2e54a'
test file (target->[$i])        => 't/44_func_hashes_mult_unsorted.t'
... the file is placed in the directory specified by outputdir. We then join commit_short (the shortened SHA), file_stub (the rewritten relative path) and the strings output and txt with a dot to yield this value for the file element:
2a2e54a.t_44_func_hashes_mult_unsorted_t.output.txt
md5_hex
String holding the return value of Devel::Git::MultiBisect::Auxiliary::hexdigest_one_file() run with the file designated by the file element as an argument. (More precisely, the file as modified by Devel::Git::MultiBisect::Auxiliary::clean_outputfile().)
Example:
[
  {
    commit => "2a2e54af709f17cc6186b42840549c46478b6467",
    commit_short => "2a2e54a",
    file => "/tmp/1mVnyd59ee/2a2e54a.t_44_func_hashes_mult_unsorted_t.output.txt",
    file_stub => "t_44_func_hashes_mult_unsorted_t",
    md5_hex => "31b7c93474e15a16d702da31989ab565",
  },
  {
    commit => "2a2e54af709f17cc6186b42840549c46478b6467",
    commit_short => "2a2e54a",
    file => "/tmp/1mVnyd59ee/2a2e54a.t_45_func_hashes_alt_dual_sorted_t.output.txt",
    file_stub => "t_45_func_hashes_alt_dual_sorted_t",
    md5_hex => "6ee767b9d2838e4bbe83be0749b841c1",
  },
]
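The calculation of the file element described above can be sketched as follows, using the values given in the description of that element. The variable names are chosen for illustration, and the use of File::Spec->catfile() to join directory and file name is an assumption; the module may assemble the path differently.

```perl
use strict;
use warnings;
use File::Spec;

my $outputdir = '/tmp/DQBuT_SRAY';
my $commit    = '2a2e54af709f17cc6186b42840549c46478b6467';
my $short     = 7;    # default length of the shortened SHA
my $target    = 't/44_func_hashes_mult_unsorted.t';

# Shorten the SHA and rewrite the relative path of the test file.
my $commit_short = substr($commit, 0, $short);
(my $file_stub = $target) =~ s{[./]}{_}g;

# Join the four components with dots, then place the result in outputdir.
my $basename = join '.', $commit_short, $file_stub, 'output', 'txt';
my $file     = File::Spec->catfile($outputdir, $basename);

print "$file\n";
```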
Comment
In this method's current implementation, we start with a git checkout from the repository at the specified commit. We configure (e.g., perl Makefile.PL) and build (e.g., make) the source code. We then test each of the test files we have targeted (e.g., prove -vb relative/path/to/test_file.t). We redirect both STDOUT and STDERR to outputfile, clean up the outputfile to remove the line containing timings (as that introduces unwanted variability in the md5_hex values) and compute the digest.
This implementation is very much subject to change.
If a true value for verbose has been passed to the constructor, the method prints Created [outputfile] to STDOUT before returning.
Note: While this method is publicly documented, in actual use you probably will not need to call it directly. Instead, you will probably use either Devel::Git::MultiBisect::AllCommits::run_test_files_on_all_commits() or Devel::Git::MultiBisect::Transitions::multisect_all_targets().
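The reason for removing the timings line before taking the digest can be illustrated with a short sketch. The function digest_without_timings() is hypothetical, and both the regular expression used to recognize the timings line and the choice of Digest::MD5 are assumptions made for this example; the module's own logic lives in Devel::Git::MultiBisect::Auxiliary::clean_outputfile() and hexdigest_one_file() and may differ.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: drop any line that looks like the summary line
# holding timings (an assumed pattern), then digest what remains.
sub digest_without_timings {
    my $output = shift;
    my @kept = grep { !/^Files=\d+, Tests=\d+/ } split /^/m, $output;
    return md5_hex(join '', @kept);
}

# Two runs with identical results but different timings.
my $run1 = "t/x.t .. ok\nFiles=1, Tests=3,  1 wallclock secs\nResult: PASS\n";
my $run2 = "t/x.t .. ok\nFiles=1, Tests=3,  2 wallclock secs\nResult: PASS\n";
# A run whose result genuinely differs.
my $run3 = "t/x.t .. Failed 1/3 subtests\nFiles=1, Tests=3,  1 wallclock secs\nResult: FAIL\n";

my ($d1, $d2, $d3) = map { digest_without_timings($_) } ($run1, $run2, $run3);

print "run1 and run2 ", ($d1 eq $d2 ? "match" : "differ"), "\n";
print "run1 and run3 ", ($d1 eq $d3 ? "match" : "differ"), "\n";
```

Without the cleaning step, the first two runs would yield different digests even though the test results are identical, and every commit would look like a transitional commit.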
get_timings()
Purpose
Get information on the time a multisection took to run.
Arguments
None; all data needed is already in the object.
Return Value
Hash reference. The selection of elements in this hashref will depend on which subclass of Devel::Git::MultiBisect you are using and may differ among subclasses. Example:
{ elapsed => 4297, mean => 186.83, runs => 23 }
In this example (taken from a run of one test file over 220 commits in Perl 5 blead), 23 runs were needed to achieve a result. These took 4297 seconds (approximately 71 minutes) with a mean run time of approximately 3 minutes each.
Method will return undefined value if timings are not yet available within the object.
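The figures in the example can be checked with a few lines of arithmetic. This sketch assumes that mean is simply elapsed divided by runs, which is consistent with the numbers shown but is not confirmed by this documentation; the hash is hard-coded here rather than obtained from get_timings().

```perl
use strict;
use warnings;

# Values copied from the example; in real use, check that the return
# value of get_timings() is defined before dereferencing it.
my $timings = { elapsed => 4297, runs => 23 };

my $mean    = sprintf "%.2f", $timings->{elapsed} / $timings->{runs};
my $minutes = sprintf "%.1f", $timings->{elapsed} / 60;

print "mean run time: $mean seconds\n";     # 186.83
print "total elapsed: $minutes minutes\n";  # 71.6
```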
SUPPORT
Please report any bugs by mail to bug-Devel-Git-MultiBisect@rt.cpan.org
or through the web interface at http://rt.cpan.org.
AUTHOR
James E. Keenan (jkeenan at cpan dot org). When sending correspondence, please include 'Devel::Git::MultiBisect' or 'Devel-Git-MultiBisect' in your subject line.
Creation date: November 16 2016. Last modification date: November 16 2016.
Development repository: https://github.com/jkeenan/devel-git-multibisect
ACKNOWLEDGEMENTS
Thanks to the following contributors and reviewers:
Smylers
For naming suggestion: http://www.nntp.perl.org/group/perl.module-authors/2016/10/msg10851.html
Ricardo Signes
For feedback during initial development.
Eily and Monk::Thomas
For diagnosis of regex problems in http://perlmonks.org/?node_id=1175983.
COPYRIGHT
Copyright (c) 2016 James E. Keenan. United States. All rights reserved. This is free software and may be distributed under the same terms as Perl itself.