NAME

Test::Weaken - Test that freed memory objects were, indeed, freed

SYNOPSIS

use Test::Weaken qw(leaks);
use Data::Dumper;
use Math::BigInt;
use Math::BigFloat;
use Carp;
use English qw( -no_match_vars );

my $good_test = sub {
    my $obj1 = new Math::BigInt('42');
    my $obj2 = new Math::BigFloat('7.11');
    [ $obj1, $obj2 ];
};

if ( !leaks($good_test) ) {
    print "No leaks in test 1\n" or croak("Cannot print to STDOUT: $ERRNO");
}
else {
    print "There were memory leaks from test 1!\n"
        or croak("Cannot print to STDOUT: $ERRNO");
}

my $bad_test = sub {
    my $array = [ 42, 711 ];
    push @{$array}, $array;
    $array;
};

my $bad_destructor = sub {'I am useless'};

my $tester = Test::Weaken::leaks(
    {   constructor => $bad_test,
        destructor  => $bad_destructor,
    }
);
if ($tester) {
    my $unfreed_proberefs = $tester->unfreed_proberefs();
    my $unfreed_count     = @{$unfreed_proberefs};
    printf "Test 2: %d of %d original references were not freed\n",
        $tester->unfreed_count(), $tester->probe_count()
        or croak("Cannot print to STDOUT: $ERRNO");
    print "These are the probe references to the unfreed objects:\n"
        or croak("Cannot print to STDOUT: $ERRNO");
    for my $proberef ( @{$unfreed_proberefs} ) {
        print Data::Dumper->Dump( [$proberef], ['unfreed'] )
            or croak("Cannot print to STDOUT: $ERRNO");
    }
}

DESCRIPTION

A memory leak occurs when an object is destroyed but the memory that the object uses is not completely deallocated. Leaked memory is a useless overhead. Leaks can significantly impact system performance. They can also cause an application to abend due to lack of memory.

In Perl, circular references are a common cause of memory leaks. Circular references are allowed in Perl, but objects containing circular references will leak memory unless the programmer takes specific measures to prevent leaks. Preventive measures include weakening the references and arranging to break the reference cycle just before the object is destroyed.
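The effect of weakening a back-reference can be seen without Test::Weaken at all. The following is a minimal sketch using only Scalar::Util; the parent/child layout and the build_and_release name are illustrative assumptions, not part of this module.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Build a parent/child pair. The child points back at its parent, which
# forms a cycle. If $weaken_back_ref is true, the back-reference is weakened.
# A weak probe reference shows whether the parent survived the scope exit.
sub build_and_release {
    my ($weaken_back_ref) = @_;
    my $probe;
    {
        my $parent = { name => 'parent' };
        my $child  = { name => 'child', parent => $parent };
        $parent->{child} = $child;
        weaken( $child->{parent} ) if $weaken_back_ref;
        $probe = $parent;
        weaken($probe);    # the probe must not keep the parent alive
    }    # $parent and $child go out of scope here
    return defined($probe) ? 'leaked' : 'freed';
}

print 'strong back-reference: ', build_and_release(0), "\n";
print 'weak back-reference:   ', build_and_release(1), "\n";
```

With a strong back-reference the two hashes keep each other alive after the scope ends. With a weak one, the parent's reference count drops to zero and both hashes are freed.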

It is easy to misdesign or misimplement a scheme for preventing memory leaks. Mistakes of this kind have been hard to detect in a test suite.

Test::Weaken allows easy detection of unfreed memory objects. Test::Weaken allows you to examine the unfreed objects, even objects which are usually inaccessible. It performs this magic by creating a set of weakened probe references, as explained below.

Test::Weaken gets its test object from a closure. The closure should return a reference to the test object. This reference is called the test object reference.

Test::Weaken frees the test object, then looks to see if any memory that can be accessed from the test object reference was not actually deallocated. To determine which memory can be accessed from the test object reference, Test::Weaken follows arrays, hashes, weak references, and strong references. It follows these recursively and to unlimited depth.

Test::Weaken deals gracefully with circular references. That's important, because a major purpose of Test::Weaken is to test schemes for circular references. To avoid infinite loops, Test::Weaken records all the memory objects it visits, and will not visit the same memory object twice.

Tracked Objects

An object is called an independent memory object if it has independently allocated memory. For brevity, this document often refers to independent memory objects as independent objects.

Arrays, hashes, closures, and variables are independent memory objects. References and constants which are not elements of arrays or hashes are also independent memory objects. Elements of arrays and hashes are never independent memory objects, because their memory is not independent -- it is always deallocated when the array or hash to which the elements belong is destroyed.

An independent object is called a tracked object if Test::Weaken tracks it with a probe reference. Tracked objects are always independent objects.

Followed Objects

An object is called a followed object if Test::Weaken examines it during its recursive search for objects to track. Followed objects are not always independent objects. References are not independent objects when they are elements of arrays and hashes, but they are followed.

An object inside the test object is called an internal object. In the Test::Weaken context, the relevant criterion for deciding "inside" versus "outside" is the lifetime of an object. If an object's lifetime is expected to be the same as that of the test object, it is called an internal object. If an object's lifetime might be different from the lifetime of the test object, then it is called an external object. Since the question is one of expected lifetime, this difference is ultimately subjective.

Objects found recursively from the test object reference will usually be internal objects. This may not always be the case, however. Some objects found by Test::Weaken might be external to the test object. If external objects are found and they are persistent, they complicate matters.

An external object is called a persistent object if it is expected that the lifetime of the external object might extend beyond that of the test object. Persistent objects are not memory leaks. With a persistent object, it is not expected that freeing the test object will always free the persistent object. With a memory leak, when the test object was freed, the leaked object was expected to be freed along with it, and this expectation was disappointed.

To determine which of the unfreed objects are memory leaks, the user must separate out the persistent objects from the other results. Ways to do this are outlined below.

Builtin Types

Builtin types are the type names returned by Scalar::Util's reftype subroutine. Scalar::Util::reftype differs from Perl's ref function. If an object was blessed into a package, ref returns the package name, while reftype returns the original builtin type of the object.
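The difference between ref and reftype is easiest to see on a blessed object. This short sketch uses only Scalar::Util; the class name My::Class is a made-up placeholder.

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype);

# A hash blessed into a (hypothetical) package.
my $object = bless {}, 'My::Class';
printf "ref: %s, reftype: %s\n", ref($object), reftype($object);

# Builtin types of a few unblessed references.
my @samples = (
    [ 'array reference'          => [] ],
    [ 'hash reference'           => {} ],
    [ 'code reference'           => sub { return 1 } ],
    [ 'reference to a scalar'    => \1 ],
    [ 'reference to a reference' => \\1 ],
    [ 'reference to a glob'      => \*STDOUT ],
);
for my $sample (@samples) {
    my ( $description, $reference ) = @{$sample};
    printf "%-26s %s\n", $description, reftype($reference);
}
```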

ARRAY and HASH Objects

Objects of builtin type ARRAY and HASH are always both tracked and followed.

REF Objects

Independent memory objects of builtin type REF are always both tracked and followed. Objects of type REF which are elements of an array or a hash are followed, but are not tracked.

CODE Objects

Objects of type CODE are tracked but are not followed. This can be seen as a limitation, because closures hold references to memory objects. Future versions of Test::Weaken may follow CODE objects.

SCALAR and VSTRING Objects

Independent objects of builtin types SCALAR and VSTRING are tracked. Objects of type SCALAR and VSTRING are independent if and only if they are not array or hash elements. SCALAR and VSTRING objects are not followed because there is nothing to follow -- they do not hold references to other objects.

Array and Hash Elements

Elements of arrays and hashes are never tracked, because they are not independent memory objects. If they are REF objects, they are followed.

Objects That are Ignored

An object is said to be ignored if it is neither tracked nor followed. All objects of builtin types GLOB, IO, FORMAT and LVALUE are ignored. All array and hash elements which are not of builtin type REF are ignored.

Ignoring GLOB, IO and FORMAT objects saves trouble. These objects will almost always be external. GLOB objects refer to an entry in the Perl symbol table, which is external. Objects of builtin type IO are typically associated with GLOB objects. FORMAT objects are always global. Use of FORMAT objects is officially deprecated.

An LVALUE object could only be present in the test object through a reference. I have not seen LVALUE reference programming deprecated anywhere. Possibly nobody has found it worth his breath to do so. LVALUE references are rare. Here's what one looks like:

\pos($string)

There is another reason that the user might be just as happy not to have FORMAT, IO and LVALUE references reported in the results. Data::Dumper does not handle them gracefully. Data::Dumper issues a cryptic warning when it encounters a reference to a FORMAT, IO or LVALUE object.

Future implementations of Perl may define builtin types not known as of this writing. Objects which do not fall into any of the types described above will not be tracked or followed.

Why the Test Object is Passed via a Closure

Test::Weaken gets its test object indirectly, as the return value from a test object constructor. Why so roundabout?

Because the indirect way is the easiest. When you create the test object in Test::Weaken's calling environment, it takes a lot of craft to avoid leaving unintended references to the test object in that calling environment. It is easy to get this wrong.

When the calling environment retains a reference to an object inside the test object, the result usually appears as a memory leak. In other words, mistakes in setting up the test object create memory leaks which are artifacts of the test environment. These artifacts are very difficult to sort out from the real thing.

The easiest way to avoid leaving unintended references to memory inside the test object is to work entirely within a closure, using only objects local to that closure. Memory objects local to a closure will be destroyed when the closure returns, and any references they held will be released. The closure-local strategy makes it relatively easy to be sure that nothing is left behind that will hold an unintended reference to memory inside the test object.
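The contrast between a closure-local constructor and one that leans on the calling environment can be sketched with a hand-rolled weak probe. This is only an illustration of the idea using Scalar::Util; the helper freed_after_release and both constructors are hypothetical names, and the helper is far simpler than what Test::Weaken does.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Returns true if the object built by $constructor is freed as soon as
# the only strong reference held here is dropped.
sub freed_after_release {
    my ($constructor) = @_;
    my $object = $constructor->();
    my $probe  = $object;    # second reference to the same referent
    weaken($probe);          # the probe no longer keeps it alive
    undef $object;           # drop our strong reference
    return !defined $probe;
}

# Closure-local: every name that refers to the data lives inside the sub.
my $closure_local = sub {
    my $data = { name => 'local' };
    return $data;
};

# Not closure-local: the calling environment keeps its own reference.
my $kept_outside = { name => 'outer' };
my $uses_outer   = sub { return $kept_outside };

printf "closure-local object freed: %s\n",
    freed_after_release($closure_local) ? 'yes' : 'no';
printf "outer-scope object freed:   %s\n",
    freed_after_release($uses_outer) ? 'yes' : 'no';
```

The second constructor produces what looks like a leak, but the "leak" is an artifact of the reference that the test environment itself still holds.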

To help the user to follow the closure-local strategy, Test::Weaken requires that its test object reference be the return value of a closure. The closure-local strategy is safe. It is almost always the right thing to do. Test::Weaken makes it the easy thing to do.

Nothing prevents a user from using a test object constructor that refers to data in global or other scopes. Nothing prevents a test object constructor from returning a reference to a test object created from data in any scope the user desires. Subverting the closure-local strategy takes little effort, certainly by comparison to the great amount of trouble that the user is exposing herself to.

Returns and Exceptions

The methods of Test::Weaken do not return errors. Errors are always thrown as exceptions.

PORCELAIN METHODS

leaks

use Test::Weaken;
use Carp;
use English qw( -no_match_vars );

my $tester = Test::Weaken::leaks(
    {   constructor => sub { new Buggy_Object },
        destructor  => \&destroy_buggy_object,
    }
);
if ($tester) {
    print "There are leaks\n" or croak("Cannot print to STDOUT: $ERRNO");
}

Returns a Perl false if no unfreed memory objects were detected. If unfreed memory objects were detected, returns an evaluated Test::Weaken class object.

Test::Weaken class objects, for brevity, are called testers. An evaluated tester is one on which the tests have been run, and for which results are available.

Users who only want to know if there were unfreed objects can test the return value of leaks for Perl true or false. Arguments to the leaks static method may be passed as a reference to a hash of named arguments, or directly as code references.

constructor

The test object constructor is a required argument. It must be a code reference. When the arguments are passed directly as code references, the test object constructor must be the first argument to leaks. When named arguments are used, the test object constructor must be the value of the constructor named argument.

The test object constructor should build the test object and return a reference to it. It is best to follow strictly the closure-local strategy, as described above.

destructor

The test object destructor is an optional argument. If specified, it must be a code reference. When the arguments are passed directly as code references, the test object destructor is the second, optional, argument to leaks. When named arguments are used, the test object destructor must be the value of the destructor named argument.

If specified, the test object destructor is called just before the test object reference is undefined. It will be passed one argument, the test object reference. The return value of the test object destructor is ignored.

Some test objects require a destructor to be called when they are freed. The primary purpose for the test object destructor is to enable Test::Weaken to work with these objects.
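As an illustration, here is a sketch that passes the constructor and destructor directly as code references, the positional form described above. The test object (two hashes that refer to each other) and its destructor are invented for this example, and the sketch assumes Test::Weaken is installed.

```perl
use strict;
use warnings;
use Test::Weaken;

# Hypothetical test object: two hashes that refer to each other.
my $constructor = sub {
    my $parent = { name => 'parent' };
    my $child  = { name => 'child', parent => $parent };
    $parent->{child} = $child;
    return $parent;
};

# Hypothetical destructor: breaks the cycle so both hashes can be freed.
# It receives the test object reference as its only argument.
my $destructor = sub {
    my ($parent) = @_;
    delete $parent->{child}{parent};
    return;
};

my $without = Test::Weaken::leaks($constructor);
my $with    = Test::Weaken::leaks( $constructor, $destructor );

print 'without destructor: ', ( $without ? 'leaks' : 'no leaks' ), "\n";
print 'with destructor:    ', ( $with    ? 'leaks' : 'no leaks' ), "\n";
```

Without the destructor the cycle keeps both hashes alive after the test object reference is undefined. With it, the cycle is broken first and nothing should remain.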

ignore

sub ignore_my_global {
    my ($thing) = @_;
    return ( Scalar::Util::blessed($thing) && $thing->isa('MyGlobal') );
}

my $tester = Test::Weaken::leaks(
    {   constructor => sub { MyObject->new },
        ignore      => \&ignore_my_global,
    }
);

The ignore argument is optional. It can be used to prevent Test::Weaken from following and tracking selected probe references, as chosen by the user. Use of the ignore argument should be avoided when possible. Filtering the probe references after the fact, once unfreed_proberefs has returned them, is easier, safer and faster. The ignore argument is provided for situations where filtering after the fact is not practical. One such situation is when large or complicated sub-objects need to be filtered out of the results.

When specified, the value of the ignore argument must be a reference to a callback subroutine. The subroutine will be called once for each probe reference, with that probe reference as the only argument. Everything that is referred to, directly or indirectly, by this probe reference should be left unchanged by the ignore callback. The result of modifying the probe referents might be an exception, an abend, an infinite loop, or erroneous results.

The callback subroutine should return Perl true if the probe reference is to an object that should be ignored -- that is, neither followed nor tracked. Otherwise the callback subroutine should return a Perl false.

For safety, Test::Weaken does not pass the original probe reference to the ignore callback. Instead, Test::Weaken passes a copy of the probe reference. This prevents the user altering the probe reference itself. The object referred to by the probe reference is not copied. It is still the original object. If that is altered, all bets are off.

ignore callbacks are best kept simple. Defer as much of the analysis as you can until after the test is completed. ignore callbacks can also be a significant overhead. The ignore callback is invoked once per probe reference.

Test::Weaken offers some help in debugging ignore callback subroutines. See below.
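Filtering after the fact, the alternative recommended above, can be as small as a grep over the probe references. The sketch below is an assumption-laden illustration: without_persistent is a hypothetical helper, MyGlobal and MyObject are placeholder class names, and the hand-built array stands in for the array reference that unfreed_proberefs would return.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

# Hypothetical helper: drop probe references whose referents are objects
# of a class that is known to be persistent.
sub without_persistent {
    my ( $proberefs, $persistent_class ) = @_;
    my @kept =
        grep { !( blessed($_) && $_->isa($persistent_class) ) }
        @{$proberefs};
    return \@kept;
}

# Stand-in for the array reference returned by unfreed_proberefs().
my $unfreed = [
    bless( {}, 'MyGlobal' ),
    [ 1, 2 ],
    bless( {}, 'MyObject' ),
];

my $suspects = without_persistent( $unfreed, 'MyGlobal' );
printf "%d of %d unfreed objects remain after filtering\n",
    scalar @{$suspects}, scalar @{$unfreed};
```

Because the filter runs after the test has finished, a mistake in it cannot disturb the test object the way a faulty ignore callback can.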

unfreed_proberefs

use Test::Weaken;
use Data::Dumper;
use Carp;
use English qw( -no_match_vars );

my $tester = Test::Weaken::leaks( sub { new Buggy_Object } );
if ($tester) {
    my $unfreed_proberefs = $tester->unfreed_proberefs();
    my $unfreed_count     = @{$unfreed_proberefs};
    printf "%d of %d references were not freed\n",
        $tester->unfreed_count(), $tester->probe_count()
        or croak("Cannot print to STDOUT: $ERRNO");
    print "These are the probe references to the unfreed objects:\n"
        or croak("Cannot print to STDOUT: $ERRNO");
    for my $proberef ( @{$unfreed_proberefs} ) {
        print Data::Dumper->Dump( [$proberef], ['unfreed'] )
            or croak("Cannot print to STDOUT: $ERRNO");
    }
}

Returns a reference to an array of probe references to the unfreed objects. Throws an exception if there is a problem, for example if the Test::Weaken object has not yet been evaluated.

Often, this data is examined to pinpoint the source of a leak. A user may also analyze this data to produce her own statistics about unfreed objects.

The array is returned as a reference because in some applications it can be quite long. The array contains the probe references to the unfreed independent memory objects.

The array contains probe references rather than the objects themselves, because it is not always possible to copy the independent objects directly into the array. Arrays and hashes cannot be copied into individual array elements -- references to them are the best that can be done.

Even when copying is possible, it destroys important information. The original address of the copied object may be important for identifying it, and the copy will have a different address. And weak references are strengthened when they are copied.
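Both effects of copying can be checked directly with Scalar::Util. This is a small standalone sketch; the variable names are illustrative.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak refaddr);

my $object = [ 1, 2, 3 ];

# A weak reference, and a plain copy of that weak reference.
my $weak = $object;
weaken($weak);
my $copy_of_weak = $weak;    # the copy is a strong reference

printf "original reference is weak: %s\n", isweak($weak)         ? 'yes' : 'no';
printf "copied reference is weak:   %s\n", isweak($copy_of_weak) ? 'yes' : 'no';

# Copying the array itself creates a new object at a different address.
my @array_copy = @{$object};
printf "copy has the same address:  %s\n",
    refaddr($object) == refaddr( \@array_copy ) ? 'yes' : 'no';
```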

unfreed_count

use Test::Weaken;
use Carp;
use English qw( -no_match_vars );

my $tester = Test::Weaken::leaks( sub { new Buggy_Object } );
next TEST if not $tester;
printf "%d objects were not freed\n", $tester->unfreed_count()
    or croak("Cannot print to STDOUT: $ERRNO");

Returns the count of unfreed objects. This count will be exactly the length of the array referred to by the return value of the unfreed_proberefs method. Throws an exception if there is a problem, for example if the Test::Weaken object has not yet been evaluated.

probe_count

use Test::Weaken;
use Carp;
use English qw( -no_match_vars );

my $tester = Test::Weaken::leaks(
    {   constructor => sub { new Buggy_Object },
        destructor  => \&destroy_buggy_object,
    }
);
next TEST if not $tester;
printf "%d of %d objects were not freed\n",
    $tester->unfreed_count(), $tester->probe_count()
    or croak("Cannot print to STDOUT: $ERRNO");

Returns the total number of probe references in the test, including references to freed objects. This is the count of probe references after Test::Weaken was finished following the test object reference recursively, but before Test::Weaken called the test object destructor and undefined the test object reference. Throws an exception if there is a problem, for example if the Test::Weaken object has not yet been evaluated.

PLUMBING METHODS

Most users can skip this section. The plumbing methods exist to satisfy object-oriented purists, and to accommodate the rare user who wants to access the probe counts even when the test did not find any unfreed objects.

new

use Test::Weaken;
use Carp;
use English qw( -no_match_vars );

my $tester        = new Test::Weaken( sub { new My_Object } );
my $unfreed_count = $tester->test();
my $proberefs     = $tester->unfreed_proberefs();
printf "%d of %d objects were not freed\n",
    $unfreed_count,
    $tester->probe_count()
    or croak("Cannot print to STDOUT: $ERRNO");

The new method takes the same arguments as the leaks method, described above. Unlike the leaks method, it always returns an unevaluated tester. An unevaluated tester is one on which the test has not yet been run and for which results are not yet available. If there are any problems, the new method throws an exception.

The test method is the only method which can be called successfully on an unevaluated tester. Calling any other method on an unevaluated tester causes an exception to be thrown.

test

use Test::Weaken;
use Carp;
use English qw( -no_match_vars );

my $tester = new Test::Weaken(
    {   constructor => sub { new My_Object },
        destructor  => \&destroy_my_object,
    }
);
printf "There are %s\n", ( $tester->test() ? 'leaks' : 'no leaks' )
    or croak("Cannot print to STDOUT: $ERRNO");

Converts an unevaluated tester into an evaluated tester. It does this by performing the test specified by the arguments to the new constructor and recording the results. Throws an exception if there is a problem, for example if the tester had already been evaluated.

The test method returns the count of unfreed objects. This will be identical to the length of the array returned by unfreed_proberefs and the count returned by unfreed_count.

ADVANCED TECHNIQUES

Tracing Leaks

The unfreed_proberefs method returns a reference to an array of probe references to the unfreed independent memory objects. This can be used to find the source of leaks. If circumstances allow it, you might find it useful to add "tag" elements to arrays and hashes to aid in identifying the source of a leak.

You can quasi-uniquely identify memory objects using the referent addresses of the probe references. A referent address can be determined by using the refaddr function of Scalar::Util. You can also obtain the referent address of a reference by adding zero to the reference.

Note that in other Perl documentation, the term "reference address" is often used when a referent address is meant. Any given reference has both a reference address and a referent address. The reference address is the reference's own location in memory. The referent address is the address of the memory object to which the reference refers. It is the referent address that interests us here and, happily, it is the referent address that both zero addition and refaddr return.
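The two ways of obtaining a referent address, and the distinction from the reference address, can be sketched as follows. The variable names are illustrative. Note that the zero-addition trick assumes the referent is not blessed into a class that overloads numeric conversion or addition; refaddr does not have that limitation.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $array     = [ 42, 711 ];
my $reference = $array;

# Referent address: where the array itself lives.
my $referent_address = refaddr($reference);
my $by_zero_addition = $reference + 0;

# Reference address: where the scalar holding the reference lives.
my $reference_address = refaddr( \$reference );

printf "referent address (refaddr):       0x%x\n", $referent_address;
printf "referent address (zero addition): 0x%x\n", $by_zero_addition;
printf "reference address:                0x%x\n", $reference_address;
```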

Sometimes, when you are interested in why an object is not being freed, you want to seek out the reference that keeps the object's refcount above zero. Kevin Ryde reports that Devel::FindRef can be useful for this.

Quasi-unique addresses and Indiscernible Objects

I call referent addresses "quasi-unique", because they are only unique at a specific point in time. Once an object is freed, its address can be reused. Absent other evidence, an object with a given referent address is not 100% certain to be the same object as the object which had the same address earlier. This can bite you if you're not careful.

To be sure an earlier object and a later object with the same address are actually the same object, you need to know that the earlier object will be persistent, or to compare the two objects. If you want to be really pedantic, even an exact match from a comparison doesn't settle the issue. It is possible that two indiscernible (that is, completely identical) objects with the same referent address are different in the following sense: the first object might have been destroyed and a second, identical, object created at the same address. But for most practical programming purposes, two indiscernible objects can be regarded as the same object.

Debugging Ignore Subroutines

It can be hard to determine if ignore callback subroutines are inadvertently modifying the test object. The Test::Weaken::check_ignore static method is provided to make this task easier.

$tester = Test::Weaken::leaks(
    {   constructor => sub { MyObject->new },
        ignore => Test::Weaken::check_ignore( \&ignore_my_global ),
    }
);
$tester = Test::Weaken::leaks(
    {   constructor => sub { DeepObject->new },
        ignore      => Test::Weaken::check_ignore(
            \&cause_deep_problem, 99, 0, $reporting_depth
        ),
    }
);

Test::Weaken::check_ignore is a static method which constructs a debugging wrapper from four arguments, three of which are optional. The first argument must be the ignore callback which you are trying to debug. This callback is called the test subject, or lab rat.

The second, optional, argument is the maximum error count. Below this count, errors are reported as warnings using carp. When the maximum error count is reached, an exception is thrown using croak. The maximum error count, if defined, must be a number greater than or equal to 0. By default the maximum error count is 1, which means that the first error will be thrown as an exception.

If the maximum error count is 0, all errors will be reported as warnings and no exception will ever be thrown. Infinite loops are a common behavior of buggy lab rats, and setting the maximum error count to 0 will usually not be something you want to do.

The third, optional, argument is the compare depth. It is the depth to which the probe referents will be checked, as described below. It must be a number greater than or equal to zero. If the compare depth is zero, the probe referent is checked to unlimited depth. By default the compare depth is 0.

The fourth, optional, argument is the reporting depth. It is the depth to which the probe referents are dumped in check_ignore's error messages. It must be a number greater than or equal to -1. If the reporting depth is zero, the object is dumped to unlimited depth. If the reporting depth is -1, there is no dump in the error message. By default, the reporting depth is -1.

Test::Weaken::check_ignore returns a reference to the wrapper callback. If no problems are detected, the wrapper callback behaves exactly like the lab rat callback, except that the wrapper is slower.

To discover when and if the lab rat callback is altering its arguments, Test::Weaken::check_ignore compares the test object before the lab rat is called, to the test object after the lab rat returns. Test::Weaken::check_ignore compares the before and after test objects in two ways. First, it dumps the contents of each test object using Data::Dumper. For comparison purposes, the dump using Data::Dumper is performed with Maxdepth set to the compare depth as described above. Second, if the immediate probe referent has builtin type REF, Test::Weaken::check_ignore determines whether the immediate probe referent is a weak reference or a strong one.

If either comparison shows a difference, the wrapper treats it as a problem, and produces an error message. This error message is either a carp warning or a croak exception, depending on the number of error messages already reported and the setting of the maximum error count. If the reporting depth is a non-negative number, the error message includes a dump from Data::Dumper of the test object. Data::Dumper's Maxdepth for reporting purposes is the reporting depth as described above.
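The dump-and-compare idea can be sketched in a few lines. What follows is not the module's own check_ignore: wrap_with_check is a hypothetical, much reduced wrapper that performs only the Data::Dumper comparison, always warns with carp, and omits the weak-versus-strong check, the error count and the reporting depth.

```perl
use strict;
use warnings;
use Carp;
use Data::Dumper;

# Hypothetical wrapper: dump the probe referent before and after the
# callback runs and warn if the two dumps differ.
sub wrap_with_check {
    my ( $callback, $compare_depth ) = @_;
    $compare_depth = 0 if not defined $compare_depth;    # 0 means unlimited
    return sub {
        my ($proberef) = @_;
        my $dump = sub {
            local $Data::Dumper::Maxdepth = $compare_depth;
            local $Data::Dumper::Sortkeys = 1;    # stable hash key order
            return Data::Dumper->Dump( [$proberef], ['proberef'] );
        };
        my $before = $dump->();
        my $result = $callback->($proberef);
        my $after  = $dump->();
        carp('Probe referent changed by ignore callback')
            if $before ne $after;
        return $result;
    };
}

# A well-behaved callback and a buggy one that modifies its argument.
my $clean = sub { return 0 };
my $buggy = sub { my ($ref) = @_; $ref->{touched} = 1; return 0 };

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };
wrap_with_check($clean)->( { a => 1 } );
wrap_with_check($buggy)->( { a => 1 } );
printf "%d warning(s) raised\n", scalar @warnings;
```

Only the buggy callback triggers a warning, because only it changes what the probe reference points to.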

A user who wants other features, such as deep checking of the test object for strengthened references, can easily modify Test::Weaken::check_ignore. Test::Weaken::check_ignore is a static method which does not use any Test::Weaken package resources. It is easy to copy it from the Test::Weaken source and hack it up. The hacked version can reside anywhere, and does not need to be part of the Test::Weaken package.

EXPORTS

By default, Test::Weaken exports nothing. Optionally, leaks may be exported.

IMPLEMENTATION

Test::Weaken first recurses through the test object. Starting from the test object reference, it follows and tracks objects recursively, as described above. The test object is explored to unlimited depth, looking for independent memory objects to track. Independent objects visited during the recursion are recorded, and no object is visited twice. For each independent memory object, a probe reference is created.

Once recursion through the test object is complete, the probe references are weakened. This prevents the probe references from interfering with the normal deallocation of memory. Next, the test object destructor is called, if there is one.

Finally, the test object reference is undefined. This should trigger the deallocation of all memory held by the test object. To check that this happened, Test::Weaken dereferences the probe references. If the referent of a probe reference was deallocated, the value of that probe reference will be undef. If a probe reference is still defined at this point, it refers to an unfreed independent object.
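The sequence described in this section can be sketched as a toy implementation. This is a deliberately reduced illustration, not the module's code: count_unfreed is a hypothetical name, it tracks and follows only arrays and hashes, and it ignores destructors, REF, CODE and SCALAR objects.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken reftype refaddr);

# Toy version of the probe technique: find arrays and hashes reachable
# from the test object, weaken a probe to each, release the test object,
# and count the probes that are still defined.
sub count_unfreed {
    my ($constructor) = @_;
    my $test_object = $constructor->();

    my @probes;
    my %seen;                        # referent addresses already visited
    my @queue = ($test_object);
    while ( defined( my $ref = shift @queue ) ) {
        my $type = reftype($ref);
        next if not defined $type;
        next if $type ne 'ARRAY' and $type ne 'HASH';
        next if $seen{ refaddr($ref) }++;    # never visit an object twice
        push @probes, $ref;
        push @queue,
            grep { ref($_) } ( $type eq 'ARRAY' ? @{$ref} : values %{$ref} );
    }

    weaken($_) for @probes;    # probes must not keep anything alive
    undef $test_object;        # should free the whole structure

    my $unfreed = grep { defined $_ } @probes;
    return $unfreed;
}

my $cyclic = sub {
    my $array = [ 42, 711 ];
    push @{$array}, $array;    # the array refers to itself
    return $array;
};
my $plain = sub { return [ [1], { key => [2] } ] };

printf "cyclic structure: %d unfreed\n", count_unfreed($cyclic);
printf "plain structure:  %d unfreed\n", count_unfreed($plain);
```

The self-referential array survives because its own element still holds a strong reference to it, so its probe stays defined. Every array and hash in the plain structure is freed.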

AUTHOR

Jeffrey Kegler

BUGS

Please report any bugs or feature requests to bug-test-weaken at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Test-Weaken. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

perldoc Test::Weaken


SEE ALSO

Potential users will want to compare Test::Memory::Cycle and Devel::Cycle, which examine existing structures non-destructively. Devel::Leak also covers similar ground, although it requires Perl to be compiled with -DDEBUGGING in order to work. Devel::Cycle looks inside closures if PadWalker is present, a feature Test::Weaken does not have at present.

ACKNOWLEDGEMENTS

Thanks to jettero, Juerd and perrin of Perlmonks for their advice. Thanks to Lincoln Stein (developer of Devel::Cycle) for test cases and other ideas.

After the first release of Test::Weaken, Kevin Ryde made several important suggestions and provided test cases. These provided the impetus for version 2.000000.

LICENSE AND COPYRIGHT

Copyright 2007-2009 Jeffrey Kegler, all rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl 5.10.