NAME

Test::Weaken - Test that freed references are, indeed, freed

SYNOPSIS

use Test::Weaken qw(leaks);
use Data::Dumper;
use Math::BigInt;
use Math::BigFloat;
use Carp;
use English qw( -no_match_vars );

my $good_test = sub {
    my $obj1 = new Math::BigInt('42');
    my $obj2 = new Math::BigFloat('7.11');
    [ $obj1, $obj2 ];
};

my $bad_test = sub {
    my $array = [ 42, 711 ];
    push @{$array}, $array;
    $array;
};

my $bad_destructor = sub {'I am useless'};

if ( !leaks($good_test) ) {
    print "No leaks in test 1\n" or croak("Cannot print to STDOUT: $ERRNO");
}
else {
    print "There were memory leaks from test 1!\n"
        or croak("Cannot print to STDOUT: $ERRNO");
}

my $test = Test::Weaken::leaks(
    {   constructor => $bad_test,
        destructor  => $bad_destructor,
    }
);
if ($test) {
    my $unfreed_proberefs = $test->unfreed_proberefs();
    my $unfreed_count     = @{$unfreed_proberefs};
    printf "Test 2: %d of %d original references were not freed\n",
        $test->unfreed_count(), $test->probe_count()
        or croak("Cannot print to STDOUT: $ERRNO");
    print "These are the probe references to the unfreed objects:\n"
        or croak("Cannot print to STDOUT: $ERRNO");
    for my $proberef ( @{$unfreed_proberefs} ) {
        print Data::Dumper->Dump( [$proberef], ['unfreed'] )
            or croak("Cannot print to STDOUT: $ERRNO");
    }
}

DESCRIPTION

A memory leak occurs when an object is destroyed but the memory that the object uses is not completely deallocated. Leaked memory is a useless overhead. Leaks can significantly impact system performance. They can also cause an application to abend due to lack of memory.

In Perl, circular references are a common cause of memory leaks. Circular references are allowed in Perl, but objects containing circular references will leak memory unless the programmer takes specific measures to prevent leaks. Preventive measures include weakening the references and arranging to break the reference cycle just before the object is destroyed.
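
As a minimal sketch of the problem and of one preventive measure, the following uses only the core Scalar::Util module. The Node package and the $destroyed counter are hypothetical names invented for this illustration; they are not part of Test::Weaken.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Count DESTROY calls so we can see whether a node was really freed.
my $destroyed = 0;
{
    package Node;    # hypothetical class, for illustration only
    sub new { my ($class) = @_; return bless {}, $class; }
    sub DESTROY { $destroyed++; return; }
}

# Leaky: parent and child hold strong references to each other.
{
    my $parent = Node->new();
    my $child  = Node->new();
    $parent->{child} = $child;
    $child->{parent} = $parent;
}
my $freed_with_strong_cycle = $destroyed;    # the cycle keeps both nodes alive

# Fixed: the back-reference is weakened, so it no longer keeps the parent alive.
$destroyed = 0;
{
    my $parent = Node->new();
    my $child  = Node->new();
    $parent->{child} = $child;
    $child->{parent} = $parent;
    weaken( $child->{parent} );
}
my $freed_with_weak_backref = $destroyed;    # both nodes are destroyed

print "strong cycle freed $freed_with_strong_cycle node(s), ",
    "weak back-reference freed $freed_with_weak_backref node(s)\n";
```

The first pair of nodes is only reclaimed when the interpreter exits, which is exactly the kind of leak Test::Weaken is meant to detect.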

It is easy to misdesign or misimplement a scheme for preventing memory leaks. Mistakes of this kind have been hard to detect in a test suite.

Test::Weaken exists to allow easy detection of unfreed memory objects. Test::Weaken allows you to examine the unfreed objects, even objects which are usually inaccessible. It performs this magic by creating a set of weakened probe references, as explained below.

Test::Weaken gets its test object from a closure. The closure should return a reference to the test object. This reference is called the test object reference.

Test::Weaken frees the test object, then looks to see if any memory that can be accessed from the test object reference was not actually deallocated. To determine which memory can be accessed from the test object reference, Test::Weaken follows arrays, hashes, weak references and strong references. It follows these recursively and to unlimited depth.

Test::Weaken deals gracefully with circular references. That's important, because a major purpose of Test::Weaken is to test schemes for circular references. To avoid infinite loops, Test::Weaken records all the memory objects it visits, and will not visit the same memory object twice.
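
The idea of recording visited objects can be sketched as follows. This is not Test::Weaken's actual code: count_reachable is a hypothetical helper, and it simply counts distinct referents rather than applying the tracking rules described later in this document.

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype refaddr);

# Hypothetical helper: count the distinct referents reachable from $root
# by following arrays, hashes and references to references.
sub count_reachable {
    my ($root) = @_;
    my %seen;    # referent address => number of times encountered
    my @queue = ($root);
    while (@queue) {
        my $ref  = shift @queue;
        my $type = reftype($ref);
        next if not defined $type;           # not a reference
        next if $seen{ refaddr($ref) }++;    # already visited, so cycles terminate
        if ( $type eq 'ARRAY' ) {
            push @queue, grep { ref } @{$ref};
        }
        elsif ( $type eq 'HASH' ) {
            push @queue, grep { ref } values %{$ref};
        }
        elsif ( $type eq 'REF' ) {
            push @queue, ${$ref};
        }
    }
    return scalar keys %seen;
}

my $array = [ 42, 711 ];
push @{$array}, $array;    # the array now contains a reference to itself
my $count = count_reachable($array);
print "visited $count distinct object(s)\n";
```

Without the %seen check, the self-referential array would be queued forever.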

Independent Objects and Tracked Objects

An object is called an independent memory object if it has independently allocated memory. For brevity, this document often refers to independent memory objects as independent objects.

Arrays, hashes, closures and variables are independent memory objects. References and constants which are not elements of arrays or hashes are also independent memory objects. Elements of arrays and hashes are never independent memory objects, because their memory is not independent -- it is always deallocated when the array or hash to which they belong is destroyed.

An independent object is called a tracked object if Test::Weaken tracks it with a probe reference. Tracked objects are always independent objects.

Followed Objects and External Objects

An object is called a followed object if Test::Weaken examines it during its recursive search for objects to track. Followed objects are not always independent objects. References are not independent objects when they are elements of arrays and hashes, but they are followed.

An object inside the test object is called an internal object. In the Test::Weaken context, the relevant criterion for deciding "inside" versus "outside" is the lifetime of an object. If an object's lifetime is expected to be the same as that of the test object, it is called an internal object. If an object's lifetime might be different from the lifetime of the test object, then it is called an external object. Since the question is one of expected lifetime, this difference is ultimately subjective.

Objects found recursively from the test object reference will usually be internal objects. This may not always be the case, however. Some objects found by Test::Weaken might be external to the test object. If external objects are found and they are persistent, they complicate matters.

An external object is called a persistent object if it is expected that the lifetime of the external object might extend beyond that of the test object. Persistent objects are not memory leaks. Persistent objects are objects not expected to be freed along with the test object. Leaked objects are objects which are expected to be freed with the test object, but which are not.

To determine which of the unfreed objects are memory leaks, the user must separate out the persistent objects from the other results. Ways to do this are outlined below.

Builtin Types

When it needs to classify object types precisely, this document will use the builtin type names as returned by Scalar::Util's reftype subroutine and Perl's ref built-in. Both reftype and ref take a reference as their argument. They both return the type of the referent. For example, given a reference to a number or a string, reftype and ref return "SCALAR". If the argument is a reference to a reference, reftype and ref return "REF".

There are differences between reftype and ref. For an object blessed into a package, ref returns the package name, rather than the builtin type. Also, ref describes the builtin type of compiled regular expressions as "Regexp", while reftype describes it as being of the same builtin type as a number or a string: "SCALAR". In this document, the list of builtin types is considered to be as follows: SCALAR, ARRAY, HASH, CODE, REF, GLOB, LVALUE, FORMAT, IO, VSTRING, and Regexp.
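
A short sketch of the cases where the two agree and of the blessed-object case where they differ. My_Object is a hypothetical package name used only for this illustration. Compiled regular expressions are left out because what reftype reports for them depends on the Perl version.

```perl
use strict;
use warnings;
use Scalar::Util qw(reftype);

my $number     = 42;
my $scalar_ref = \$number;        # reference to a number
my $ref_ref    = \$scalar_ref;    # reference to a reference
my $object     = bless {}, 'My_Object';    # hypothetical package name

# Each row: description, what ref says, what reftype says.
my @report = (
    [ 'reference to a number',    ref($scalar_ref), reftype($scalar_ref) ],
    [ 'reference to a reference', ref($ref_ref),    reftype($ref_ref) ],
    [ 'blessed hash',             ref($object),     reftype($object) ],
);

printf "%-24s ref=%-10s reftype=%s\n", @{$_} for @report;
```

Only the last row differs: ref reports the package name, while reftype still reports the underlying builtin type, HASH.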

ARRAY and HASH Objects

Objects of builtin type ARRAY and HASH are always both tracked and followed.

REF Objects

Independent memory objects of builtin type REF are always both tracked and followed. Objects of type REF which are elements of an array or a hash are followed, but are not tracked.

CODE Objects

Objects of type CODE are tracked but are not followed. This can be seen as a limitation, because closures hold references to memory objects. Future versions of Test::Weaken may follow CODE objects.

SCALAR, VSTRING and Regexp Objects

Independent objects of builtin types SCALAR, VSTRING and Regexp are tracked. Objects of type SCALAR, VSTRING and Regexp are independent if and only if they are not array or hash elements. SCALAR, VSTRING and Regexp objects are not followed because there is nothing to follow -- they do not hold references to other objects.

Array and Hash Elements

Elements of arrays and hashes are never tracked, because they are not independent memory objects. If they are REF objects, they are followed.

Objects That are Ignored

An object is said to be ignored if it is neither tracked nor followed. All objects of builtin types GLOB, IO, FORMAT and LVALUE are ignored. All array and hash elements which are not of builtin type REF are ignored.

Ignoring GLOB, IO and FORMAT objects saves trouble. These objects will almost always be external. GLOB objects refer to an entry in the Perl symbol table, which is external. Objects of builtin type IO are typically associated with GLOB objects. FORMAT objects are always global. Use of FORMAT objects is officially deprecated.

An LVALUE object could only be present in the test object through a reference. I have not seen LVALUE reference programming deprecated anywhere. Possibly nobody has found it worth his breath to do so. LVALUE references are rare. Here's what one looks like:

\pos($string)

Another reason that the user might be just as happy not to have FORMAT, IO and LVALUE references reported in the results is that Data::Dumper does not handle them gracefully. Data::Dumper issues a cryptic warning when it encounters a reference to a FORMAT, IO or LVALUE object.

Future implementations of Perl may define builtin types not known as of this writing. Objects which do not fall into any of the types described above will not be tracked or followed.

Why the Test Object is Passed via a Closure

Test::Weaken does not accept test objects or references to them as arguments. Instead, Test::Weaken receives its test objects indirectly, from test object constructors.

Why so roundabout? Because the indirect way is the easiest. When you create the test object in Test::Weaken's calling environment, it takes a lot of craft to avoid leaving unintended references to the test object in that calling environment. It is easy to get this wrong.

If the calling environment retains a reference to an object inside the test object, the result appears as a memory leak. In other words, mistakes in setting up the test object create memory leaks which are artifacts of the test environment. These artifacts are very difficult to sort out from the real thing.

The easiest way to avoid leaving unintended references to memory inside the test object is to work entirely within a closure. Under this strategy, the test object is created entirely in the closure, using only objects local to that closure. Memory objects local to a closure will be destroyed when the closure returns, and any references they held will be released. The closure-local strategy makes it relatively easy to be sure that nothing is left behind that will hold an unintended reference to memory inside the test object.

To help the user to follow the closure-local strategy, Test::Weaken requires that its test object reference be the return value of a closure. The closure-local strategy is safe. It is almost always the right thing to do. Test::Weaken makes it the easy thing to do.

Nothing prevents a user from using a test object constructor that refers to data in global or other scopes. Nothing prevents a test object constructor from returning a reference to a test object created from data in any scope the user desires. Subverting the closure-local strategy takes little effort, certainly by comparison to the great amount of trouble that the user is exposing herself to.
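
The difference between the two kinds of constructor can be sketched with a single weakened probe. The survives helper and the $stray variable are hypothetical and exist only for this illustration; Test::Weaken itself probes far more thoroughly.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Hypothetical one-probe check: returns 1 if the constructor's object
# is still alive after its test object reference has been undefined.
sub survives {
    my ($constructor) = @_;
    my $test_object_ref = $constructor->();
    my $probe           = $test_object_ref;
    weaken($probe);    # the probe must not keep the object alive
    undef $test_object_ref;
    return defined $probe ? 1 : 0;
}

my $stray;    # lives in the calling environment, outside any constructor

# Closure-local: everything used to build the object is local to the closure.
my $closure_local = sub {
    my $object = { name => 'test' };
    return $object;
};

# Not closure-local: an unintended reference is left behind in $stray.
my $not_closure_local = sub {
    my $object = { name => 'test' };
    $stray = $object;
    return $object;
};

my $clean_result = survives($closure_local);
my $stray_result = survives($not_closure_local);
print "closure-local survives: $clean_result, with stray reference: $stray_result\n";
```

The second constructor produces what looks like a leak, but it is an artifact of the stray reference in the calling environment.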

Returns and Exceptions

The methods of Test::Weaken do not return errors. Errors are always thrown as exceptions.

PORCELAIN METHODS

leaks

use Test::Weaken;
use Carp;
use English qw( -no_match_vars );

my $test = Test::Weaken::leaks(
    {   constructor => sub { new Buggy_Object },
        destructor  => \&destroy_buggy_object,
    }
);
if ($test) {
    print "There are leaks\n" or croak("Cannot print to STDOUT: $ERRNO");
}

Arguments to the leaks static method may be passed as a reference to a hash of named arguments, or directly as code references. leaks returns a Test::Weaken object if it found unfreed objects, and a Perl false value otherwise. Users who only want to know if there were unfreed objects can test the return value of leaks for Perl true or false.

constructor

The test object constructor is a required argument. It must be a code reference. If passed directly, it must be the first argument to leaks. Otherwise, it must be the value of the constructor named argument.

The test object constructor should build the test object and return a reference to it. It is best to follow strictly the closure-local strategy, as described above.

destructor

The test object destructor is an optional argument. If specified, it must be a code reference. If passed directly, it must be the second argument to leaks. Otherwise, it must be the value of the destructor named argument.

If specified, the test object destructor is called just before the test object reference is undefined. It will be passed one argument, the test object reference. The return value of the test object destructor is ignored.

Some objects which are of interest as test objects require a destructor to be called when they are freed. For example, some objects created by Gtk2-Perl are of this type. The primary purpose for the test object destructor is to enable Test::Weaken to work with these objects.

unfreed_proberefs

use Test::Weaken;
use Data::Dumper;
use Carp;
use English qw( -no_match_vars );

my $test = Test::Weaken::leaks( sub { new Buggy_Object } );
if ($test) {
    my $unfreed_proberefs = $test->unfreed_proberefs();
    my $unfreed_count     = @{$unfreed_proberefs};
    printf "%d of %d references were not freed\n",
        $test->unfreed_count(), $test->probe_count()
        or croak("Cannot print to STDOUT: $ERRNO");
    print "These are the probe references to the unfreed objects:\n"
        or croak("Cannot print to STDOUT: $ERRNO");
    for my $proberef ( @{$unfreed_proberefs} ) {
        print Data::Dumper->Dump( [$proberef], ['unfreed'] )
            or croak("Cannot print to STDOUT: $ERRNO");
    }
}

Returns a reference to an array of probe references to the unfreed objects. Typically, this data can be examined to pinpoint the source of a leak. A user may also analyze this data to produce her own statistics about unfreed objects.

The array is returned as a reference because in some applications it can be quite long. The array contains the probe references to the unfreed independent memory objects.

The array contains probe references rather than the objects themselves, because it is not always possible to copy the independent objects into the array. Arrays and hashes cannot be copied into individual array elements -- references to them are the best that can be done.

Even when copying is possible, it can destroy important information. Weak references are strengthened when they are copied. The original address of the copied object may be important for identifying it, and the copy will have a different address.
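
Both effects can be observed with Scalar::Util alone. This is only a sketch, and the variable names are invented for the illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak refaddr);

my $object   = { name => 'test' };
my $weak_ref = $object;
weaken($weak_ref);

# Plain assignment copies the reference, and the copy is a strong reference.
my $copy = $weak_ref;

my $original_is_weak = isweak($weak_ref) ? 1 : 0;
my $copy_is_weak     = isweak($copy)     ? 1 : 0;

# A shallow copy of the referent is a new memory object at a different address.
my $clone        = { %{$object} };
my $same_address = refaddr($clone) == refaddr($object) ? 1 : 0;

print "original weak: $original_is_weak, copy weak: $copy_is_weak, ",
    "clone at same address: $same_address\n";
```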

unfreed_count

use Test::Weaken;
use Carp;
use English qw( -no_match_vars );

my $test = Test::Weaken::leaks( sub { new Buggy_Object } );
next TEST if not $test;
printf "%d objects were not freed\n", $test->unfreed_count(),
    or croak("Cannot print to STDOUT: $ERRNO");

Returns the count of unfreed objects. This count will be exactly the length of the array referred to by the return value of the unfreed_proberefs method.

probe_count

use Test::Weaken;
use Carp;
use English qw( -no_match_vars );

my $test = Test::Weaken::leaks(
    {   constructor => sub { new Buggy_Object },
        destructor  => \&destroy_buggy_object,
    }
);
next TEST if not $test;
printf "%d of %d objects were not freed\n",
    $test->unfreed_count(), $test->probe_count()
    or croak("Cannot print to STDOUT: $ERRNO");

Returns the total number of probe references in the test, including references to freed objects. This is the count of probe references after Test::Weaken finished following the test object reference recursively, but before Test::Weaken called the test object destructor and undefined the test object reference.

weak_probe_count

use Test::Weaken;
use Scalar::Util qw(isweak);
use Carp;
use English qw( -no_match_vars );

my $test = Test::Weaken::leaks( sub { new Buggy_Object }, );
next TEST if not $test;
my $weak_unfreed_reference_count =
    scalar grep { ref $_ eq 'REF' and isweak( ${$_} ) }
    @{ $test->unfreed_proberefs() };
printf "%d of %d weak references were not freed\n",
    $weak_unfreed_reference_count, $test->weak_probe_count(),
    or croak("Cannot print to STDOUT: $ERRNO");

Returns the number of probe references which were to weak references. The count includes probe references to freed weak references. The count is made after Test::Weaken finishes following the test object reference recursively, but before Test::Weaken calls the test object destructor and undefines the test object reference.

strong_probe_count

use Test::Weaken;
use Carp;
use English qw( -no_match_vars );
use Scalar::Util qw(isweak);

my $test = Test::Weaken::leaks(
    {   constructor => sub { new Buggy_Object },
        destructor  => \&destroy_buggy_object,
    }
);
next TEST if not $test;
my $proberefs = $test->unfreed_proberefs();
my $strong_unfreed_object_count =
    grep { ref $_ ne 'REF' or not isweak( ${$_} ) } @{$proberefs};
my $strong_unfreed_reference_count =
    grep { ref $_ eq 'REF' and not isweak( ${$_} ) } @{$proberefs};

printf "%d of %d strong objects were not freed\n",
    $strong_unfreed_object_count, $test->strong_probe_count(),
    or croak("Cannot print to STDOUT: $ERRNO");
printf "%d of the unfreed strong objects were references\n",
    $strong_unfreed_reference_count
    or croak("Cannot print to STDOUT: $ERRNO");

Returns the number of probe references which were to strong objects. Here "strong object" means any object which is not a weak reference. This includes not just strong REF objects, but all objects which are not of REF type.

The count includes probes to both freed and unfreed objects. The count is taken after Test::Weaken finishes following the test object reference recursively, but before Test::Weaken calls the test object destructor and undefines the test object reference.

PLUMBING METHODS

Most users can skip this section. The plumbing methods exist to satisfy object-oriented purists, and to accommodate the rare user who wants to access the probe counts even when the test did not find any unfreed objects.

new

use Test::Weaken;
use Scalar::Util qw(isweak);
use Carp;
use English qw( -no_match_vars );

my $test = new Test::Weaken( sub { new My_Object } );
printf "There were %s leaks\n", $test->test()
    or croak("Cannot print to STDOUT: $ERRNO");
my $proberefs            = $test->unfreed_proberefs();
my $unfreed_count        = 0;
my $weak_unfreed_count   = 0;
my $strong_unfreed_count = 0;
PROBEREF: for my $proberef ( @{$proberefs} ) {
    $unfreed_count++;
    if ( ref $proberef eq 'REF' and isweak( ${$proberef} ) ) {
        $weak_unfreed_count++;
    }
    else {
        $strong_unfreed_count++;
    }
}
printf "%d of %d objects were not freed\n", $unfreed_count, $test->probe_count()
    or croak("Cannot print to STDOUT: $ERRNO");
printf "%d of %d weak references were not freed\n", $weak_unfreed_count,
    $test->weak_probe_count()
    or croak("Cannot print to STDOUT: $ERRNO");
printf "%d of %d other objects were not freed\n", $strong_unfreed_count,
    $test->strong_probe_count()
    or croak("Cannot print to STDOUT: $ERRNO");

The new method takes the same arguments as the leaks method, described above. Unlike the leaks method, it always returns a test object. Errors are thrown as exceptions.

The test object returned by new will only have been initialized. The only thing that can be done with this test object is to call the test method on it. Until the test method is called, no results will be available, and calling any method which asks for a result will cause an exception.

test

use Test::Weaken;
use Carp;
use English qw( -no_match_vars );

my $test = new Test::Weaken(
    {   constructor => sub { new My_Object },
        destructor  => \&destroy_my_object,
    }
);
printf "There are %s\n", ( $test->test() ? 'leaks' : 'no leaks' )
    or croak("Cannot print to STDOUT: $ERRNO");

The test method should only be called on a Test::Weaken object just returned from the new constructor. It causes the test specified in the constructor to be run and the results to be obtained. Calling any other method except test on a Test::Weaken object returned by the new constructor, but not yet evaluated with the test method, will produce an exception.

The test method returns the count of unfreed objects. This will be identical to the length of the array returned by unfreed_proberefs and the count returned by unfreed_count. If there is an error, the test method throws an exception.

ADVANCED TECHNIQUES

Tracing Leaks

The unfreed_proberefs method returns a reference to an array of probe references to the unfreed independent memory objects. This can be used to find the source of leaks. If circumstances allow it, you might find it useful to add "tag" elements to arrays and hashes for tracking purposes.

You can quasi-uniquely identify memory objects using the referent addresses of the probe references. A referent address can be determined by using the refaddr function of Scalar::Util. You can also obtain the referent address of a reference by adding zero to the reference.

I call referent addresses "quasi-unique" because they are only unique at a specific point in time. Once an object is freed, its address can be reused. This is an unusual corner case, but it can bite you if you're not careful. Absent other evidence, an object with the same referent address as an object examined earlier is not 100% certain to be the same object.

To be sure an earlier object and a later object with the same address are actually the same object, you need to know that the earlier object will be persistent, or to compare the two objects. If you want to be really pedantic, even an exact match in a comparison doesn't settle the issue. It is possible that two indiscernible (that is, completely identical) objects with the same referent address are different in the following sense: the first object might have been destroyed and a second, identical, object created at the same address. But for most practical programming purposes, two indiscernible objects can be regarded as the same object.

Note that in other Perl documentation, the term "reference address" is often used when a referent address is meant. Any given reference has both a reference address and a referent address. The reference address is the reference's own location in memory. The referent address is the address of the memory object to which it refers. It is the referent address that interests us here and, happily, it is the referent address that addition of zero and refaddr return.
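
A short sketch of the distinction, using an unblessed array so that no overloading can interfere with the addition of zero. The variable names are invented for the illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $object = ['payload'];
my $ref    = $object;    # a second reference to the same array

# Referent address: where the array lives.
my $referent_address = refaddr($ref);
my $by_adding_zero   = $ref + 0;

# Reference address: where the variable $ref itself lives.
my $reference_address = refaddr( \$ref );

my $methods_agree     = $referent_address == $by_adding_zero   ? 1 : 0;
my $same_referent     = refaddr($object)  == refaddr($ref)     ? 1 : 0;
my $addresses_differ  = $reference_address != $referent_address ? 1 : 0;

printf "referent address %d, reference address %d\n",
    $referent_address, $reference_address;
```

For blessed objects whose class overloads numeric conversion, adding zero may not yield an address at all, so refaddr is the safer choice there.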

Sometimes, when you are interested in why an object is not being freed, you want to seek out the reference that keeps the object's refcount above zero. Kevin Ryde reports that Devel::FindRef can be useful for this.

EXPORTS

By default, Test::Weaken exports nothing. Optionally, leaks may be exported.

IMPLEMENTATION

Test::Weaken first recurses through the test object. Starting from the test object reference, it follows and tracks objects recursively, as described above. The test object is explored to unlimited depth, looking for independent memory objects to track. Independent objects visited during the recursion are recorded, and no object is visited twice. For each independent memory object, a probe reference is created.

Once recursion through the test object is complete, the probe references are weakened. This prevents the probe references from interfering with the normal deallocation of memory. Next, the test object destructor is called, if there is one.

Finally, the test object reference is undefined. This should trigger the deallocation of all memory held by the test object. To check that this happened, Test::Weaken dereferences the probe references. If the referent of a probe reference was deallocated, the value of that probe reference will be undef. If a probe reference is still defined at this point, it refers to an unfreed independent object.
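
The sequence of steps can be sketched as follows. This is an illustration under simplifying assumptions, not Test::Weaken's actual code: count_unfreed is a hypothetical helper, and instead of recursing it probes only the top-level hash and the references stored directly in it.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

# Hypothetical stand-in for the steps described above.
sub count_unfreed {
    my ( $constructor, $destructor ) = @_;
    my $test_object_ref = $constructor->();

    # Create probe references (top-level hash and its direct reference values only).
    my @probes = ( $test_object_ref, grep { ref } values %{$test_object_ref} );

    # Weaken the probes so they do not interfere with deallocation.
    weaken($_) for @probes;

    # Call the destructor, if there is one, then undefine the test object reference.
    $destructor->($test_object_ref) if defined $destructor;
    undef $test_object_ref;

    # Any probe that is still defined refers to an unfreed object.
    return scalar grep { defined } @probes;
}

# Builds a parent and child that refer to each other strongly.
my $constructor = sub {
    my $parent = { name => 'parent' };
    my $child  = { name => 'child', parent => $parent };
    $parent->{child} = $child;
    return $parent;
};

# Breaks the cycle just before the test object is released.
my $cycle_breaker = sub {
    my ($parent) = @_;
    delete $parent->{child}{parent};
    return;
};

my $unfreed_without_destructor = count_unfreed($constructor);
my $unfreed_with_destructor    = count_unfreed( $constructor, $cycle_breaker );
print "unfreed without destructor: $unfreed_without_destructor, ",
    "with destructor: $unfreed_with_destructor\n";
```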

AUTHOR

Jeffrey Kegler

BUGS

Please report any bugs or feature requests to bug-test-weaken at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Test-Weaken. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

perldoc Test::Weaken

SEE ALSO

Potential users will want to compare Test::Memory::Cycle and Devel::Cycle, which examine existing structures non-destructively. Devel::Leak also covers similar ground, although it requires Perl to be compiled with -DDEBUGGING in order to work. Devel::Cycle looks inside closures if PadWalker is present, a feature Test::Weaken does not have at present.

ACKNOWLEDGEMENTS

Thanks to jettero, Juerd and perrin of Perlmonks for their advice. Thanks to Lincoln Stein (developer of Devel::Cycle) for test cases and other ideas.

After the first release of Test::Weaken, Kevin Ryde made several important suggestions and provided test cases. These provided the impetus for version 2.000000.

LICENSE AND COPYRIGHT

Copyright 2007-2009 Jeffrey Kegler, all rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl 5.10.