NAME

gens.t - Unit tests for Test::LectroTest::Generator

SYNOPSIS

perl -Ilib t/gens.t

DESCRIPTION

Important: This test suite relies upon a number of randomized tests and statistical inferences. As a result, there is a small probability (about 1 in 200) that some part of the suite will fail even if everything is working properly. Therefore, if a test fails, re-run the test suite to determine whether the supposed problem is real or just a rare instance of the Fates poking fun at you.

This documentation is written mainly for programmers who maintain the test suite. If you are an end user of the LectroTest modules, you can stop reading now because otherwise you will be bored to tears.

Configuration

The $tsize variable determines how many trials to run during the collection of distribution statistics, mainly for the Int generator. The more trials you run, the smaller the deviations from the expected results you can detect. It is suggested that you do not change this value.

Fundamental tests

Here we sanity check that the fundamental object types can be created and that they have the right base class.

Generator tests

Here we test the generators. We perform the following tests.

Bool

The Bool distribution is really an Int distribution over the range [0,1]. Therefore, we make sure that it has a mean of 0.5.
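
As a rough illustration only (this is not the suite's actual code, and it assumes the :common export tag and a generate method that accepts sizing guidance, as described in the Test::LectroTest::Generator documentation), such a check might look like:

    use Test::LectroTest::Generator qw( :common );

    # Draw many Bool values and confirm that the sample mean is near 0.5.
    my $bool   = Bool;
    my $trials = 10_000;
    my $sum    = 0;
    $sum += $bool->generate($_) for 1 .. $trials;
    printf "Bool mean = %.3f (expect ~0.5)\n", $sum / $trials;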

Char

The Char distribution should return only the characters in the set we give it, and all of the characters in the set should be possible output values. First, we test to see that a trivial Char generator for a single character always returns that character.

Next, we make sure that a Char generator with a ten-character range generates all ten characters and does so with equal probability.

Next, we run a few tests to make sure that the parser for character-set specifications works. We try the following specifications: "a", "-", "a-a", "-a", "a-", "aA-C", and "A-Ca".
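
For flavor, a hypothetical spot check of a few such specifications (assuming the option is named charset and that generators expose a generate method) could be written as:

    use Test::LectroTest::Generator qw( :common );

    # Hypothetical spot checks: each generator should emit only characters
    # belonging to the set its specification describes.
    my %examples = (
        "a"    => qr/^a$/,         # single character
        "a-c"  => qr/^[a-c]$/,     # simple range
        "aA-C" => qr/^[aA-C]$/,    # literal plus range
        "A-Ca" => qr/^[A-Ca]$/,    # range plus literal
    );
    while ( my ( $spec, $ok ) = each %examples ) {
        my $c = Char( charset => $spec )->generate(10);
        print qq{charset "$spec" gave "$c": }, $c =~ $ok ? "ok" : "NOT ok", "\n";
    }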

Elements and OneOf

The Elements tests indirectly test OneOf, upon which the Elements generator is built. We ensure that the Elements distribution is complete and uniform.

We must also test the pre-flight check.
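
A rough sketch of both ideas, assuming the pre-flight check fires when the generator is constructed (not the suite's actual code):

    use Test::LectroTest::Generator qw( :common );

    # An Elements generator should emit only (and, over many draws, all of)
    # the values it was given.
    my @vals  = "a" .. "j";
    my $elems = Elements(@vals);
    my %seen;
    $seen{ $elems->generate($_) }++ for 1 .. 10_000;
    print scalar( keys %seen ), " of ", scalar(@vals), " values seen\n";

    # Pre-flight: an empty argument list should be rejected at construction.
    eval { Elements() };
    print "empty Elements rejected: ", ( $@ ? "yes" : "no" ), "\n";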

Float

The Float tests are modeled after the Int tests, but there are subtle differences in order to accommodate the differences between the underlying generators. In particular, Float has an (approximately) continuous distribution, whereas Int has a discrete distribution.

First, we test seven Float generators having ranges 201 wide and centered around -300, -200, ... 200, 300. The generators are unsized (sized=>0) and thus should have means at the range centers.
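
As an illustrative sketch of this kind of test (not the suite's code; the exact range widths here are arbitrary), an unsized Float generator over a symmetric range should have a sample mean near the center of that range:

    use Test::LectroTest::Generator qw( :common );

    for my $center ( -300, -200, -100, 0, 100, 200, 300 ) {
        my $g   = Float( range => [ $center - 100, $center + 100 ], sized => 0 );
        my $sum = 0;
        $sum += $g->generate($_) for 1 .. 10_000;
        printf "center %4d: sample mean %8.2f\n", $center, $sum / 10_000;
    }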

Second, we test five more Float generators having ranges [0,$span], where $span becomes increasingly large, finally equaling the configuration parameter $tsize. These generators are sized, so we expect the mean of their distributions to equal a weighted average of X1 and X2, where X1 is the mean of the equivalent unsized distribution and X2 is half the mean of the sizing guidance, taken over the guidance values for which sizing constrains the range.

Third, we repeat the above test, this time using balanced ranges [-$span,$span] for the same increasing progression of $span values. Because the range is balanced, as is the effect of sizing, the mean of the distributions must be zero.

Fourth, we run a series of unsized tests over 3-element ranges near zero. Because the ranges are so small, we expect that if there were off-by-one errors in the code, they would stand out here.

Fifth, we make sure that LectroTest prevents us from providing an empty range.

Sixth, we test the case where the generator is called without sizing guidance. In this case the full range is used.

Finally, we make sure that LectroTest prevents us from using a sized generator with a given range that does not contain zero.

Int

We must test Int hardcore because it is the generator upon which most others are built.

First, we test seven Int generators having ranges ten elements wide and centered around -3000, -2000, ... 2000, 3000. We ensure that each of the generators is complete and uniformly distributed.
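
A minimal sketch of such a completeness-and-uniformity check (hypothetical code, using an arbitrary ten-element range near 3000):

    use Test::LectroTest::Generator qw( :common );

    # A ten-element Int range should produce every element with roughly
    # equal frequency.
    my $g = Int( range => [ 2996, 3005 ], sized => 0 );
    my %counts;
    $counts{ $g->generate($_) }++ for 1 .. 10_000;
    printf "%d: %d\n", $_, $counts{$_} for sort { $a <=> $b } keys %counts;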

Second, we test seven more Int generators having ranges 201 elements wide and centered around -300, -200, ... 200, 300. The generators are unsized (sized=>0) and thus should have means at the range centers.

Third, we test five more Int generators having ranges [0,$span], where $span becomes increasingly large, finally equaling the configuration parameter $tsize. These generators are sized, so we expect the mean of their distributions to equal a weighted average of X1 and X2, where X1 is the mean of the equivalent unsized distribution and X2 is half the mean of the sizing guidance, taken over the guidance values for which sizing constrains the range.
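
Under one reading of that description, sizing guidance s clips the range [0,$span] to [0, min($span, s)], so the expected mean can be computed directly. The helper below is hypothetical (it is not the suite's code) and only illustrates the arithmetic:

    # For guidance s = 1 .. $tsize, a sized Int over [0, $span] behaves like
    # an unsized Int over [0, min($span, s)], whose mean is min($span, s) / 2.
    # Averaging over all guidance values gives the expected sample mean.
    sub expected_sized_mean {
        my ( $span, $tsize ) = @_;
        my $total = 0;
        $total += ( $_ < $span ? $_ : $span ) / 2 for 1 .. $tsize;
        return $total / $tsize;
    }

    # With $span equal to $tsize, sizing always constrains the range, and the
    # expected mean comes out to roughly $tsize / 4.
    print expected_sized_mean( 1000, 1000 ), "\n";   # prints 250.25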

Fourth, we repeat the above test, this time using balanced ranges [-$span,$span] for the same increasing progression of $span values. Because the range is balanced, as is the effect of sizing, the mean of the distributions must be zero.

Fifth, we run a series of unsized tests over 3-element ranges near zero. Because the ranges are so small, we expect that if there were off-by-one errors in the code, they would stand out here.

Sixth, we make sure that LectroTest prevents us from providing an empty range.

Seventh, we test the case where the generator is called without sizing guidance. In this case the full range is used.

Finally, we make sure that LectroTest prevents us from using a sized generator with a given range that does not contain zero.

Hash

Hash is a thin wrapper around List and so we need only a few Hash-specific tests to get good coverage.

Still, we need to test the pre-flight checks.
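
A sketch of the flavor of these tests, assuming Hash takes a key generator followed by a value generator and that its pre-flight check insists on exactly that pair (both assumptions, not statements about the suite's code):

    use Test::LectroTest::Generator qw( :common );

    # Generate a hash whose keys come from one generator and whose values
    # come from another.
    my $h    = Hash( Char( charset => "a-z" ), Int( range => [ 0, 9 ], sized => 0 ) );
    my $href = $h->generate(10);
    print scalar( keys %$href ), " keys generated\n";

    # Pre-flight: a single element generator should presumably be rejected.
    eval { Hash( Int ) };
    print "one-generator Hash rejected: ", ( $@ ? "yes" : "no" ), "\n";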

List

We consider four test cases to determine whether List respects its length modifier. First, we test the default list generation method, where list length is constrained only by the sizing guidance. For sizing guidance in [1..N], the expected mean generated list length is (1+N)/4.

Second, we test the length=>N variant. It should generate lists whose length always equals N.

Third, we test the length=>[M,] variant. For sizing guidance in [S..N], the expected mean of the distribution is given by the formula in the helper function clipped_triangle_mean(M,S,N). (Note that when M=0 this case is equivalent to the first case.)

Fourth, we test the length=>[M,N] variant. The expected mean generated list length is (M+N)/2, regardless of sizing guidance (which should be ignored in this case).
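
For orientation, a hypothetical side-by-side of the length-modifier forms described above (assuming List takes an element generator followed by options, and using Int's defaults for the elements):

    use Test::LectroTest::Generator qw( :common );

    my $default = List( Int );                      # length bounded by sizing guidance
    my $fixed   = List( Int, length => 5 );         # always exactly 5 elements
    my $atleast = List( Int, length => [ 2, ] );    # at least 2 elements
    my $bounded = List( Int, length => [ 2, 7 ] );  # between 2 and 7 elements

    for my $g ( $default, $fixed, $atleast, $bounded ) {
        my $list = $g->generate(10);
        print scalar(@$list), " elements\n";
    }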

Fifth, we check to see if List's pre-flight checks catch common problems.

String

We consider four test cases to determine whether String respects its length modifier. These test cases are nearly identical to the four cases for the List generator. Because String is built on List, these tests are mostly redundant. However, it is a good idea to have them anyway because they leave us free to change the implementation.

First, we test the default string generation method, where string length is constrained only by the sizing guidance. For sizing guidance in [1..N], the expected mean generated string length is (1+N)/4.

Second, we test the length=>N variant. It should generate strings whose length always equals N.

Third, we test the length=>[M,] variant. For sizing guidance in [S..N], the expected mean of the distribution is given by the formula in the helper function clipped_triangle_mean(M,S,N). (Note that when M=0, this test case is equivalent to the first.)

Fourth, we test the length=>[M,N] variant. The expected mean generated string length is (M+N)/2, regardless of sizing guidance (which should be ignored in this case).
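
A short hypothetical example of the String analogue (the charset option name is assumed here):

    use Test::LectroTest::Generator qw( :common );

    # String accepts the same length modifiers as List and draws its
    # characters from the given set.
    my $s = String( charset => "a-z", length => [ 3, 5 ] );
    print $s->generate(10), "\n" for 1 .. 3;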

Unit

The Unit generator is simple and always returns the same value. So we test it with three values: "a", 1, and 0.334.
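
A trivial sketch of that idea (hypothetical code, assuming the generate method):

    use Test::LectroTest::Generator qw( :common );

    # A Unit generator ignores sizing guidance and always returns the value
    # it was constructed with.
    for my $value ( "a", 1, 0.334 ) {
        my $u = Unit($value);
        print $u->generate($_), "\n" for 1 .. 3;
    }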

Combinator tests

Here we test the combinators. We perform the following tests.

Frequency

We provide three tests of the Frequency combinator. First, we make sure that when all of the frequencies are identical the resulting distribution is complete and uniform. In effect, Frequency behaves like Elements in this case.

Second, we test that the frequencies are actually respected. When a sub-generator has a zero frequency, it should never be selected. We test this by creating a "yes" generator with frequency 1 and a "no" generator with frequency 0. We make sure that the combined Frequency generator generates only "yes" values. We run two variants of this test, one for each ordering of the two sub-generators.
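
A sketch of the zero-frequency test (hypothetical code, assuming Frequency takes [weight, generator] pairs):

    use Test::LectroTest::Generator qw( :common :combinators );

    # A zero-frequency alternative should never be selected, regardless of
    # the order in which the pairs are given.
    my $yes_first = Frequency( [ 1, Unit("yes") ], [ 0, Unit("no") ] );
    my $no_first  = Frequency( [ 0, Unit("no") ],  [ 1, Unit("yes") ] );
    for my $g ( $yes_first, $no_first ) {
        my %seen;
        $seen{ $g->generate($_) }++ for 1 .. 1_000;
        print join( ",", sort keys %seen ), "\n";   # expect only "yes"
    }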

Third, we check to make sure the pre-flight checks catch bad arguments.

Paste

To test the Paste generator, we create six Unit generators that return, respectively, the values "a".."f". Then we combine them in two ways via Paste combinators. The first does not use glue and thus should always generate "abcdef". The second uses the glue "-" and thus should always generate "a-b-c-d-e-f".

We also test to see that Paste handles Lists properly. It should concatenate the elements of all Lists and then paste them together with the other arguments.
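
A rough sketch of both checks (hypothetical code; the expected outputs for the glued and unglued cases come from the description above):

    use Test::LectroTest::Generator qw( :common :combinators );

    # Six Unit generators pasted with and without glue.
    my @units = map { Unit($_) } "a" .. "f";
    print Paste(@units)->generate(1), "\n";                  # "abcdef"
    print Paste( @units, glue => "-" )->generate(1), "\n";   # "a-b-c-d-e-f"

    # A List's elements are folded into the pasted result as well.
    my $with_list = Paste( List( Unit("x"), length => 3 ), Unit("y") );
    print $with_list->generate(1), "\n";                      # "xxxy"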

Sized

We run two tests for the Sized combinator. First, we apply the constant-sizing combinator Sized{1} to a sized Int generator over the range [-1,100]. If the combinator works properly, the sizing guidance passed to the Int generator will always be one, effectively clipping its range to [-1,1]. Thus we test that the mean of the resulting distribution is 0.

Second, we apply a "size-halving" combinator Sized{$_[0]/2} to the same Int generator as before and draw values from the combined generator for sizing values ranging over [1..200]. We expect the mean of the distribution of generated values to equal (-1 + 100) / 4.
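
A sketch of the first of these tests (hypothetical code, using the Sized{...} block syntax from the description above):

    use Test::LectroTest::Generator qw( :common :combinators );

    # Constant guidance of 1 should clip a sized Int over [-1,100] to the
    # subrange [-1,1], giving a distribution whose mean is about 0.
    my $clipped = Sized { 1 } Int( range => [ -1, 100 ] );
    my $sum = 0;
    $sum += $clipped->generate($_) for 1 .. 10_000;
    printf "mean under Sized{1}: %.3f (expect ~0)\n", $sum / 10_000;

    # A size-halving wrapper passes along half the guidance it receives.
    my $halved = Sized { $_[0] / 2 } Int( range => [ -1, 100 ] );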

Each

The Each combinator is just a wrapper around List, so the tests for it are simple.

Apply

Apply, in turn, is built upon Each, so we just make sure that it gets its own additional functionality right.

Map

Map is also built upon Each. Again, we just make sure it adds the correct twist.

Concat

Testing Concat is straightforward. We just feed it a few list generators and make sure it returns the right thing.

Flatten

Testing Flatten is like Concat, except here we must make sure that the resulting list does not contain any other lists.

ConcatMap

Testing ConcatMap is like testing Concat and Map together. (Who would have guessed?)

FlattenMap

Can you see where this is going? FlattenMap is just like Flatten and Map, together as best friends.

Helper functions

The test suite relies upon a few helper functions.

sample_distribution_z_score

This function takes an expected mean and a set of data values. It analyzes the data set to determine its mean M and standard deviation. Then it computes a z-score for the hypothesis that M is equal to the expected mean. The return value is the z-score.
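
The following stand-alone sketch (a hypothetical re-implementation, not the suite's helper) shows the underlying calculation: the sample mean is compared against the expected mean in units of the sample's standard error:

    # Compute a z-score for the hypothesis that the sample mean of @values
    # equals $expected_mean, using the sample standard deviation.
    sub sample_z_score {
        my ( $expected_mean, @values ) = @_;
        my $n    = @values;
        my $mean = 0;
        $mean += $_ / $n for @values;
        my $var  = 0;
        $var += ( $_ - $mean ) ** 2 / ( $n - 1 ) for @values;
        my $std_err = sqrt( $var / $n );
        return 0 if $std_err == 0;
        return ( $mean - $expected_mean ) / $std_err;
    }

    # A fair sample of uniform(0,1) values should yield a small z-score
    # against the expected mean of 0.5.
    print sample_z_score( 0.5, map { rand } 1 .. 1_000 ), "\n";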

dist_mean_ok

This function is used to determine whether the mean of the distribution of values returned by a generator is equal to the expected mean. The generator is asked to generate one value for each element of the sizing guidance given. The resulting values are passed through the given $numerizer function to convert them into numbers (useful if you are testing a String or Char generator). The name you want to give the overall mean test should be passed in $name; it is forwarded to the Test::More cmp_ok function, which records the result of the test.

complete_and_uniform_ok

This function determines whether the given generator $g returns values that are uniformly distributed across the complete range of values it is supposed to cover. In order for this test to function properly the generator must be designed to select from among ten distinct values. (E.g., Int(range=>[0,9]) is fine but not Int(range=>[1,100]).) The test draws 10,000 output values from the generator and then ensures that all ten @$expected_values are represented in the output and that all ten were selected with equal probability. The result of the test is reported via the Test::More ok function.
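
A hypothetical version of the same check, with a crude fixed tolerance standing in for the suite's statistical test:

    use Test::LectroTest::Generator qw( :common );

    # Draw 10,000 values and verify that every expected value appears and
    # that no value's share strays far from the uniform 10% it should get.
    sub looks_complete_and_uniform {
        my ( $gen, $expected_values ) = @_;
        my %counts;
        $counts{ $gen->generate($_) }++ for 1 .. 10_000;
        for my $v (@$expected_values) {
            my $share = ( $counts{$v} || 0 ) / 10_000;
            return 0 if abs( $share - 0.10 ) > 0.02;   # crude tolerance
        }
        return 1;
    }

    print looks_complete_and_uniform( Int( range => [ 0, 9 ], sized => 0 ),
                                      [ 0 .. 9 ] ) ? "ok" : "not ok", "\n";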

AUTHOR

Tom Moertel (tom@moertel.com)

COPYRIGHT and LICENSE

Copyright (C) 2004 by Thomas G Moertel. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
