NAME
List::Compare - Simple object-oriented implementation of standard Perl code for comparing elements of two lists
VERSION
This document refers to version 0.11 of List::Compare. This version was released June 20, 2002.
SYNOPSIS
Create a List::Compare object. Put the two lists into arrays and pass references to the arrays to the constructor.
@Alist = qw(alpha beta beta gamma delta epsilon);
@Blist = qw(gamma delta delta epsilon zeta eta);
$cl = List::Compare->new(\@Alist, \@Blist);
Get those items which appear only in the first list by using either of the following (get_Aonly is just an alias for get_unique):
@Aonly = $cl->get_unique;
@Aonly = $cl->get_Aonly;
Get those items which appear only in the second list by using either of the following (get_Bonly is just an alias for get_complement):
@Bonly = $cl->get_complement;
@Bonly = $cl->get_Bonly;
Get those items which appear in both lists (their intersection):
@intersection = $cl->get_intersection;
Get those items which appear in either list (their union):
@union = $cl->get_union;
Get those items which appear in either the first or the second list, but not both, by using any of the following (get_symdiff and get_AorBonly are just aliases for get_symmetric_difference):
@AorBonly = $cl->get_symmetric_difference;
@AorBonly = $cl->get_symdiff;
@AorBonly = $cl->get_AorBonly;
Make a bag of all those items in both lists. The bag differs from the union of the two lists in that it holds as many copies of individual elements as appear in the original lists:
@bag = $cl->get_bag;
Return a true value if A is a subset of B:
$AB = $cl->is_AsubsetB;
Return a true value if B is a subset of A:
$BA = $cl->is_BsubsetA;
Return current List::Compare version number:
$vers = $cl->get_version;
DESCRIPTION
List::Compare is a simple, object-oriented implementation of very common Perl code (see "History, References and Development" below) used to determine interesting relationships between two lists at a time. A List::Compare object is created and automatically computes the values needed to supply List::Compare methods with appropriate results. In the current implementation, List::Compare methods return new lists containing the items found in each list alone, the items found in either list but not both (symmetric difference), the intersection and union of the two lists, and the "bag" made up of both lists without eliminating duplicates, as well as Boolean values indicating whether one list is a subset of the other.
In its current implementation List::Compare, with one exception, generates its results by means of hash look-up tables. Hence, multiple instances of an element in a given list count only once with respect to computing the intersection, union, etc. of the two lists. Only when we use get_bag to compute a bag holding the two lists do we store duplicate values.
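The hash look-up approach described above can be sketched in plain Perl. This is a minimal illustration of the general technique using only core Perl, not the module's actual source. The variable names (%seenA, %seenB and so on) and the sorting of every result, including the bag, are assumptions made for readability.

```perl
use strict;
use warnings;

# Sample data copied from the SYNOPSIS.
my @Alist = qw(alpha beta beta gamma delta epsilon);
my @Blist = qw(gamma delta delta epsilon zeta eta);

# Look-up tables: hash keys are unique, so duplicates collapse here.
my %seenA = map { $_ => 1 } @Alist;
my %seenB = map { $_ => 1 } @Blist;

# Items found in one list only.
my @Aonly = sort { $a cmp $b } grep { !exists $seenB{$_} } keys %seenA;
my @Bonly = sort { $a cmp $b } grep { !exists $seenA{$_} } keys %seenB;

# Items found in both lists, and items found in either list.
my @intersection = sort { $a cmp $b } grep { exists $seenB{$_} } keys %seenA;
my %either       = (%seenA, %seenB);
my @union        = sort { $a cmp $b } keys %either;

# Symmetric difference: items in exactly one of the two lists.
my @symdiff = sort { $a cmp $b } (@Aonly, @Bonly);

# The bag bypasses the hashes, so duplicates survive (sorting is an assumption).
my @bag = sort { $a cmp $b } (@Alist, @Blist);

# A is a subset of B when nothing is unique to A, and vice versa.
my $is_AsubsetB = @Aonly ? 0 : 1;
my $is_BsubsetA = @Bonly ? 0 : 1;

print "Aonly:        @Aonly\n";
print "Bonly:        @Bonly\n";
print "intersection: @intersection\n";
print "union:        @union\n";
print "symdiff:      @symdiff\n";
print "bag:          @bag\n";
```

With the sample lists, beta and delta each appear once in the union but keep all their copies in the bag, which holds 12 items against the union's 7.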
ASSUMPTIONS AND QUALIFICATIONS
The program was created with Perl 5.6. Using h2xs to prepare the module's template inserted the statement require 5.005_62; at the top of the module. In a future release the author will try to make it more backward compatible so that, inter alia, it can run on older versions of MacPerl. As is, the module has been successfully installed on Linux (RedHat 7.2, Perl 5.6.0), Windows (ActivePerl 5.6.1) and Cygwin Perl.
HISTORY, REFERENCES AND DEVELOPMENT
The Code Itself
List::Compare is based on code presented by Tom Christiansen & Nathan Torkington in Perl Cookbook http://www.oreilly.com/catalog/cookbook/ (a.k.a. the 'Ram' book), O'Reilly & Associates, 1998, Recipes 4.7 and 4.8. Similar code is presented in the Camel book: Programming Perl, by Larry Wall, Tom Christiansen, Jon Orwant. http://www.oreilly.com/catalog/pperl3/, 3rd ed, O'Reilly & Associates, 2000. The list comparison code is so basic and Perlish that I suspect it may have been written by Larry himself at the dawn of Perl time. All I've done is to put it in an object-oriented framework. That framework, not surprisingly, is taken mostly from Damian Conway's Object Oriented Perl http://www.manning.com/Conway/index.html, Manning Publications, 2000. The get_bag() method was inspired by Jarkko Hietaniemi's Set::Bag module and Daniel Berger's Set::Array module, both available on CPAN.
The Inspiration
I realized the usefulness of putting the list comparison code into a module while preparing an introductory-level Perl course given at the New School University's Computer Instruction Center in April-May 2002. I was comparing lists left and right. When I found myself writing very similar functions in different scripts, I knew a module was lurking somewhere. Inspiration: "Repeated Code is a Mistake" http://www.perl.com/pub/a/2000/11/repair3.html -- a 2001 talk by Mark-Jason Dominus http://perl.plover.com/ to the New York Perlmongers http://ny.pm.org/. The first public presentation of this module took place at Perl Seminar New York http://groups.yahoo.com/group/perlsemny on May 21, 2002. Comments and suggestions have been provided by Glenn Maciag, Josh Rabinowitz, Terrence Brannon and Dave Cross.
If You Like List::Compare, You'll Love ...
While preparing this module for distribution via CPAN, I had occasion to study a number of other modules already available on CPAN. Each of these modules is more sophisticated than List::Compare -- which is not surprising since all that List::Compare originally aspired to do was to avoid typing Cookbook code repeatedly. Here is a brief description of the features of these modules.
Algorithm::Diff - Compute 'intelligent' differences between two files/lists (http://search.cpan.org/doc/NEDKONZ/Algorithm-Diff-1.15/lib/Algorithm/Diff.pm)
Algorithm::Diff is a sophisticated module originally written by Mark-Jason Dominus and now maintained by Ned Konz. Think of the Unix diff utility and you're on the right track. Algorithm::Diff exports functions such as diff, which "computes the smallest set of additions and deletions necessary to turn the first sequence into the second, and returns a description of these changes." Algorithm::Diff is mainly concerned with the sequence of elements within two lists. It does not export functions for intersection, union, subset status, etc.
Array::Compare - Perl extension for comparing arrays (http://search.cpan.org/doc/DAVECROSS/Array-Compare-1.03/Compare.pm)
Array::Compare, by Dave Cross, asks whether two arrays are the same or different by doing a join on each array with a separator character and comparing the resulting strings. Like List::Compare, it is an object-oriented module. A sophisticated feature of Array::Compare is that it allows the user to specify how 'whitespace' in an array (an element which is undefined, the empty string, or whitespace within an element) should be evaluated for the purpose of determining equality or difference. It does not directly provide methods for intersection and union.
List::Util - A selection of general-utility list subroutines (http://search.cpan.org/doc/GBARR/Scalar-List-Utils-1.0701/lib/List/Util.pm)
List::Util, by Graham Barr, exports a variety of simple, useful functions for operating on one list at a time. The min function returns the lowest numerical value in a list; the max function returns the highest value; and so forth. List::Compare differs from List::Util in that it is object-oriented and that it works on two lists at a time rather than just one -- but it aims to be as simple and useful as List::Util.
Lists::Util (http://search.cpan.org/doc/TBONE/List-Utils-0.01/Utils.pm), by Terrence Brannon, provides methods which extend List::Util's functionality.
Quantum::Superpositions (http://search.cpan.org/doc/DCONWAY/Quantum-Superpositions-1.03/lib/Quantum/Superpositions.pm), by Damian Conway, is useful if, in addition to comparing lists, you need to emulate quantum supercomputing as well. Not for the eigen-challenged.
Set::Scalar - basic set operations (http://search.cpan.org/doc/JHI/Set-Scalar-1.17/lib/Set/Scalar.pm)
Set::Bag - bag (multiset) class (http://search.cpan.org/doc/JHI/Set-Bag-1.007/lib/Set/Bag.pm)
Both of these modules are by Jarkko Hietaniemi <jhi@iki.fi>. Set::Scalar has methods to return the intersection, union, difference and symmetric difference of two sets, as well as methods to return items unique to a first set and complementary to it in a second set. It has methods for reporting considerably more variants on subset status than does List::Compare.
Set::Bag enables one to deal more flexibly with the situation in which one has more than one instance of an element in a list.
Set::Array - Arrays as objects with lots of handy methods (including set comparisons) and support for method chaining. (http://search.cpan.org/doc/DJBERG/Set-Array-0.08/Array.pm)
Set::Array, by Daniel Berger <djberg96@hotmail.com>, "aims to provide built-in methods for operations that people are always asking how to do, and which already exist in languages like Ruby." Among the many methods in this module are some for intersection, union, etc. To install Set::Array, you must first install the Want module, also available on CPAN.
To Do
Possible future lines of development include:
Benchmark the module against lists with vastly greater numbers of elements than the lists which the author encounters in his day job (< 10E3). Consider optimizations to save memory and time.
Extend module to do comparisons on more than two lists at a time.
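As a starting point for the benchmarking item above, the core Benchmark module can time a hash-based intersection over large lists. This is only a sketch under stated assumptions: the list size $n, the overlap between the two lists and the iteration count are hypothetical, and the timed sub uses plain hash look-ups rather than List::Compare itself. To time the module, the body of the sub would be replaced with a call to List::Compare->new followed by get_intersection.

```perl
use strict;
use warnings;
use Benchmark qw(timeit timestr);

# Hypothetical size; raise it to probe memory and time limits.
my $n     = 100_000;
my @Alist = (1 .. $n);
my @Blist = ($n / 2 + 1 .. $n + $n / 2);   # overlaps the upper half of @Alist

my @intersection;

# Run the comparison 5 times and collect the timing.
my $t = timeit(5, sub {
    my %seenB = map { $_ => 1 } @Blist;
    @intersection = grep { exists $seenB{$_} } @Alist;
});

print "items per list: $n\n";
print "items in common: ", scalar(@intersection), "\n";
print "timing: ", timestr($t), "\n";
```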
AUTHOR
James E. Keenan (jkeen@concentric.net).
Creation date: May 20, 2002. Last modification date: June 20, 2002. Copyright (c) 2002 James E. Keenan. United States. All rights reserved. This is free software and may be distributed under the same terms as Perl itself.