NAME
Set::Toolkit - searchable, orderable, flexible sets of (almost) anything.
VERSION
Version 0.02
SYNOPSIS
The Set Toolkit intends to provide a broad, robust interface to sets of data. Largely inspired by Set::Object, a default set from the Set Toolkit should behave similarly enough to those created by Set::Object that interchanging between the two is fairly easy and intuitive.
In addition to the set functionality already available around the CPAN, the Set Toolkit provides the ability to perform fairly complex, chained searches against the set, to treat the set as ordered or unordered, and to enforce or relax a uniqueness constraint (enforced by default).
use Set::Toolkit;
$set = Set::Toolkit->new();
$set->insert(
'a',
4,
{a=>'abc', b=>123},
{a=>'abc', b=>456, c=>'foo'},
{a=>'abc', b=>456, c=>'bar'},
'',
{a=>'ghi', b=>789, c=>'bar'},
{
x => {
y => "hello",
z => "world",
},
},
);
die "we didn't add enough items!"
if ($set->size < 4);
### Find single elements.
$el1 = $set->find(a => 'ghi');
$el2 = $set->find(x => { y=>'hello' });
### Print "Hello, world!"
print "Hello, ", $el2->{x}->{z}, "!\n";
### Search for result sets.
### $resultset will contain:
### {a=>'abc', b=>456, c=>'foo'},
### {a=>'abc', b=>456, c=>'bar'},
$resultset = $set->search(a => 'abc')
->search(b => 456);
### $bar will be: {a=>'ghi', b=>789, c=>'bar'},
$bar = $set->search(a => 'abc')
->search(b => 456)
->find(c => 'bar');
### Get the elements in the order they were inserted. These are equivalent:
@ordered = $set->ordered_elements;
$set->is_ordered(1);
@ordered = $set->elements;
### Get the elements in hash-random order. These two are equivalent:
@unordered = $set->unordered_elements;
$set->is_ordered(0);
@unordered = $set->elements;
DESCRIPTION
This module implements set objects that can contain members of (almost) any type, and provides a number of attached helpers to allow set and element manipulation at a variety of levels. By "almost", I mean that it won't let you store undef as a value, but not for a good reason: that's just how Set::Object did it, and I haven't had a chance to think about the pros and cons yet. Probably in the future it'll be a settable flag.
The set toolkit is largely inspired by the work done in Set::Object, but with some notable differences: this package ...
- ... provides for ordered sets
- ... is pure perl.
- ... is slower for the above reasons (and more!)
- ... provides mechanisms for searching set elements.
- ... does not flatten scalars to strings.
- ... probably some other stuff.
In general, take a look at Set::Object first to see if it will suit your needs. If not, give Set::Toolkit a spin.
By default, this package's sets are intended to be functionally identical to those created by Set::Object (or close to it). That is, without specifying differently, sets created from the Set::Toolkit will be an unordered collection of things without duplication.
EXPORT
None at this time.
FUNCTIONS
Construction
new
Creates a new set toolkit object. Right now it doesn't take parameters, because I have not codified how it should work.
Set manipulation
insert
Insert new elements into the set.
### Create a set object.
$set = Set::Toolkit->new();
### Insert two scalars, an array ref, and a hash ref.
$set->insert('a', 'b', [2,4], {some=>'object'});
Duplicate entries will be silently ignored when the set's is_unique constraint is set. (This behavior is likely to change in the future. What will probably happen later is that the element will be added and masked. That will probably be a setting =)
remove
Removes elements from the set.
### Create a set object.
$set = Set::Toolkit->new();
### Insert two scalars, an array ref, and a hash ref; the set size will
### be 4.
$set->insert('a', 'b', [2,4], {some=>'object'});
### Remove the scalar 'b' from the set. The set size will be 3.
$set->remove('b');
Note that removing an element removes all instances of it (this only really matters in non-unique sets).
Removing references might catch you off guard: though you can insert object literals, you can't remove them. That's because each time you create a new literal, you get a new reference. Consider:
### Create a set object.
$set = Set::Toolkit->new();
### Insert two literal hashrefs.
$set->insert({a => 1}, {a => 2});
### Remove a literal hashref. This will have no effect, because the two
### objects (inserted and removed) are I<different references>.
$set->remove({a => 1});
However, the following should work instead:
### Create a set object.
$set = Set::Toolkit->new();
### Create our two hashes.
($hash_a, $hash_b) = ({a=>1}, {a=>2});
### Insert the two references.
$set->insert($hash_a, $hash_b);
### Remove a hash reference. This will work; it's the same reference as
### what was inserted.
$set->remove($hash_a);
Obviously the same applies for all references.
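The reference behavior described above is ordinary Perl and can be checked without the module. The following is a minimal sketch using only the core Scalar::Util module; the variable names are made up for illustration, and it says nothing about how Set::Toolkit compares elements internally.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Two literals with identical contents are still two separate references.
my $first  = { a => 1 };
my $second = { a => 1 };
my $alias  = $first;    # a copy of the reference, not of the hash

my $same_literal = refaddr($first) == refaddr($second) ? 1 : 0;
my $same_alias   = refaddr($first) == refaddr($alias)  ? 1 : 0;

print "literal vs. literal: $same_literal\n";    # prints 0
print "original vs. alias:  $same_alias\n";      # prints 1
```

This is why keeping the reference in a variable, as in the second example above, is what makes a later remove possible.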
Set inspection
elements
Returns a list of the elements in the set. The content of the list is sensitive to the set context, defined by is_ordered, is_unique, and possibly other settings later.
ordered_elements
Returns a list of the elements in insertion order, regardless of whether the set thinks it's ordered or unordered. This can be thought of as a temporary coercion of the set to ordered for the duration of the fetch, only.
unordered_elements
Returns a list of the elements in a random order, regardless of whether the set thinks it's ordered or unordered. This can be thought of as a temporary coercion of the set to unordered for the duration of the fetch, only.
The random order of the set relies on perl's treatment of hash keys and values. We're using a hash under the hood.
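To see what "hash-random" means in practice, here is a small sketch in plain Perl. It is not the module's internals (those are only hinted at above); it just contrasts an array, which keeps insertion order, with hash keys, whose order perl chooses.

```perl
use strict;
use warnings;

my @inserted = qw(pear apple fig cherry);

# An array remembers insertion order.
my @ordered = @inserted;

# Hash keys come back in an order chosen by perl, not by us.
my %members   = map { $_ => 1 } @inserted;
my @unordered = keys %members;

print "ordered:   @ordered\n";
print "unordered: @unordered\n";    # order may differ from run to run
```

Both lists hold the same four members; only the sequence differs.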
size
Returns the size of the set. This is context sensitive:
$set = Set::Toolkit->new();
$set->is_unique(0);
$set->insert(qw(d e a d b e e f));
### Prints:
### The set size is 8!
### The set size is 5!
print 'The set size is ', $set->size, "!\n";
$set->is_unique(1);
print 'The set size is ', $set->size, "!\n";
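The two sizes above (8 and 5) can be reproduced by hand. The sketch below is plain Perl and only illustrates the arithmetic; it uses a %seen hash, which is fine for simple strings but is not how Set::Toolkit enforces uniqueness (see NOTES).

```perl
use strict;
use warnings;

my @items = qw(d e a d b e e f);

# Keep the first occurrence of each string, preserving insertion order.
my %seen;
my @unique = grep { !$seen{$_}++ } @items;

my $total_count  = scalar @items;     # 8
my $unique_count = scalar @unique;    # 5: d e a b f

print "all: $total_count, unique: $unique_count\n";
```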
Set introspection
is_ordered
Returns a boolean value depending on whether the set is currently considering itself as ordered or unordered. Also a setter to change the set's context.
is_unique
Returns a boolean value depending on whether the set is currently considering itself as unique or duplicable (with respect to its elements). Also a setter to change the set's context.
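A method that works as both getter and setter is a common Perl idiom. The following is a rough sketch of that idiom on a made-up class, My::FlagHolder; the class, its default, and the choice to return the new value from the setter are all assumptions for illustration, not Set::Toolkit's actual code.

```perl
use strict;
use warnings;

package My::FlagHolder;    # hypothetical class, for illustration only

sub new { return bless { is_ordered => 0 }, shift }

# With an argument, act as a setter; always return the current value.
sub is_ordered {
    my ($self, @args) = @_;
    $self->{is_ordered} = ($args[0] ? 1 : 0) if @args;
    return $self->{is_ordered};
}

package main;

my $holder = My::FlagHolder->new;
my $before = $holder->is_ordered;       # getter: no argument
my $after  = $holder->is_ordered(1);    # setter: returns the new value

print "before: $before, after: $after\n";    # prints "before: 0, after: 1"
```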
search and find
Searching allows you to find subsets of your current set that match certain criteria. Some effort has been made to make the syntax as simple as possible, though some complexity is present in order to provide some power.
Searches take one argument, a constraint, that can be specified in two primary ways:
Scalar searches
Specifying a constraint as a scalar value makes a very simple check against any scalar values contained in your set (and only such values). Thus, if you search for "b", you will get a subset of the parent set that contains one string "b" for each such occurrence in the superset.
Consider the following:
### Create a new set.
$set = Set::Toolkit->new();
### Insert some values.
$set->insert(qw(a b c d e));
### Do a search, and then a find.
### $resultset is now a set object with one entry: 'b'
$resultset = $set->search('b');
### $resultset is now an empty set object (because we didn't insert any
### strings "x").
$resultset = $set->search('x');
For scalars, it probably won't generally be useful to use search. You'll probably want to use find() instead, which simply returns the value sought, rather than a set of matches:
### Using the set above, $match now contains 'b'.
my $match = $set->find('b');
However, there is a case in which you might want to use scalar searches: in sets that are not enforcing uniqueness.
### Turn off the uniqueness constraint.
$set->is_unique(0);
### Add some more letters.
$set->insert(qw(a c e g i j));
### Now do some searches:
### $resultset will contain <'a','a'>
$resultset = $set->search('a');
This may be useful for counting occurrences, such as:
print "There are ", $set->search('a')->size, " occurrences of 'a'.\n";
Property searches
On the other hand, searching by property values will probably be useful more often. Consider the following set:
### Create our set.
$works = Set::Toolkit->new();
### Insert some complex values:
$works->insert(
{ name => {first=>'Franz', last=>'Kafka'},
title => 'Metamorphosis',
date => '1915'},
{ name => {first=>'Ovid', last=>'unknown'},
title => 'Metamorphosis',
date => 'AD 8'},
{ name => {first=>'Homer', last=>undef},
title => 'The Iliad',
date => 'unknown'},
{ name => {first=>'Homer', last=>undef},
title => 'The Odyssey',
date => 'unknown'},
{ name => {first=>'Ted', last=>'Chiang'},
title => 'Understand',
date => '1991'},
{ name => {first=>'John', last=>'Calvin'},
title => 'Institutes of the Christian Religion',
date => '1541'},
);
We can perform an arbitrarily complex subsearch of these fields, as follows:
### $homeric_works is now a set object containing the same hash references
### as the superset, "works", but only those that matched the first name
### "Homer" and the last name B<undef>.
my $homeric_works = $works->search(
name => {
first => 'Homer',
last => undef,
},
);
### We can get a specific work, "The Odyssey," for example, by a second
### search (or B<find>):
### $odyssey_works is now a set of one.
my $odyssey_works = $homeric_works->search(title=>'The Odyssey');
### We can get the instance (instead of a set) with a B<find>:
my $odyssey_work = $homeric_works->find(title=>'The Odyssey');
### Which we could have gotten more easily by issuing a B<find> on the
### original set:
$odyssey_work = $works->find(title=>'The Odyssey');
Searches can also be chained, if that's desirable for any reason, and find can be included in the chain, as long as it is the last link.
Note that this is not a speed-optimized scan at this point (but it shouldn't be brutally slow in most cases).
### Get a resultset of one.
my $resultset = $works->search(name=>{first=>'Homer'})
->search(title=>'The Iliad');
And you can search against multiple values:
### Search against title and date to get Ovid's I<Metamorphosis> (yeah, I
### realize his was plural, but give me a break here =)
### Get the set.
my $resultset = $works->search(
title => 'Metamorphosis',
date => 'AD 8'
);
### Get the item.
my $result = $works->find(
title => 'Metamorphosis',
date => 'AD 8'
);
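For readers curious how a nested property constraint might be evaluated, here is a minimal sketch in plain Perl. The matches helper and the trimmed-down @works data are hypothetical, written only to illustrate the idea of descending into nested hashes; Set::Toolkit's real matching rules may differ. A grep plays the role of search (all matches) and List::Util's first plays the role of find (one match).

```perl
use strict;
use warnings;
use List::Util qw(first);

# Hypothetical helper: true when $item satisfies every key of $want,
# descending into nested hash references.
sub matches {
    my ($item, $want) = @_;
    if (ref($want) eq 'HASH') {
        return 0 unless ref($item) eq 'HASH';
        for my $key (keys %$want) {
            return 0 unless exists $item->{$key};
            return 0 unless matches($item->{$key}, $want->{$key});
        }
        return 1;
    }
    if (!defined $want) {
        return defined($item) ? 0 : 1;    # undef matches only undef
    }
    return (defined $item && !ref $item && $item eq $want) ? 1 : 0;
}

my @works = (
    { name => { first => 'Homer', last => undef },   title => 'The Iliad' },
    { name => { first => 'Homer', last => undef },   title => 'The Odyssey' },
    { name => { first => 'Franz', last => 'Kafka' }, title => 'Metamorphosis' },
);

# "search": every element that satisfies the constraint.
my %constraint = (name => { first => 'Homer', last => undef });
my @homeric = grep { matches($_, \%constraint) } @works;

# "find": the first element that satisfies the constraint.
my $odyssey = first { matches($_, { title => 'The Odyssey' }) } @homeric;

print scalar(@homeric), " Homeric works; found: $odyssey->{title}\n";
```

Chaining in this sketch is just running the second match against the result of the first, which mirrors how a search result can itself be searched.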
When should this module be used?
You might want to use this module if the following are generally true:
- You aren't desperate for speed.
- You want to be able to search (and subsearch!) your sets easily.
- You want ordered sets.
When shouldn't this module be used?
This module probably isn't right for you if you:
- Need it fast, fast, fast!
- Don't care about searching your sets.
- Don't care about ordering your sets.
If these are true, I would take a look at Set::Object instead.
NOTES
Set::Toolkit sets contain "things" or "members" or "elements". I've avoided saying "objects" because you can really store anything in these sets, from scalars, to objects, to references.
Set::Toolkit does not currently support "weak" sets as defined by Set::Object.
Because uniqueness is not enforced by keying into a hash, scalars are not flattened into strings and will not lose their magicks.
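The flattening mentioned above is standard Perl behavior: hash keys are always strings, so a reference used as a key keeps only its printed form. A short sketch in plain Perl (variable names are made up for illustration):

```perl
use strict;
use warnings;

my $record = { title => 'The Iliad' };

# Using a reference as a hash key stores only its string form.
my %index;
$index{$record} = 1;
my ($key) = keys %index;

my $key_kind    = ref($key)    || 'plain string';    # 'plain string'
my $record_kind = ref($record) || 'plain string';    # 'HASH'

print "key: $key_kind, original: $record_kind\n";
```

The key looks like "HASH(0x...)" and can no longer be dereferenced, which is the loss that a non-hash-keyed approach avoids.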
SPECIAL DISCLAIMER
This is the first module I've released. I'm open to constructive critiques, bug reports, patches, doc patches, requests for documentation clarification, and so forth. Be gentle =)
AUTHOR
Sir Robert Burbridge, <sirrobert at gmail.com>
BUGS
Please report any bugs or feature requests to bug-set-toolkit at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Set::Toolkit. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Set::Toolkit
RT: CPAN's request tracker
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
Search CPAN
ACKNOWLEDGEMENTS
Thanks to Jean-Louis Leroy and Sam Vilain, the developers/maintainers of Set::Object, for lots of concepts, etc. I'm not actually using any borrowed code under the hood, but I plan to in the future.
COPYRIGHT & LICENSE
Copyright 2010 Sir Robert Burbridge, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.