NAME
Iterator::Flex::Manual::Authoring - How to write an iterator
VERSION
version 0.12
DESCRIPTION
Iterators are constructed by passing an attribute hash (call it %AttrHash) to a factory, which uses it to construct an appropriate iterator class, instantiate it, and return it to the user.
First we'll create the hash, then figure out how to make it available to the factory.
The Attribute Hash
The attribute hash (whose contents are documented in much greater detail in "Iterator Parameters" in Iterator::Flex::Manual::Overview) describes the iterator's capabilities and provides implementations.
The heart of Iterator::Flex iterators is the next capability, which must be implemented as a closure. Other capabilities are optional and may be either closures or methods.
next
next has two responsibilities:
- return the next data element
- signal exhaustion
It usually also ensures that the current and previous capabilities return the proper values. Because it is called most often, it should be as efficient as possible.
As mentioned above, next must be implemented as a closure. It has to keep track of state on its own, as it may not be passed any.
To illustrate, here's the entry in %AttrHash for the next closure for Iterator::Flex::Array:
next => sub {
    if ( $next == $len ) {
        # if first time through, set prev
        $prev = $current
          if ! $self->is_exhausted;
        return $current = $self->signal_exhaustion;
    }
    $prev    = $current;
    $current = $next++;
    return $arr->[$current];
},
The first thing to notice is that there are a number of closed over variables that are defined outside of the subroutine.
It's cheap to retain the state of an array (it's just an index), so we can easily keep track of $next, $prev, and $current, and provide the additional prev and current capabilities. We also keep track of the array, $arr, and its length, $len.
Finally, there's $self, which is a handle to the iterator's object. It's not used for any performance critical work.
These must all be properly initialized; more on that later.
Exhaustion Phase
The code is divided into two sections; the first deals with data exhaustion:
if ( $next == $len ) {
    # if first time through, set prev
    $prev = $current
      if ! $self->is_exhausted;
    return $current = $self->signal_exhaustion;
}
Every time next is invoked, it first checks whether the data are exhausted. If they are, the iterator hands control to Iterator::Flex's exhaustion facilities.
Recall that an iterator may signal exhaustion by throwing an exception or returning a sentinel value. The iterator itself doesn't care; it just calls the signal_exhaustion method, which will first set the is_exhausted predicate and then either return a sentinel value or throw an exception (which the iterator should not catch). In the former case, the iterator should pass that sentinel value on to the caller.
Unlike in some iterator models, calling next after the iterator is exhausted is always a defined operation, always resulting in the same behavior. next should thus always call signal_exhaustion when exhausted, even if the iterator has already signaled exhaustion.
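The contract described above can be sketched in plain Perl without Iterator::Flex. The make_signaller helper and the My::Exhausted class below are hypothetical stand-ins, not part of the library; they only mimic the documented behavior of signal_exhaustion (set the flag, then return a sentinel or throw).

```perl
use strict;
use warnings;

# Hypothetical stand-in for signal_exhaustion: sets the exhausted flag,
# then either returns a sentinel or throws, depending on the policy.
sub make_signaller {
    my ( $policy, $sentinel ) = @_;
    my $exhausted = 0;
    return {
        is_exhausted      => sub { $exhausted },
        signal_exhaustion => sub {
            $exhausted = 1;
            die bless( {}, 'My::Exhausted' ) if $policy eq 'throw';
            return $sentinel;
        },
    };
}

# A next closure over a two-element list, using the "return" policy.
my @data = ( 'a', 'b' );
my $idx  = 0;
my $sig  = make_signaller( 'return', undef );
my $next = sub {
    return $sig->{signal_exhaustion}->() if $idx == @data;
    return $data[ $idx++ ];
};

# Two values, then the sentinel on every further call.
my @seen;
push @seen, $next->() for 1 .. 4;

# With the "throw" policy, every call after exhaustion throws again.
my $tsig   = make_signaller('throw');
my $thrown = 0;
for ( 1 .. 2 ) {
    eval { $tsig->{signal_exhaustion}->(); 1 } or $thrown++;
}
```

Note that the next closure never inspects the policy; it simply returns whatever the signaller returns, and lets an exception propagate.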
Iteration Phase
The second part of the code takes care of returning the correct data and setting the iterator up for the succeeding call to next. It also ensures that the current and prev capabilities will return the proper values:
$prev = $current;
$current = $next++;
return $arr->[$current];
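To see how the three indices move, the sketch below runs the same assignments against a three-element array and records what prev and current would report after each call. It is a stand-alone illustration: the exhaustion branch is reduced to a bare return, and the $step closure is a hypothetical name.

```perl
use strict;
use warnings;

my $arr = [ 10, 20, 30 ];
my ( $next, $prev, $current ) = ( 0, undef, undef );

# The iteration phase from above, minus the real exhaustion handling.
my $step = sub {
    return if $next == @$arr;
    $prev    = $current;
    $current = $next++;
    return $arr->[$current];
};

my @trace;
while ( defined( my $value = $step->() ) ) {
    push @trace, sprintf '%s prev=%s current=%s',
      $value,
      defined $prev ? $arr->[$prev] : 'undef',
      $arr->[$current];
}

# @trace is now:
#   10 prev=undef current=10
#   20 prev=10 current=20
#   30 prev=20 current=30
```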
Initialization Phase
Finally, we'll get to the iterator initialization phase, which may make more sense now that we've gone through the other phases. Recall that we are using closed over variables to keep track of state. That means our next sub must be created for every iterator so it can close over the current set of lexical variables.
Our code should look something like this:
# initialize lexical variables here
...
%AttrHash = (
    next => sub { ... },    # as above, closing over lexical variables
);
We need to initialize $next, $prev, $current, $arr, $len, and $self.
The first five are easy:
# initialize lexical variables here
my $next = 0;
my $prev = undef;
my $current = undef;
my $arr = \@array ; # <-- this is passed in from the user "somehow"
my $len = @array;
Now, what about $self? Why is it a closed over variable, rather than being passed as a parameter to the next sub? The answer is that next is not a method. Iterator::Flex allows it to be treated as one, e.g.
$iter->next
is valid, but for efficiency the iterator can be called directly as a subroutine, e.g.,
$iter->();
skipping the overhead of an object method call. In this case, there's no way to pass in $self, so where does it come from and how is it initialized? The answer is the closed over variable $self, and another entry in the attribute hash, _self, which contains a reference to $self that the iterator factory will use to initialize $self.
# initialize lexical variables here
...
my $self;
%AttrHash = (
    _self => \$self,
    next  => sub { ... },    # as above, closing over lexical variables
);
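How the factory fills in $self is internal to Iterator::Flex, but the mechanism can be sketched in a few lines. In the sketch below, My::Iter and toy_factory are hypothetical stand-ins, not library code: the factory blesses the next closure itself, so the object can be called either as a subroutine or through a method, and then writes the object through the _self reference.

```perl
use strict;
use warnings;

package My::Iter {
    # Method form: simply forwards to the underlying closure.
    sub next { $_[0]->() }
}

# Hypothetical factory: bless the closure, then initialize the closed
# over $self through the reference supplied in the _self entry.
sub toy_factory {
    my ($attr) = @_;
    my $obj = bless $attr->{next}, 'My::Iter';
    ${ $attr->{_self} } = $obj;
    return $obj;
}

my $self;
my $count    = 0;
my %AttrHash = (
    _self => \$self,
    next  => sub { return ref($self) . ':' . ++$count },
);

my $iter = toy_factory( \%AttrHash );

my $direct = $iter->();      # called as a plain subroutine
my $method = $iter->next;    # called as a method
```

Both calls reach the same closure, and in both the closure sees the object through $self. The sketch ignores one practical concern: the closure holds $self, which is the closure, so a real implementation would need to break that reference cycle (for instance with Scalar::Util::weaken).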
Other capabilities
For completeness, here are the rest of the capabilities, except for freeze, which complicates things quite a bit, and which we'll get into later.
reset => sub { $prev = $current = undef; $next = 0; },
rewind => sub { $next = 0; },
prev => sub { return defined $prev ? $arr->[$prev] : undef; },
current => sub { return defined $current ? $arr->[$current] : undef; },
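The difference between reset and rewind is easy to miss: both restart iteration from the first element, but only reset forgets what prev and current would report. A stand-alone sketch using the four closures above and a simplified next (no exhaustion handling) shows this; the %cap hash is a hypothetical name used only here.

```perl
use strict;
use warnings;

my $arr = [ 'a', 'b', 'c' ];
my ( $next, $prev, $current ) = ( 0, undef, undef );

my %cap = (
    # simplified next: assumes it is never called past the last element
    next    => sub { $prev = $current; $current = $next++; return $arr->[$current]; },
    reset   => sub { $prev = $current = undef; $next = 0; },
    rewind  => sub { $next = 0; },
    prev    => sub { return defined $prev    ? $arr->[$prev]    : undef; },
    current => sub { return defined $current ? $arr->[$current] : undef; },
);

$cap{next}->() for 1 .. 2;    # consume 'a' and 'b'

$cap{rewind}->();
my $after_rewind = $cap{current}->();    # still 'b'
my $first_again  = $cap{next}->();       # 'a'
my $prev_now     = $cap{prev}->();       # 'b', carried over the rewind

$cap{reset}->();
my $after_reset = $cap{current}->();     # undef: history is cleared
```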
Wrapping up
At this point %AttrHash is functionally complete. The only thing left unknown is the array to iterate over, which has to be kept variable, so we wrap the above code in a subroutine:
sub construct ( $array ) {
    # initialize lexical variables here
    ...
    my %AttrHash = ( ... );
    return \%AttrHash;
}
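Because every call to the subroutine creates fresh lexicals, each returned hash carries its own independent state. The sketch below fills in a reduced version of the body (only next, with a bare return standing in for the exhaustion handling, since there is no iterator object here) and calls the closures directly rather than through the factory. It uses classic @_ unpacking so that it runs without enabling signatures.

```perl
use strict;
use warnings;

sub construct {
    my ($array) = @_;

    # initialize lexical variables here
    my $next = 0;
    my $arr  = $array;
    my $len  = @$array;

    my %AttrHash = (
        next => sub {
            return if $next == $len;    # stand-in for signal_exhaustion
            return $arr->[ $next++ ];
        },
    );
    return \%AttrHash;
}

my $first  = construct( [ 1 .. 3 ] );
my $second = construct( [ 'x', 'y' ] );

# Interleaved calls do not disturb each other's position.
my @got = (
    $first->{next}->(), $second->{next}->(),
    $first->{next}->(), $second->{next}->(),
);
```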
Passing the %AttrHash to the factory
Now we're ready to use the %AttrHash to construct an iterator. Iterators may be constructed on-the-fly, or may be formalized as classes.
A one-off iterator
This approach uses "construct_from_attrs" in Iterator::Flex::Factory to create an iterator object from a hash describing the iterator capabilities:
my @array = ( 1..100 );
my $AttrHash = construct( \@array );
$iter = Iterator::Flex::Factory->construct_from_attrs( $AttrHash, \%opts );
In addition to %AttrHash, construct_from_attrs takes another options hash, which is where the exhaustion policy is set.
In this case, we can choose one of the following entries:
- exhaustion => 'throw';
  On exhaustion, throw an exception object of class Iterator::Flex::Failure::Exhausted.
- exhaustion => [ return => $sentinel ];
  On exhaustion, return the specified sentinel value.
The default is
exhaustion => [ return => undef ];
At this point $iter is initialized and ready for use.
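The choice of sentinel matters when the data themselves may contain undef: with the default policy a consumer cannot tell an undef element from exhaustion. One common approach is to use a unique reference as the sentinel and compare addresses. The sketch below uses a hand-rolled closure in place of a real iterator, so no Iterator::Flex calls are involved; with a real iterator the sentinel would be supplied through the exhaustion option described above.

```perl
use strict;
use warnings;
use Scalar::Util qw( refaddr );

my $sentinel = \my $unique;    # a reference no data element can equal

# Stand-in for an iterator configured to return $sentinel on exhaustion.
my @data = ( 1, undef, 3 );
my $idx  = 0;
my $iter = sub { $idx < @data ? $data[ $idx++ ] : $sentinel };

my @seen;
while (1) {
    my $value = $iter->();
    last if ref $value && refaddr($value) == refaddr($sentinel);
    push @seen, $value;
}

# @seen holds all three elements, including the undef in the middle.
```

A loop of the form while ( defined( my $value = $iter->() ) ) would have stopped at the second element here.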
An iterator class
Creating a class requires a few steps more, and gives the following benefits:
- A much cleaner interface, e.g.
  $iter = Iterator::Flex::Array->new( \@array );
  vs. the multi-liner above.
- The ability to freeze and thaw the iterator.
- Some of the construction costs can be moved from run time to compile time.
An iterator class must
- subclass Iterator::Flex::Base;
- provide two class methods, new and construct; and
- register its capabilities.
new
The new method converts from the API most comfortable to your usage to the internal API used by Iterator::Flex::Base. By convention, the last argument should be reserved for a hashref containing general iterator arguments (such as the exhaustion key). This hashref is documented in "new_from_attrs" in Iterator::Flex::Base.
The super class' constructor takes two arguments: a variable containing iterator specific data (state), and the above-mentioned general argument hash. The state variable can take any form; it is not interpreted by the Iterator::Flex framework.
Here's the code for "new" in Iterator::Flex::Array:
sub new ( $class, $array, $pars={} ) {
    $class->_throw( parameter => "argument must be an ARRAY reference" )
      unless Ref::Util::is_arrayref( $array );
    $class->SUPER::new( { array => $array }, $pars );
}
It's pretty simple. It checks that the argument is an array reference, stores the passed array (the state) in a hash, and passes that hash and the general options hash (an empty hash if none was given) to the super class' constructor. (A hash is used here because Iterator::Flex::Array can be serialized, and extra state is required to do so.)
construct
The construct class method's duty is to return a %AttrHash. It's called as
$AttrHash = $class->construct( $state );
where $state is the state variable passed to "new" in Iterator::Flex::Base. Unsurprisingly, it is remarkably similar to the construct subroutine developed earlier.
There are a few differences:
- The signature changes, as this is a class method, rather than a subroutine.
- There are additional %AttrHash entries available: _roles, which supports run-time enabling of capabilities, and freeze, which supports serialization.
- Capabilities other than next can be implemented as actual class methods, rather than closures. This decreases the cost of creating iterators (because they only need to be compiled once, rather than for every instance of the iterator) but increases run time costs, as they cannot use closed over variables to access state information.
Registering Capabilities
Unlike when using "construct_from_attr" in Iterator::Flex::Factory, which helpfully looks at %AttrHash to determine which capabilities are provided (albeit at run time), classes are encouraged to register their capabilities at compile time via the _add_roles method. For the example iterator class, this would be done via
__PACKAGE__->_add_roles( qw[
State::Registry
Next::ClosedSelf
Rewind::Closure
Reset::Closure
Prev::Closure
Current::Closure
] );
(These are all accepted shorthand for roles in the Iterator::Flex::Role namespace.)
If capabilities must be added at run time, use the _roles entry in %AttrHash.
The specific roles used here are:
- Next::ClosedSelf
  This indicates that the next capability uses a closed over $self variable, and that Iterator::Flex should use the _self hash entry to initialize it.
- State::Registry
  This indicates that the exhaustion state should be stored in the central iterator Registry. Another implementation uses a closed over variable (and the role State::Closure). See "Exhaustion" in Iterator::Flex::Manual::Internals.
- Reset::Closure
- Prev::Closure
- Current::Closure
- Rewind::Closure
  These indicate that the named capability is present and implemented as a closure.
All together
package My::Array;
use strict; use warnings;
use parent 'Iterator::Flex::Base';
use Ref::Util;    # provides is_hashref and is_arrayref, used in new
sub new {
    my $class = shift;
    # by convention, a trailing hashref holds the general iterator arguments
    my $gpar = Ref::Util::is_hashref( $_[-1] ) ? pop : {};
    $class->_throw( parameter => "argument must be an ARRAY reference" )
      unless Ref::Util::is_arrayref( $_[0] );
    $class->SUPER::new( { array => $_[0] }, $gpar );
}
sub construct {
    my ( $class, $state ) = @_;
    # initialize lexical variables here
    ...
    my $arr = $state->{array};
    my %AttrHash = ( ... );
    return \%AttrHash;
}
__PACKAGE__->_add_roles( qw[
    State::Registry
    Next::ClosedSelf
    Rewind::Closure
    Reset::Closure
    Prev::Closure
    Current::Closure
] );
1;
SUPPORT
Bugs
Please report any bugs or feature requests to bug-iterator-flex@rt.cpan.org or through the web interface at: https://rt.cpan.org/Public/Dist/Display.html?Name=Iterator-Flex
Source
Source is available at
https://gitlab.com/djerius/iterator-flex
and may be cloned from
https://gitlab.com/djerius/iterator-flex.git
SEE ALSO
Please see those modules/websites for more information related to this module.
AUTHOR
Diab Jerius <djerius@cpan.org>
COPYRIGHT AND LICENSE
This software is Copyright (c) 2018 by Smithsonian Astrophysical Observatory.
This is free software, licensed under:
The GNU General Public License, Version 3, June 2007