NAME
Sort::Key::Merger - Perl extension for merging sorted things
SYNOPSIS
    use Carp qw(croak);
    use Sort::Key::Merger qw(keymerger);

    sub line_key_value {
        # $_[0] is available as a scratchpad that persists
        # between calls for the same $_;
        unless (defined $_[0]) {
            # so we use it to cache the file handle when we
            # open a file on the first read
            open $_[0], "<", $_
                or croak "unable to open $_";
        }
        # don't get confused by this while loop, it's only
        # used to ignore empty lines
        my $fh = $_[0];
        local $_; # break $_ aliasing;
        while (<$fh>) {
            next if /^\s*$/;
            chomp;
            if (my ($key, $value) = /^(\S+)\s+(.*)$/) {
                return ($key, $value)
            }
            warn "bad line $_"
        }
        # signals the end of the data by returning an
        # empty list
        ()
    }

    # create a merger object; calling &line_key_value without
    # parentheses passes the block's @_ along, so $_[0] stays
    # the per-source scratchpad:
    my $merger = keymerger { &line_key_value } @ARGV;

    # sort and write the values:
    my $value;
    while (defined($value=$merger->())) {
        print "value: $value\n"
    }
DESCRIPTION
Sort::Key::Merger merges presorted collections of things based on some (calculated) key.
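To make the idea concrete, here is a minimal pure-Perl sketch of a key-based merge of presorted lists. It is not the module's implementation, and simple_merger is a hypothetical helper introduced only for illustration. It re-sorts the current head of every source on each call, which keeps the sketch short but would scale poorly with many sources.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of Sort::Key::Merger: returns a generator
# that merges presorted array refs, ordered by a computed string key.
sub simple_merger {
    my ($key_of, @sources) = @_;

    # one [key, value, source, next_index] record per non-empty source
    my @heads = map  { [ $key_of->($_->[0]), $_->[0], $_, 1 ] }
                grep { @$_ } @sources;

    return sub {
        return unless @heads;    # every source is exhausted

        # index of the head with the smallest key
        my ($min) = sort { $heads[$a][0] cmp $heads[$b][0] } 0 .. $#heads;
        my ($key, $value, $src, $next) = @{ $heads[$min] };

        if ($next < @$src) {
            # advance this source to its next element
            $heads[$min] = [ $key_of->($src->[$next]), $src->[$next], $src, $next + 1 ];
        }
        else {
            splice @heads, $min, 1;    # drop the exhausted source
        }
        return $value;
    };
}

# Two lists, each already sorted by lowercased value (the calculated key).
my $next = simple_merger(
    sub { lc $_[0] },
    [qw(apple Cherry fig)],
    [qw(Banana date)],
);

my @merged;
while (defined(my $v = $next->())) {
    push @merged, $v;
}
print "@merged\n";
```

The values come back in key order even though the keys never appear in the output, which is the same separation between key and value that the functions in this module rely on.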
EXPORT
None by default.
The functions described below can be exported on request, e.g.:
    use Sort::Key::Merger qw(keymerger);
FUNCTIONS
- keymerger { generate_key_value_pair } @sources;
merges the (presorted) generated values, ordered lexicographically by their keys.
Every item in @sources is aliased by $_ and then the user-defined subroutine generate_key_value_pair is called. The result from that subroutine call should be a (key, value) pair. Keys determine the order in which the values are sorted and returned. generate_key_value_pair can return an empty list to indicate that a source has become exhausted.
The result from keymerger is another subroutine that works as a generator. It can be called as:
    my $next = &$merger;
or
    my $next = $merger->();
In scalar context it returns the next value, or undef if all the sources have been exhausted. In list context it returns all the values remaining from the sources, merged into a sorted list.
NOTE: an additional argument is passed to the generate_key_value_pair callback in $_[0]. It is meant to be used as a scratchpad: its value is associated with the current source and persists between calls from the same generator, e.g.:
    my $merger = keymerger {
        # use $_[0] to cache an open file handle:
        $_[0] or open $_[0], '<', $_
            or croak "unable to open $_";
        my $fh = $_[0];
        local $_;
        while (<$fh>) {
            chomp;
            return $_ => $_;
        }
        ();
    } ('/tmp/foo', '/tmp/bar');
This function honours the use locale pragma.
- nkeymerger { generate_key_value_pair } @sources;
is like keymerger but compares the keys numerically.
This function honours the use integer pragma.
- filekeymerger { generate_key } @files;
returns a merger subroutine that returns lines read from @files sorted by the keys that generate_key generates.
@files can contain file names or handles for already open files.
generate_key is called with the line just read in $_ and has to return the sorting key for it. If its return value is undef the line is ignored.
The line can be modified inside generate_key by changing $_, e.g.:
    my $merger = filekeymerger {
        chomp($_); # <-- here
        return undef if /^\s*$/;
        substr($_, -1, 10)
    } @ARGV;
Finally, $/ can be changed from its default value to read the files in chunks other than lines.
The return value from this function is a subroutine reference that on successive calls returns the sorted elements, or all elements in one go when called in list context, e.g.:
    my $merger = filekeymerger { (split)[0] } @ARGV;
    my @sorted = $merger->();
This function honours the use locale pragma.
- nfilekeymerger { generate_key } @files;
is like filekeymerger but the keys are compared numerically.
This function honours the use integer pragma.
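The sources do not have to be files. As a hedged sketch based only on the behaviour described above, the following merges in-memory arrays with nkeymerger, using the $_[0] scratchpad as a read position. The sample data and variable names are assumptions made for illustration, and the example assumes Sort::Key::Merger is installed.

```perl
use strict;
use warnings;
use Sort::Key::Merger qw(nkeymerger);

# Illustrative data (an assumption, not from the module's documentation):
# each source is an array ref that is already sorted numerically.
my @sources = ([1, 4, 9], [2, 3, 10], [5]);

my $merger = nkeymerger {
    # $_ is aliased to the current source; $_[0] is its scratchpad,
    # used here to remember the next index to read
    my $i = defined $_[0] ? $_[0] : 0;
    return () if $i >= @$_;    # empty list: this source is exhausted
    $_[0] = $i + 1;
    my $n = $_->[$i];
    return ($n, $n);           # (key, value): each number is its own key
} @sources;

# list context: all remaining values, merged in key order
my @merged = $merger->();
print "@merged\n";
```

Calling $merger->() in scalar context inside a while (defined(...)) loop, as in the SYNOPSIS, would instead yield one value per call.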
SEE ALSO
Sort::Key, locale, integer, perl core sort function.
AUTHOR
Salvador Fandiño, <sfandino@yahoo.com>
COPYRIGHT AND LICENSE
Copyright (C) 2005 by Salvador Fandiño.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.