NAME

Search::Tools::HeatMap - locate the best matches in a snippet extract

SYNOPSIS

use Search::Tools::Tokenizer;
use Search::Tools::HeatMap;
    
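# $self is the calling object (for example a Snipper); it is assumed
# here to provide tokenizer() and occur() accessors.
my $tokens = $self->tokenizer->tokenize( $my_string, qr/^(interesting)$/ );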
my $heatmap = Search::Tools::HeatMap->new(
    tokens      => $tokens,
    window_size => 20,
);

if ( $heatmap->has_spans ) {

    # collect the text of each span
    my @snips;
    for my $span ( @{ $heatmap->spans } ) {
        push( @snips, $span->{str} );
    }
    # keep at most $self->occur snippets
    my $occur_index = $self->occur - 1;
    if ( $#snips > $occur_index ) {
        @snips = @snips[ 0 .. $occur_index ];
    }
    printf("%s\n", join( ' ... ', @snips ));
    
}

DESCRIPTION

Search::Tools::HeatMap implements a simple algorithm for locating the densest clusters of unique, hot terms in a TokenList.

HeatMap is used internally by Snipper but documented here in case someone wants to abuse and/or improve it.
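
To make "densest cluster of unique, hot terms" concrete, here is a minimal sketch in plain Perl. It is not the module's implementation: densest_window() is a hypothetical helper, the tokens are plain strings rather than TokenList objects, and the scoring (count of distinct hot terms in a fixed-size window) is an assumption chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Search::Tools: slide a fixed-size window
# over a list of tokens and return the first window that contains the
# largest number of distinct hot terms.
sub densest_window {
    my ( $tokens, $is_hot, $window_size ) = @_;
    my $best       = { start => 0, end => -1, unique => 0 };
    my $last_start = @$tokens - $window_size;
    $last_start = 0 if $last_start < 0;

    for my $start ( 0 .. $last_start ) {
        my $end = $start + $window_size - 1;
        $end = $#$tokens if $end > $#$tokens;

        # count each hot term once per window
        my %seen;
        for my $tok ( @{$tokens}[ $start .. $end ] ) {
            $seen{ lc $tok }++ if $is_hot->{ lc $tok };
        }
        my $unique = scalar keys %seen;

        if ( $unique > $best->{unique} ) {
            $best = { start => $start, end => $end, unique => $unique };
        }
    }
    return $best;
}

my @tokens = qw(the quick brown fox jumps over the lazy dog while the fox naps);
my %hot    = map { $_ => 1 } qw(fox dog);
my $best   = densest_window( \@tokens, \%hot, 6 );

# prints "fox jumps over the lazy dog"
print join( ' ', @tokens[ $best->{start} .. $best->{end} ] ), "\n";
```

The window size here plays the same role as window_size below: it caps how many tokens a single cluster may cover.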

METHODS

new( tokens => TokenList )

Create a new HeatMap. The TokenList object may be either a Search::Tools::TokenList or Search::Tools::TokenListPP object.

init

Builds the HeatMap object. Called internally by new().

window_size

The maximum width of a span, in tokens, including the matching tokens. Defaults to 20.

Set this in new(). Access it later if you need to, but the spans will have already been created by new().

spans

Returns an array ref of matching clusters. Each span in the array is a hash ref with the following keys:

cluster
pos
heat
str
str_w_pos
unique
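
Because each span is a plain hash ref, callers can reorder or filter the list before building snippets. The sketch below uses made-up spans rather than real HeatMap output, and it assumes that heat and unique hold numbers; the value types are not documented in this section, so check them against actual output before relying on this.

```perl
use strict;
use warnings;

# Made-up spans for illustration only. Real spans come from $heatmap->spans.
# Only the heat, unique and str keys are used, and treating heat and unique
# as numeric is an assumption.
my $spans = [
    { heat => 2, unique => 1, str => 'an interesting aside' },
    { heat => 5, unique => 2, str => 'the most interesting part' },
    { heat => 3, unique => 2, str => 'another interesting bit' },
];

# Order by unique-term count first, then by heat, highest first.
my @ranked = sort {
           $b->{unique} <=> $a->{unique}
        || $b->{heat}   <=> $a->{heat}
} @$spans;

# Keep at most two snippets.
my $max = 2;
$max = @ranked if $max > @ranked;
my @snips = map { $_->{str} } @ranked[ 0 .. $max - 1 ];

# prints "the most interesting part ... another interesting bit"
print join( ' ... ', @snips ), "\n";
```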

has_spans

Returns the number of spans found, which is 0 (false) when there are none.

AUTHOR

Peter Karman <karman at cpan dot org>

ACKNOWLEDGEMENTS

The idea of the HeatMap comes from KinoSearch, though the implementation here is original.

BUGS

Please report any bugs or feature requests to bug-search-tools at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Search-Tools. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

perldoc Search::Tools

COPYRIGHT

Copyright 2009 by Peter Karman.

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO

KinoSearch