NAME

Text::Same::FileChunkedSource

DESCRIPTION

Objects of this class represent a source of chunks (generally lines) in a file. The "chunks" could potentially be paragraphs or sentences.

SYNOPSIS

my $source = new Text::Same::FileChunkedSource(chunks => \@chunks);

METHODS

See below. Methods private to this module are prefixed by an underscore.

new

Title   : new
Usage   : $source = new Text::Same::FileChunkedSource(chunks => \@chunks);
Function: Creates a new FileChunkedSource object from an array of chunks
Returns : A Text::Same::FileChunkedSource object
Args    : chunks - a reference to an array of strings

name

Title   : name
Usage   : my $name = $source->name();
Function: return the name of this source - generally the filename

get_all_chunks

Title   : get_all_chunks
Usage   : $all_chunks = $source->get_all_chunks;
Function: return (in order) the chunks from this source

get_chunk_by_indx

Title   : get_chunk_by_indx
Usage   : $chunk = $source->get_chunk_by_indx($indx);
Function: return the chunk/line at the given index in this source

get_all_chunks_count

Title   : get_all_chunks_count
Usage   : $count = $source->get_all_chunks_count;
Function: return the number of chunks in this source

get_filtered_chunk_indexes

Title   : get_filtered_chunk_indexes
Usage   : $filtered_chunk_indexes = $source->get_filtered_chunk_indexes($options);
Function: return (in order) the indexes of the chunks from this source that
          remain after applying the given options:
           ignore_case=> (0 or 1)    -- ignore case when comparing
           ignore_blanks=> (0 or 1)  -- ignore blank lines when comparing
           ignore_space=> (0 or 1)   -- ignore whitespace in chunks

get_matching_chunk_indexes

Title   : get_matching_chunk_indexes
Usage   : $matches = $source->get_matching_chunk_indexes($options, $text);
Function: return (in order) the indexes of the chunks from this source that
          match the given text.
          options:
           ignore_case=> (0 or 1)    -- ignore case when comparing
           ignore_blanks=> (0 or 1)  -- ignore blank lines when comparing
           ignore_space=> (0 or 1)   -- ignore whitespace in chunks
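
The effect of the three options on matching can be sketched in plain Perl. This is a minimal illustration, not the module's implementation: normalise_chunk and matching_indexes are hypothetical helpers. The sketch also makes three assumptions: ignore_space strips all whitespace, a match means equality after normalisation, and the indexes returned are real (unfiltered) indexes.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::Same): normalise a chunk
# according to the ignore_case and ignore_space options.
sub normalise_chunk {
  my ($options, $chunk) = @_;
  $chunk = lc $chunk if $options->{ignore_case};
  $chunk =~ s/\s+//g if $options->{ignore_space};
  return $chunk;
}

# Hypothetical helper: return, in order, the indexes of the chunks that
# equal $text once both sides have been normalised.
sub matching_indexes {
  my ($options, $chunks, $text) = @_;
  my $wanted = normalise_chunk($options, $text);
  my @matches;
  for my $indx (0 .. $#{$chunks}) {
    my $chunk = $chunks->[$indx];
    # Skip blank chunks entirely when ignore_blanks is set.
    next if $options->{ignore_blanks} && $chunk =~ /^\s*$/;
    push @matches, $indx if normalise_chunk($options, $chunk) eq $wanted;
  }
  return \@matches;
}

my @chunks = ("Some Text", "", "some  text", "other");
my $matches = matching_indexes({ ignore_case => 1, ignore_space => 1 },
                               \@chunks, "some text");
print "@{$matches}\n";   # prints "0 2"
```

With both options off, only a chunk that is byte-for-byte equal to the text would match.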

_get_filtered_indx_from_real

Title   : _get_filtered_indx_from_real
Usage   : $indx = $source->_get_filtered_indx_from_real($options, $real_indx);
Function: for the given index (e.g. line number) in this source, return the
          corresponding index in the list of chunks generated by applying the
          $options.  For example if $options->{ignore_blanks} is true the
          filtered chunks will contain no blank lines.

For example, given the input lines:

  some text on line 0
  <blank line>
  <blank line>
  some text on line 3

the real index of "some text on line 3" is 3, but the filtered index is 1 if
ignore_blanks is set, because the filtered lines are:

  some text on line 0
  some text on line 3
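
The mapping in this example can be reproduced with a short standalone sketch. The helpers filtered_indexes and filtered_indx_from_real are hypothetical and only model ignore_blanks. Returning undef for a chunk that was filtered out is an assumption, not documented behaviour of the module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::Same): list the real indexes of
# the chunks that survive filtering.  Only ignore_blanks is modelled.
sub filtered_indexes {
  my ($options, $chunks) = @_;
  return [ grep { !($options->{ignore_blanks} && $chunks->[$_] =~ /^\s*$/) }
           0 .. $#{$chunks} ];
}

# Hypothetical helper: map a real index to its position in the filtered
# list, or return undef (in scalar context) if that chunk was filtered out.
sub filtered_indx_from_real {
  my ($options, $chunks, $real_indx) = @_;
  my $filtered = filtered_indexes($options, $chunks);
  for my $filtered_indx (0 .. $#{$filtered}) {
    return $filtered_indx if $filtered->[$filtered_indx] == $real_indx;
  }
  return;
}

my @lines = ("some text on line 0", "", "", "some text on line 3");
my $indx = filtered_indx_from_real({ ignore_blanks => 1 }, \@lines, 3);
print "$indx\n";   # prints "1"
```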

get_previous_chunk_indx

Title   : get_previous_chunk_indx
Usage   : $prev_chunk_indx =
              $source->get_previous_chunk_indx($options, $chunk_indx);
Function: return the previous chunk index from the list of filtered chunk
          indexes (for the given $options).  See the discussion of filtered
          indexes under _get_filtered_indx_from_real.

get_next_chunk_indx

Title   : get_next_chunk_indx
Usage   : $next_chunk_indx =
              $source->get_next_chunk_indx($options, $chunk_indx);
Function: return the next chunk index from the list of filtered chunk
          indexes (for the given $options).  See the discussion of filtered
          indexes under _get_filtered_indx_from_real.
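
One way to picture stepping through the filtered list is the standalone sketch below. It assumes that both the argument and the result are real indexes, and that undef is returned at either end of the list. Neither assumption is confirmed here, and the helper names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Text::Same): real indexes of the
# non-blank chunks when ignore_blanks is set, otherwise every index.
sub filtered_indexes {
  my ($options, $chunks) = @_;
  return [ grep { !($options->{ignore_blanks} && $chunks->[$_] =~ /^\s*$/) }
           0 .. $#{$chunks} ];
}

# Hypothetical helper: nearest filtered index strictly before $chunk_indx,
# or undef if there is none.
sub previous_chunk_indx {
  my ($options, $chunks, $chunk_indx) = @_;
  my @before = grep { $_ < $chunk_indx } @{ filtered_indexes($options, $chunks) };
  return @before ? $before[-1] : undef;
}

# Hypothetical helper: nearest filtered index strictly after $chunk_indx,
# or undef if there is none.
sub next_chunk_indx {
  my ($options, $chunks, $chunk_indx) = @_;
  my @after = grep { $_ > $chunk_indx } @{ filtered_indexes($options, $chunks) };
  return @after ? $after[0] : undef;
}

my @lines = ("some text on line 0", "", "", "some text on line 3");
my $options = { ignore_blanks => 1 };
my $next = next_chunk_indx($options, \@lines, 0);
my $prev = previous_chunk_indx($options, \@lines, 3);
print "$next $prev\n";   # prints "3 0"
```

With ignore_blanks set, the two blank lines are skipped in both directions.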

AUTHOR

Kim Rutherford <kmr+same@xenu.org.uk>

COPYRIGHT & LICENSE

Copyright 2005,2006 Kim Rutherford. All rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

DISCLAIMER

This module is provided "as is" without warranty of any kind. It may be redistributed under the same conditions as Perl itself.