NAME
Search::Indexer::Incremental::MD5 - Incrementally index your files
SYNOPSIS
use File::Find::Rule ;
use Readonly ;
use Carp ; # provides croak
Readonly my $DEFAUT_MAX_FILE_SIZE_INDEXING_THRESHOLD => 300 << 10 ; # 300KB
my $indexer
= Search::Indexer::Incremental::MD5::Indexer->new
(
USE_POSITIONS => 1,
INDEX_DIRECTORY => 'text_index',
get_perl_word_regex_and_stop_words(),
) ;
my @files = File::Find::Rule
->file()
->name( '*.pm', '*.pod' )
->size( "<=$DEFAUT_MAX_FILE_SIZE_INDEXING_THRESHOLD" )
->not_name(qr[auto | unicore | DateTime/TimeZone | DateTime/Locale]x)
->in('.') ;
$indexer->add_files(@files) ;
$indexer->add_files(@more_files) ;
$indexer = undef ;
my $search_string = 'find_me' ;
my $searcher =
eval
{
Search::Indexer::Incremental::MD5::Searcher->new
(
USE_POSITIONS => 1,
INDEX_DIRECTORY => 'text_index',
get_perl_word_regex_and_stop_words(),
)
} or croak "No full text index found! $@\n" ;
my $results = $searcher->search($search_string) ;
# sort in decreasing score order
my @indexes = map { $_->[0] }
reverse
sort { $a->[1] <=> $b->[1] }
map { [$_, $results->[$_]{SCORE}] }
0 .. $#$results ;
for (@indexes)
{
print "$results->[$_]{PATH} [$results->[$_]{SCORE}].\n" ;
}
$searcher = undef ;
DESCRIPTION
This module implements an incremental text indexer and searcher based on Search::Indexer.
DOCUMENTATION
Given a list of files, this module will allow you to create an indexed text database that you can later query for matches. You can also use the siim command line application installed with this module.
SUBROUTINES/METHODS
delete_indexing_databases($index_directory)
Removes all the index databases from the passed directory.
Arguments
$index_directory - location of the index databases
Returns - Nothing
Exceptions - Can't remove index databases.
get_file_MD5($file)
Returns the MD5 of the $file argument.
Arguments
$file - the file to compute the MD5 for
Returns - A string containing the file's MD5
Exceptions - Fails if the file can't be opened
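As an illustration of what computing a file's MD5 involves, here is a minimal sketch using the core Digest::MD5 module. The helper name file_md5_hex is hypothetical, and this is not necessarily how get_file_MD5 is implemented; it only shows the technique and the failure mode when the file can't be opened.

```perl
use strict ;
use warnings ;
use Carp ;
use Digest::MD5 ;

# Hypothetical helper, for illustration only:
# returns the hexadecimal MD5 of a file's content
sub file_md5_hex
{
    my ($file) = @_ ;

    open(my $fh, '<', $file) or croak "Can't open '$file': $!" ;
    binmode($fh) ; # digest the raw bytes, without newline translation

    my $md5 = Digest::MD5->new->addfile($fh)->hexdigest() ;
    close($fh) ;

    return $md5 ;
}
```

Calling file_md5_hex('lib/Some/Module.pm') returns a 32 character hexadecimal string, or croaks if the file is missing or unreadable.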
new( %named_arguments)
Create a Search::Indexer::Incremental::MD5::Indexer object.
my $indexer = Search::Indexer::Incremental::MD5::Indexer->new(%named_arguments) ;
Arguments - %named_arguments
Returns - A Search::Indexer::Incremental::MD5::Indexer object
Exceptions -
Incomplete argument list
Error creating index directory
Error creating index metadata database
Error creating a Search::Indexer object
add_files(%named_arguments)
Adds the contents of the files passed as arguments to the index database. Files that are already indexed are checked and re-indexed only if their content has changed.
Arguments %named_arguments
- FILES - Array reference - a list of files to add to the index
- DONE_ONE_FILE_CALLBACK - sub reference - called every time a file is handled
Returns - Hash reference keyed on the file name
STATE - Boolean -
TIME - Float - re-indexing time
Exceptions
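The re-indexing rule described for add_files (index a file again only if its content has changed) can be sketched as a digest comparison. This is a minimal, self-contained illustration and not the module's code: the %seen hash stands in for the index metadata database, and needs_indexing is a hypothetical helper.

```perl
use strict ;
use warnings ;
use Digest::MD5 qw(md5_hex) ;

# Hypothetical stand-in for the index metadata:
# file name => MD5 of the content at indexing time
my %seen ;

# Returns true when $content is new for $name or differs from what was
# recorded last time, and records the new digest in that case
sub needs_indexing
{
    my ($seen, $name, $content) = @_ ;

    my $md5 = md5_hex($content) ;
    return 0 if exists $seen->{$name} && $seen->{$name} eq $md5 ;

    $seen->{$name} = $md5 ;
    return 1 ;
}

# first sighting, unchanged content, then modified content
for my $content ("package A ;\n", "package A ;\n", "package A ; 1 ;\n")
{
    print needs_indexing(\%seen, 'A.pm', $content) ? "index\n" : "skip\n" ;
}
```

The loop prints index, skip, index: only the first sighting and the modified content trigger indexing.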
new( %named_arguments)
Create a Search::Indexer::Incremental::MD5::Searcher object.
my $searcher = Search::Indexer::Incremental::MD5::Searcher->new(%named_arguments) ;
Arguments - %named_arguments
Returns - A Search::Indexer::Incremental::MD5::Searcher object
Exceptions -
Incomplete argument list
Error creating index directory
Error opening index metadata database
Error creating a Search::Indexer object
search(%named_arguments)
Searches the index database for the query passed in SEARCH_STRING.
Arguments %named_arguments
- SEARCH_STRING - Query string; see Search::Indexer for the query syntax
Returns - Array reference - each entry contains
SCORE - the score obtained by the file when applying the query
PATH - the path to the file
MD5 - the file MD5 when the indexing was done
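Because each entry is a hash reference with a SCORE key, ordering the results is a plain sort. The sketch below uses made-up entries shaped like the return value documented above (the paths, scores, and MD5 strings are invented for illustration) and sorts them by decreasing score.

```perl
use strict ;
use warnings ;

# Made-up data shaped like the array reference returned by search()
my $results =
    [
    {PATH => 'lib/A.pm',  SCORE => 3, MD5 => 'aaa'},
    {PATH => 'lib/B.pm',  SCORE => 7, MD5 => 'bbb'},
    {PATH => 'lib/C.pod', SCORE => 5, MD5 => 'ccc'},
    ] ;

# highest score first
my @by_score = sort { $b->{SCORE} <=> $a->{SCORE} } @{$results} ;

print "$_->{PATH} [$_->{SCORE}]\n" for @by_score ;
```

Sorting the entries themselves, rather than their indexes, avoids the extra map and reverse steps when only the ordered list is needed.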
BUGS AND LIMITATIONS
None so far.
AUTHOR
Nadim ibn hamouda el Khemir
CPAN ID: NH
mailto: nadim@cpan.org
LICENSE AND COPYRIGHT
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Search::Indexer::Incremental::MD5
You can also look for information at:
AnnoCPAN: Annotated CPAN documentation
RT: CPAN's request tracker
Please report any bugs or feature requests to bug-search-indexer-incremental-md5@rt.cpan.org.
We will be notified, and then you'll automatically be notified of progress on your bug as we make changes.
Search CPAN