NAME
Plucene::Simple - An interface to Plucene
SYNOPSIS
use Plucene::Simple;
# create an index
my $plucy = Plucene::Simple->open($index_path);
# add to the index
$plucy->add(
$id1 => { $field => $term1 },
$id2 => { $field => $term2 },
);
# or ...
$plucy->index_document($id => $data);
# search an existing index
my $plucy = Plucene::Simple->open($index_path);
my @results = $plucy->search($search_string);
# optimize the index
$plucy->optimize;
# remove something from the index
$plucy->delete_document($id);
DESCRIPTION
This provides a simple interface to Plucene. Plucene is large and multi-featured, and it is expected that users will subclass it and tie all the pieces together to suit their own needs. Plucene::Simple is, therefore, just one way to use Plucene. It's not expected that it will do exactly what *you* want, but you can always use it as an example of how to build your own interface.
INDEXING
open
You make a new Plucene::Simple object like so:
my $plucy = Plucene::Simple->open($index_path);
If this index doesn't exist, it will be created for you; otherwise you will be adding to an existing one.
Then you can add your documents to the index:
add
Every document must be indexed with a unique key (which will be returned from searches).
A document can be made up of many fields, which can be added as a hashref:
$plucy->add($key, \%data);
$plucy->add(
chap1 => {
title => "Moby-Dick",
author => "Herman Melville",
text => "Call me Ishmael ..."
},
chap2 => {
title => "Boo-Hoo",
author => "Lydia Lee",
text => "...",
}
);
index_document
Alternatively, if you do not want to index lots of metadata, but rather just simple text, you can use the index_document() method.
$plucy->index_document($key, $data);
$plucy->index_document(chap1 => 'Call me Ishmael ...');
delete_document
$plucy->delete_document($id);
This removes the document that was indexed under the given key from the index.
optimize
$plucy->optimize;
Plucene is set up to perform insertions quickly. After a batch of inserts it is good to optimize() the index for better search speed.
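As a rough sketch of the insert-then-optimize pattern described above, the helper below adds a whole batch of documents and optimizes once at the end. The name index_batch() is hypothetical and not part of Plucene::Simple; it only assumes that the object passed in provides the add() and optimize() methods documented here.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Plucene::Simple): add a batch of
# documents, then optimize once rather than after every insert.
# $plucy is assumed to be an object returned by Plucene::Simple->open.
sub index_batch {
    my ($plucy, %docs) = @_;

    $plucy->add(%docs);    # id => \%fields pairs, as accepted by add()
    $plucy->optimize;      # a single optimize after all the inserts

    return scalar keys %docs;    # number of documents handed to add()
}
```

It might be called as index_batch($plucy, chap1 => { text => 'Call me Ishmael ...' }), where $plucy comes from Plucene::Simple->open($index_path).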
SEARCHING
search
my @ids = $plucy->search('ishmael');
# ("chap1", ...)
This will return the IDs of each document matching the search term.
If you have indexed your documents with fields, you can also search with the field name as a prefix:
my @ids = $plucy->search("author:lee");
# ("chap2", ...)
my @results = $plucy->search($search_string);
This will search the index with the given query, and return a list of document ids.
Searches can be much more powerful than this - see Plucene for further details.
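To illustrate the field-prefix form shown above, here is a minimal sketch that builds a "field:term" query string from its two parts. The helper field_query() is hypothetical and not part of Plucene::Simple. It deliberately accepts only single-word field names and terms, because anything richer depends on the query syntax documented in Plucene itself.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Plucene::Simple): build a
# "field:term" query string such as "author:lee".
sub field_query {
    my ($field, $term) = @_;

    # Only plain single words are handled in this sketch.
    for my $part ($field, $term) {
        die "Expected a single word, got '$part'\n"
            unless $part =~ /\A\w+\z/;
    }

    return "$field:$term";
}
```

The resulting string can then be passed to search(), for example $plucy->search(field_query(author => 'lee')).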
search_during
my @results = $plucy->search_during($search_string, $start_date, $end_date);
my @results = $plucy->search_during("to:Fred", "2001-01-01" => "2003-12-31");
If your documents were given an ISO 'date' field when indexing, search_during() will restrict the results to all documents between the specified dates. Any document without a 'date' field will be ignored.
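Since search_during() relies on ISO dates, it can help to check the date arguments before searching. The wrapper below is a sketch under stated assumptions: search_between() is a hypothetical name, and the YYYY-MM-DD form is assumed from the example above rather than taken from a documented requirement.

```perl
use strict;
use warnings;

# Hypothetical wrapper (not part of Plucene::Simple): validate the date
# range, then delegate to the documented search_during() method.
sub search_between {
    my ($plucy, $query, $start, $end) = @_;

    # Assumption: dates are written as YYYY-MM-DD, as in the example above.
    for my $date ($start, $end) {
        die "Not an ISO date (YYYY-MM-DD): $date\n"
            unless $date =~ /\A\d{4}-\d{2}-\d{2}\z/;
    }

    # ISO dates of this form sort correctly as plain strings.
    die "Start date $start is after end date $end\n" if $start gt $end;

    return $plucy->search_during($query, $start, $end);
}
```

For the dates to be searchable, each document needs a 'date' field at indexing time, for example $plucy->add(msg1 => { to => 'Fred', date => '2002-05-01' }).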
COPYRIGHT
Copyright (C) 2003-2004 Kasei Limited