NAME
KSx::Simple - Basic search engine.
SYNOPSIS
First, build an index of your documents.
    my $index = KSx::Simple->new(
        path     => '/path/to/index/',
        language => 'en',
    );
    while ( my ( $title, $content ) = each %source_docs ) {
        $index->add_doc({
            title   => $title,
            content => $content,
        });
    }
Later, search the index.
    my $total_hits = $index->search(
        query      => $query_string,
        offset     => 0,
        num_wanted => 10,
    );
    print "Total hits: $total_hits\n";
    while ( my $hit = $index->next ) {
        print "$hit->{title}\n";
    }
DESCRIPTION
KSx::Simple is a stripped-down interface for the KinoSearch search engine library.
METHODS
new
    my $index = KSx::Simple->new(
        path     => '/path/to/index/',
        language => 'en',
    );
Create a KSx::Simple object, which can be used for both indexing and searching. Two hash-style parameters are required.
path - Where the index directory should be located. If no index is found at the specified location, one will be created.
language - The language of the documents in your collection, indicated by a two-letter ISO code. 12 languages are supported:
|-----------------------|
| Language   | ISO code |
|-----------------------|
| Danish     | da       |
| Dutch      | nl       |
| English    | en       |
| Finnish    | fi       |
| French     | fr       |
| German     | de       |
| Italian    | it       |
| Norwegian  | no       |
| Portuguese | pt       |
| Spanish    | es       |
| Swedish    | sv       |
| Russian    | ru       |
|-----------------------|
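Because language is required and only the codes in the table are listed as supported, it can help to check a code before calling new. The following is a minimal sketch, not part of KSx::Simple: the helper name is_supported_language is hypothetical, and the list of codes is simply copied from the table above.

```perl
use strict;
use warnings;

# ISO codes copied from the table above (not read from KSx::Simple itself).
my %SUPPORTED_LANGUAGE = map { $_ => 1 }
    qw( da nl en fi fr de it no pt es sv ru );

# Hypothetical helper: returns 1 if $code appears in the table, 0 otherwise.
sub is_supported_language {
    my ($code) = @_;
    return 0 unless defined $code;
    return exists $SUPPORTED_LANGUAGE{$code} ? 1 : 0;
}
```

A caller might run this check on user-supplied configuration and report an unknown code before attempting to build or open an index.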
add_doc
    $index->add_doc({
        location => $url,
        title    => $title,
        content  => $content,
    });
Add a document to the index. The document must be supplied as a hashref, with field names as keys and content as values.
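As an illustration of the hashref form, the sketch below adds a whole collection in one pass. The helper add_all_docs is hypothetical and not part of KSx::Simple; it assumes $index is a KSx::Simple object (any object with an add_doc method would do) and that the source documents are held in a hash keyed by title, as in the SYNOPSIS.

```perl
use strict;
use warnings;

# Hypothetical helper: add every title/content pair in %$docs to $index.
# Assumes $index responds to add_doc, as a KSx::Simple object does.
sub add_all_docs {
    my ( $index, $docs ) = @_;
    my $count = 0;

    # Sort the titles so documents are added in a repeatable order.
    for my $title ( sort keys %$docs ) {
        $index->add_doc({
            title   => $title,
            content => $docs->{$title},
        });
        $count++;
    }
    return $count;    # number of documents handed to add_doc
}
```

Calling add_all_docs( $index, \%source_docs ) would then replace the while loop shown in the SYNOPSIS.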
search
    my $total_hits = $index->search(
        query      => $query_string,    # required
        offset     => 40,               # default 0
        num_wanted => 20,               # default 10
    );
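Search the index. Returns the total number of documents which match the query; this is a count of all matches, so it is unlikely to equal num_wanted. The hits themselves are fetched one at a time with next, as shown in the SYNOPSIS.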
query - A search query string.
offset - The number of most-relevant hits to discard, typically used when "paging" through hits N at a time. Setting offset to 20 and num_wanted to 10 retrieves hits 21-30, assuming that 30 hits can be found.
num_wanted - The number of hits you would like to see after offset is taken into account.
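The relationship between offset and num_wanted can be sketched as a small paging helper. This is an illustration under stated assumptions, not part of KSx::Simple: fetch_page is a hypothetical name, $page is taken to be 1-based, and $index is assumed to provide search and next as shown in the SYNOPSIS.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch one page of hits.
# $page is 1-based; $per_page defaults to 10, matching num_wanted's default.
sub fetch_page {
    my ( $index, $query, $page, $per_page ) = @_;
    $per_page = 10 unless defined $per_page;

    # Page 1 discards no hits, page 2 discards $per_page hits, and so on.
    my $offset = ( $page - 1 ) * $per_page;

    my $total_hits = $index->search(
        query      => $query,
        offset     => $offset,
        num_wanted => $per_page,
    );

    # Drain the hits for this page; next returns false when none remain.
    my @hits;
    while ( my $hit = $index->next ) {
        push @hits, $hit;
    }
    return ( $total_hits, \@hits );
}
```

With this sketch, fetch_page( $index, $query_string, 3, 10 ) searches with offset 20 and num_wanted 10, which corresponds to hits 21-30 in the description of offset above.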
BUGS
Not thread-safe.
COPYRIGHT AND LICENSE
Copyright 2007-2011 Marvin Humphrey
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.