NAME
SWISH::API::Common - SWISH Document Indexing Made Easy
SYNOPSIS
use
SWISH::API::Common;
my
$swish
= SWISH::API::Common->new();
# Index all files in a directory and its subdirectories
$swish
->
index
(
"/usr/local/share/doc"
);
# After indexing once (it's persistent), fire up as many
# queries as you like:
# Search documents containing both "swish" and "install"
for
my
$hit
(
$swish
->search(
"swish AND install"
)) {
$hit
->path(),
"\n"
;
}
DESCRIPTION
SWISH::API::Common
offers an easy interface to the Swish index engine. While SWISH::API offers a complete API, SWISH::API::Common
focusses on ease of use.
THIS MODULE IS CURRENTLY UNDER DEVELOPMENT. THE API MIGHT CHANGE AT ANY TIME.
Currently, SWISH::API::Common
just allows for indexing documents in a single directory and any of its subdirectories. Also, don't run index() and search() in parallel yet.
INSTALLATION
SWISH::API::Common
requires SWISH::API
and the swish engine to be installed. Please download the latest release from
and untar it, type
./configure
make
make install
and then install SWISH::API which is contained in the distribution:
cd perl
perl Makefile.PL
make
make install
METHODS
- $sw = SWISH::API::Common->new()
-
Constructor. Takes many options, but the defaults are usually fine.
Available options and their defaults:
# Where SWISH::API::Common stores index files etc.
swish_adm_dir
"$ENV{HOME}/.swish-common"
# The path to swish-e, relative is OK
swish_exe
"swish-e"
# Swish index file
swish_idx_file
"$self->{swish_adm_dir}/default.idx"
# Swish configuration file
swish_cnf_file
"$self->{swish_adm_dir}/default.cnf"
# SWISH Stemming
swish_fuzzy_indexing_mode
=>
"Stemming_en"
# Maximum amount of data (in bytes) extracted
# from a single file
file_len_max 100_000
# Preserve every indexed file's atime
atime_preserve
- $sw->index($dir, ...)
-
Generate a new index of all text documents under directory
$dir
. One or more directories can be specified. - $sw->search("foo AND bar");
-
Searches the index, using the given search expression. Returns a list hits, which can be asked for their path:
# Search documents containing
# both "foo" and "bar"
for
my
$hit
(
$swish
->search(
"foo AND bar"
)) {
print
$hit
->path(),
"\n"
;
}
- index_remove
-
Permanently delete the current index.
TODO List
* More than one
index
directory
* Remove documents from
index
* Iterator
for
search hits
LEGALESE
Copyright 2005 by Mike Schilli, all rights reserved. This program is free software, you can redistribute it and/or modify it under the same terms as Perl itself.
AUTHOR
2005, Mike Schilli <cpan@perlmeister.com>