NAME
Apache::Solr - Apache Solr (Lucene) extension
INHERITANCE
Apache::Solr is extended by
Apache::Solr::JSON
Apache::Solr::XML
SYNOPSIS
my $solr = Apache::Solr->new(...);
my $doc = Apache::Solr::Document->new(...);
my $r = $solr->addDocument($doc);
$r or die;
$r = $solr->select(q => 'author:mark');
print $r->selected(0)->{doc}{author};
# based on Log::Report, hence
use Log::Report;
dispatcher SYSLOG => 'default'; # now all warnings/error to syslog
DESCRIPTION
Solr is a stand-alone full-text search engine with loads of features. Its main component is Lucene. This module tries to provide a high-level interface to access the data.
BE WARNED: this code is very new! Please help me improve this code by reporting bugs and suggesting improvements.
METHODS
Constructors
- Apache::Solr->new(OPTIONS)
-
The location of the Solr server depends on the way the Java environment is set up. The URL is either a URI object or a string which can be instantiated as such.
-Option          --Default
 agent             <created internally>
 autocommit        true
 core              undef
 format            'XML'
 server_version    <latest>
- agent => LWP::UserAgent object
-
Agent which implements the communication between this client and the Solr server. When you have multiple Apache::Solr objects in your program, you may want to share this agent, to share the connection. Do not forget to install LWP::Protocol::https if you need to connect via https.
- autocommit => BOOLEAN
-
Commit all changes immediately unless specified differently.
- core => NAME
-
Sets the default core name for this client. When there is no core name specified, the core is selected by the server or already part of the URL.
You probably want a core dedicated for testing and one for the live environment.
- format => 'XML'|'JSON'
-
Communication format between client and server. You may also instantiate one of the extensions directly.
- server_version => VERSION
-
The version of the server software. Defaults to the latest version, currently 4.0.
Accessors
- $obj->agent()
-
Returns the LWP::UserAgent object which maintains the connection to the server.
- $obj->autocommit([BOOLEAN])
- $obj->core([CORE])
-
Returns the CORE; when not defined, the default core as set by new(core). May return undef.
- $obj->server([URI|STRING])
-
Returns the URI object which refers to the server base address. You need to clone() it before modifying. You may set a new value as STRING or URI object.
- $obj->serverVersion()
-
Returns the specified version of the Solr server software (by default the latest). Treat this version as a string, to avoid rounding errors.
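To illustrate why the version is best kept as a string, here is a minimal sketch in plain Perl. The helper compare_versions() is hypothetical and not part of Apache::Solr; it assumes versions consist of numeric components separated by dots.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Apache::Solr: compare two dotted
# version strings one numeric component at a time.
sub compare_versions {
    my ($left, $right) = @_;
    my @l = split /\./, $left;
    my @r = split /\./, $right;
    while (@l || @r) {
        my $x = shift(@l) // 0;    # a missing component counts as 0
        my $y = shift(@r) // 0;
        return $x <=> $y if $x != $y;
    }
    return 0;
}

# As numbers, 4.10 and 4.1 are indistinguishable ...
print "numeric comparison: equal\n" if 4.10 == 4.1;

# ... while the strings keep the difference.
print "string comparison: different\n" if '4.10' ne '4.1';
print "4.10 is newer than 4.2\n" if compare_versions('4.10', '4.2') > 0;
```

A call such as compare_versions($solr->serverVersion, '3.4') could then decide whether a feature is available, assuming the accessor returns a plain dotted string.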
Commands
Search
- $obj->select(PARAMETERS)
-
Find information in the document collection.
This method has a HUGE number of parameters. These values are passed in the URI of the HTTP query to the Solr server. See expandSelect() for all the simplifications offered here. Sets of these parameters may need configuration help in the server as well.
Updates
See http://wiki.apache.org/solr/UpdateXmlMessages. Missing are the atomic updates.
- $obj->addDocument(DOC|ARRAY, OPTIONS)
-
Add one or more documents (Apache::Solr::Document objects) to the Solr database on the server.
-Option              --Default
 allowDups             <false>
 commit                <autocommit>
 commitWithin          undef
 overwrite             <true>
 overwriteCommitted    <not allowDups>
 overwritePending      <not allowDups>
- allowDups => BOOLEAN
-
[deprecated since Solr 1.1??] Use option overwrite.
- commit => BOOLEAN
- commitWithin => SECONDS
-
[Since Solr 3.4] Automatically translated into 'commit' for older servers. Currently, the resolution is milliseconds.
- overwrite => BOOLEAN
- overwriteCommitted => BOOLEAN
-
[deprecated since Solr 1.1??]
- overwritePending => BOOLEAN
-
[deprecated since Solr 1.1??]
- $obj->commit(OPTIONS)
-
-Option          --Default
 expungeDeletes    <false>
 softCommit        <false>
 waitFlush         <true>
 waitSearcher      <true>
- $obj->delete(OPTIONS)
-
Remove one or more documents, based on id or query.
-Option         --Default
 commit           <autocommit>
 fromCommitted    true
 fromPending      true
 id               undef
 query            undef
- commit => BOOLEAN
-
When specified, it indicates whether to commit (update the indexes) after the last delete. By default the value of new(autocommit).
- fromCommitted => BOOLEAN
-
[deprecated since ?]
- fromPending => BOOLEAN
-
[deprecated since ?]
- id => ID|ARRAY-of-IDs
-
The expected content of the uniqueKey fields (usually named id) for the documents to be removed.
- query => QUERY|ARRAY-of-QUERYs
- $obj->optimize(OPTIONS)
-
-Option        --Default
 maxSegments     1
 softCommit      <false>
 waitFlush       <true>
 waitSearcher    <true>
- $obj->rollback()
-
[solr 1.4]
Queries
- $obj->queryTerms(TERMS)
-
Search for often used terms. See http://wiki.apache.org/solr/TermsComponent
TERMS are passed to expandTerms() before being used.
Be warned: The result is not sorted when XML communication is used, even when you explicitly request it.
example:
my $r = $self->queryTerms(fl => 'subject', limit => 100);
if($r->success)
{   foreach my $hit ($r->terms('subject'))
    {   my ($term, $count) = @$hit;
        print "term=$term, count=$count\n";
    }
}

if(my $r = $self->queryTerms(fl => 'subject', limit => 100))
...
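Because the result is not sorted when XML communication is used, you may want to sort the hits on the client side. The following is a minimal sketch, not part of this module: sort_hits_by_count() is a hypothetical helper, and the sample data is made up to mimic the [term, count] pairs unpacked in the example above.

```perl
use strict;
use warnings;

# Hypothetical helper: order [term, count] pairs by descending count,
# breaking ties alphabetically on the term.
sub sort_hits_by_count {
    my @hits = @_;
    my @sorted = sort { $b->[1] <=> $a->[1] || $a->[0] cmp $b->[0] } @hits;
    return @sorted;
}

# Made-up sample data; real hits would come from $r->terms('subject').
my @hits = (['perl', 12], ['solr', 40], ['lucene', 12]);

foreach my $hit (sort_hits_by_count(@hits)) {
    my ($term, $count) = @$hit;
    print "term=$term, count=$count\n";
}
```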
Parameters
Many parameters are passed to the server. The syntax of the communication protocol is not optimal for the end-user: it is too verbose and depends on the Solr server version.
General rules:
you can leave out the prefix
use underscores as an alternative to dots: less quoting needed
boolean values in Perl will get translated into 'true' and 'false'
when passed as an ARRAY (or LIST), the order of the parameters is preserved
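The general rules above can be sketched in a few lines of plain Perl. This is an illustration only, not the code used by this module: sketch_expand() is a hypothetical function, the set of boolean parameter names is passed in explicitly as an assumption, and the underscore translation is naive (it would also rewrite underscores inside field names).

```perl
use strict;
use warnings;

# Hypothetical sketch of the general rules, not the module's implementation.
#   $prefix      e.g. 'terms'
#   $pairs       ARRAY of key => value pairs; the order is preserved
#   $is_boolean  HASH naming the expanded parameters which are booleans
sub sketch_expand {
    my ($prefix, $pairs, $is_boolean) = @_;
    my @todo = @$pairs;
    my @expanded;
    while (@todo) {
        my ($key, $value) = splice @todo, 0, 2;
        $key =~ tr/_/./;                       # underscores stand in for dots
        $key = "$prefix.$key"                  # add the prefix when left out
            unless $key =~ /^\Q$prefix\E\./;
        if ($is_boolean->{$key} && ($value // '') !~ /^(?:true|false)$/) {
            $value = $value ? 'true' : 'false';    # Perl boolean to text
        }
        push @expanded, $key => $value;
    }
    return @expanded;
}

my @t = sketch_expand(
    terms => [lower_incl => 1, fl => 'subject', limit => 100],
    { 'terms.lower.incl' => 1 },
);

my @query;
while (my ($k, $v) = splice @t, 0, 2) { push @query, "$k=$v" }
print join('&', @query), "\n";
```

Passing the fully written 'terms.lower.incl' => 'true' to this sketch yields the same pair as lower_incl => 1.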
- $obj->expandSelect(PAIRS)
-
facet http://wiki.apache.org/solr/SimpleFacetParameters
hl (highlight) http://wiki.apache.org/solr/HighlightingParameters
mlt http://wiki.apache.org/solr/MoreLikeThis
stats http://wiki.apache.org/solr/StatsComponent
group http://wiki.apache.org/solr/FieldCollapsing
example:
my @r = $solr->expandSelect
  ( q => 'inStock:true', rows => 10
  , facet => {limit => -1, field => [qw/cat inStock/], mincount => 1}
  , f_cat_facet => {missing => 1}
  , hl    => {}
  , f_cat_hl => {}
  , mlt   => { fl => 'manu,cat', mindf => 1, mintf => 1 }
  , stats => { field => [ 'price', 'popularity' ] }
  , group => { query => 'price:[0 TO 99.99]', limit => 3 }
  );

# becomes (one line)
...?rows=10&q=inStock:true
 &facet=true&facet.limit=-1&facet.field=cat
 &f.cat.facet.missing=true&facet.mincount=1&facet.field=inStock
 &mlt=true&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1
 &stats=true&stats.field=price&stats.field=popularity
 &group=true&group.query=price:[0+TO+99.99]&group.limit=3
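How a nested group such as stats => { field => [...] } turns into flat query parameters can be sketched as follows. This is an assumption-laden illustration, not the module's code: sketch_flatten_group() is a hypothetical function, keys are emitted in sorted order only to make the output repeatable, and no URL escaping is done.

```perl
use strict;
use warnings;

# Hypothetical sketch: flatten  group => { key => VALUE|ARRAY }  into
# "group=true" followed by one "group.key=value" pair per value.
sub sketch_flatten_group {
    my ($group, $settings) = @_;
    my @pairs = ($group => 'true');
    foreach my $key (sort keys %$settings) {    # sorted for repeatable output
        my $value  = $settings->{$key};
        my @values = ref $value eq 'ARRAY' ? @$value : ($value);
        push @pairs, "$group.$key" => $_ for @values;
    }
    return @pairs;
}

my @pairs = sketch_flatten_group(stats => { field => ['price', 'popularity'] });

my @query;
while (my ($k, $v) = splice @pairs, 0, 2) { push @query, "$k=$v" }
print join('&', @query), "\n";
```

For the stats group this prints the same fragment as shown in the expansion above: stats=true&stats.field=price&stats.field=popularity.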
- $obj->expandTerms(PAIRS|ARRAY)
-
example:
my @t = $solr->expandTerms('terms.lower.incl' => 'true');
my @t = $solr->expandTerms([lower_incl => 1]);   # same

my $r = $self->queryTerms(fl => 'subject', limit => 100);
Helpers
- $obj->deprecated(MESSAGE)
- $obj->endpoint(ACTION, OPTIONS)
-
Compute the address to be called (for HTTP)
-Option    --Default
 core        new(core)
 params      []
- $obj->ignored(MESSAGE)
DETAILS
Comparison with other implementations
Compared to WebService::Solr
WebService::Solr is a good module, with a lot of miles. The main difference is that Apache::Solr has much more abstraction.
simplified parameter syntax, improving readability
real Perl-level boolean parameters, not 'true' and 'false'
warnings for deprecated and ignored parameters
smart result object with built-in trace and timing
hidden paging of results
flexible logging framework
both-way XML or both-way JSON, not requests in XML and answers in JSON
access to plugins like terms
SEE ALSO
This module is part of Apache-Solr distribution version 0.90, built on December 03, 2012. Website: http://perl.overmeer.net
LICENSE
Copyrights 2012 by [Mark Overmeer]. For other contributors see ChangeLog.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See http://www.perl.com/perl/misc/Artistic.html