NAME
TM::Index - Topic Maps, Generic Indexing support
SYNOPSIS
# this package only provides generic functionality
# see TM::Index::* for specific indices
DESCRIPTION
One performance bottleneck when using the TM package or any of its subclasses is the pair of low-level query functions match_forall and match_exists. They look for assertions of a certain nature. Almost all high-level functions, and certainly TM::QL, use these.
This package (or rather its subclasses) provides an indexing mechanism to speed up the match_* functions by caching some results in a very specific way. When an index is attached to a map, it will intercept all queries going to these functions.
Open vs. Closed Index
There are two options:
open
The default is to keep the index lazy. In this mode the index is empty at the start and learns more and more on its own. In this sense, the index lives under an open world assumption (hence the name), as the absence of information does not mean that there is no result.
closed
A closed world index has to be populated to be useful. If a query is launched and its result is stored in the index, then that result will be used, as with an open index. If no result is found in the index for a query, the empty result will be assumed.
Map Attachment
To activate an index, you have to attach it to a map. This is done at constructor time.
It is possible (though of uncertain usefulness) to have one particular index attached to several different maps. It is also possible to have several TM::Index::* indices attached to one map. They are then consulted in the order of attachment:
my $idx1 = new TM::Index::Whatever ($tm);
my $idx2 = new TM::Index::Whatever2 ($tm);
If $idx1 cannot help, then $idx2 is tried.
NOTE: If you use several indices for the same map, then all of them MUST be declared as being open. If one of them were closed, it would give a definite answer and would make the machinery not look further into other indices. This implies that you will have to populate your index explicitly.
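The reason for this rule can be illustrated with a minimal sketch in plain Perl. It is not the TM::Index machinery: the functions ask_index and consult and the hash layout are assumptions made purely for illustration. An index that cannot help answers undef, while a closed index answers a miss with an empty list, which counts as a definite answer.

```perl
use strict;
use warnings;

# Hypothetical model: each index is a hash with a cache and a 'closed' flag.
sub ask_index {
    my ($index, $key) = @_;
    return $index->{cache}{$key} if exists $index->{cache}{$key};
    # Miss: a closed index claims "no results", an open one says "don't know".
    return $index->{closed} ? [] : undef;
}

# Consult the indices in attachment order; the first defined answer wins.
# Returns undef if no index could help, i.e. the map itself must be searched.
sub consult {
    my ($key, @indices) = @_;
    for my $index (@indices) {
        my $answer = ask_index($index, $key);
        return $answer if defined $answer;
    }
    return;
}

my $second = { closed => 0, cache => { k => ['a7'] } };

# Both open: the first index passes, the second one answers.
my $ok = consult('k', { closed => 0, cache => {} }, $second);

# First index closed and unpopulated: its empty answer hides the second index.
my $shadowed = consult('k', { closed => 1, cache => {} }, $second);

print scalar(@$ok), " result(s) vs. ", scalar(@$shadowed), " result(s)\n";
```

In this model the closed first index returns an empty list for a key it has never seen, so the second index is never consulted even though it holds the result.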
Hash Technology
The default implementation uses an in-memory hash, nothing fancy. Optionally, you can provide your own hash object, including one which is tied to a DBM file, etc.
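As a rough sketch of what "your own hash object" could look like, the following ties a hash to a small custom class built on the core Tie::StdHash. The class name CountingHash and its counter are hypothetical and only serve the illustration. The constructor call is shown as a comment, since it needs a map in $tm.

```perl
use strict;
use warnings;

# Hypothetical custom hash: counts how often a result is stored.
package CountingHash;
require Tie::Hash;                 # Tie::StdHash lives in Tie/Hash.pm
our @ISA    = ('Tie::StdHash');
our $stores = 0;

sub STORE {
    my ($self, $key, $value) = @_;
    $stores++;
    $self->{$key} = $value;
}

package main;

tie my %cache, 'CountingHash';

# With a map in $tm, the hash would be handed over like this
# (the 'cache' option is described in the Constructor section):
#   my $idx = new TM::Index::Match ($tm, cache => \%cache);

# Simulate what an index would do when it stores a result.
$cache{'some-key'} = ['a1', 'a2'];
print "stores so far: $CountingHash::stores\n";
```

Note that plain DBM files store only strings. Since the cached values are list references, a DBM-backed hash would presumably need a serializing layer (MLDBM is one such module) in between.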
INTERFACE
Constructor
The only mandatory parameter for the constructor is the map to which this index should apply. The map must be an instance of TM or any of its subclasses; otherwise an exception is raised.
Optional parameters are:
closed (default: 0)
This controls whether the index operates under closed or open world assumptions. If it is specified to be closed, the method populate will be triggered at the end of the constructor.
cache (default: {})
You can optionally pass in your own HASH reference.
Example:
my $idx = new TM::Index::Match ($tm);
NOTE: When the index object goes out of scope, the destructor will make the index detach itself from the map. Unfortunately, the exact moment when this happens is not well defined in Perl, so it is better to detach manually at the end.
Example:
{
    my $idx2 = new TM::Index::Match ($tm, closed => 1);
    ....
} # destructor called and index detaches automatically, but only in theory
{
    my $idx2 = new TM::Index::Match ($tm, closed => 1);
    ....
    $idx2->detach; # better do things yourself
}
Methods
- attach
-
$idx->attach
This method attaches the index to the configured map. Normally you will not call this, as the attachment is done implicitly at constructor time.
- detach
-
$idx->detach
Makes the index detach safely from the map. The map is not harmed in this process. The index itself is not destroyed; it is just deactivated and no longer used together with the map.
- discard
-
$idx->discard
This throws away the index content.
- is_cached
-
$bool = $idx->is_cached ($key)
Given a key parameter, the cache is consulted to see whether it already has a result for this key. If it has no result, a closed index will return a reference to an empty list, whereas an open index will give back undef.
- do_cache
-
$idx->do_cache ($key, $list_ref)
Given a key and a list reference, it will store the list reference in the cache under that key.
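How these two methods might work together can be sketched as follows. MockIndex is a hypothetical stand-in that only mimics the documented behaviour of is_cached and do_cache; slow_match and cached_match are likewise invented names, with slow_match standing in for an expensive low-level query. None of this is the TM::Index code itself.

```perl
use strict;
use warnings;

# Hypothetical stand-in that mimics the two documented methods.
package MockIndex;

sub new {
    my ($class, %opt) = @_;
    return bless { closed => 0, cache => {}, %opt }, $class;
}

sub is_cached {
    my ($self, $key) = @_;
    return $self->{cache}{$key} if exists $self->{cache}{$key};
    return $self->{closed} ? [] : undef;    # miss: empty list or "don't know"
}

sub do_cache {
    my ($self, $key, $list) = @_;
    $self->{cache}{$key} = $list;
    return;
}

package main;

my $slow_calls = 0;

# Placeholder for an expensive low-level query.
sub slow_match {
    my ($key) = @_;
    $slow_calls++;
    return ["assertion-for-$key"];
}

# Ask the index first; fall back to the slow query and remember its result.
sub cached_match {
    my ($idx, $key) = @_;
    my $cached = $idx->is_cached($key);
    return $cached if defined $cached;
    my $result = slow_match($key);
    $idx->do_cache($key, $result);
    return $result;
}

my $idx = MockIndex->new;
cached_match($idx, 'k1') for 1 .. 3;
print "slow query ran $slow_calls time(s)\n";
```

With an open mock index, only the first of the three calls reaches the slow query; the remaining two are answered from the cache.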
SEE ALSO
COPYRIGHT AND LICENSE
Copyright 200[6] by Robert Barta, <drrho@cpan.org>
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.