NAME

TM::Index - Topic Maps, Generic Indexing support

SYNOPSIS

# this package only provides generic functionality
# see TM::Index::* for specific indices

DESCRIPTION

One performance bottleneck when using the TM package or any of its subclasses is the pair of low-level query functions match_forall and match_exists, which look for assertions of a certain nature. Almost all high-level functions, and certainly TM::QL, use these.

This package (or rather, its subclasses) provides an indexing mechanism to speed up the match_* functions by caching some results in a very specific way. When an index is attached to a map, it intercepts all queries going to these functions.

Open vs. Closed Index

There are two options:

open:

The default is to keep the index lazy. In this mode the index is empty at the start and it will learn more and more on its own. In this sense, the index lives under an open world assumption (hence the name), as the absence of information does not mean that there is no result.

closed:

A closed world index has to be populated to be useful. If a query is launched and its result is stored in the index, that result will be used, as with an open index. If no result is found in the index for a query, the empty result will be assumed.
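The difference between the two modes only shows on a cache miss. The following is a minimal sketch of that behaviour, assuming a plain hash as cache; the helper lookup and the keys are hypothetical and not part of TM::Index.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of TM::Index: look up $key in a plain
# hash cache under either world assumption.
sub lookup {
    my ($cache, $closed, $key) = @_;
    return $cache->{$key} if exists $cache->{$key};   # hit: same in both modes
    return $closed ? [] : undef;                      # miss: "nothing" vs. "unknown"
}

# Made-up cache content for illustration only
my %cache = ('type:person' => ['a1', 'a2']);

my $open_miss   = lookup(\%cache, 0, 'type:city');    # undef: the map must be asked
my $closed_miss = lookup(\%cache, 1, 'type:city');    # []: taken as the final answer
my $hit         = lookup(\%cache, 1, 'type:person');  # cached list in either mode

print defined $open_miss ? "open: answered\n" : "open: unknown, fall back to the map\n";
print "closed: ", scalar @$closed_miss, " results, no fallback\n";
```

An open index can therefore never hide results that exist in the map, while a closed index is only as complete as its population step.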

Map Attachment

To activate an index, you have to attach it to a map. This is done at constructor time.

It is possible (though it is unclear how useful this is) to attach one particular index to several different maps. It is also possible to have several TM::Index::* indices attached to one map. They are then consulted in the order of attachment:

my $idx1 = new TM::Index::Whatever  ($tm);
my $idx2 = new TM::Index::Whatever2 ($tm);

If $idx1 cannot help, then $idx2 is tried.

NOTE: If you use several indices for the same map, then all of them MUST be declared as being open. If one of them were closed, it would give a definite answer and stop the machinery from looking further into the other indices. This implies that you will have to populate your index explicitly.
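Why a closed index breaks such a chain can be illustrated with a small sketch. It assumes each attached index is just a hash with a mode flag and a cache; the function consult and the data are hypothetical and only mimic the consultation order described above.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for two attached indices, in attachment order
my @indices = (
    { name => 'idx1', closed => 0, cache => {} },
    { name => 'idx2', closed => 0, cache => { 'q1' => ['a7'] } },
);

# Consult the indices in attachment order; the first definite answer wins.
sub consult {
    my ($key, @chain) = @_;
    for my $idx (@chain) {
        my $hit = $idx->{cache}{$key};
        $hit = [] if !defined $hit && $idx->{closed};   # closed: a miss is an answer
        return ($idx->{name}, $hit) if defined $hit;
    }
    return;   # nobody knows: the map itself has to be queried
}

my ($who_open) = consult('q1', @indices);               # idx1 passes, idx2 answers

$indices[0]{closed} = 1;                                # now pretend idx1 is closed
my ($who_closed, $result) = consult('q1', @indices);    # idx1 answers [] and hides idx2

print "all open: $who_open; idx1 closed: $who_closed\n";
```

With idx1 closed, the result cached in idx2 is never reached, which is the situation the note warns about.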

Hash Technology

The default implementation uses an in-memory hash, nothing fancy. Optionally, you can provide your own hash object, including one which is tied to a DBM file, etc.

INTERFACE

Constructor

The only mandatory parameter for the constructor is the map to which this index should apply. The map must be an instance of TM or any of its subclasses, otherwise an exception is raised.

Optional parameters are

closed (default: 0)

This controls whether the index operates under closed or open world assumptions. If it is specified to be closed, the method populate will be triggered at the end of the constructor.

cache (default: {})

You can optionally pass in your own HASH reference.

Example:

my $idx = new TM::Index::Match ($tm);

NOTE: When the index object goes out of scope, the destructor will make the index detach itself from the map. Unfortunately, the exact moment when this happens is somewhat undefined in Perl, so it is better to detach manually at the end.

Example:

{
    my $idx2 = new TM::Index::Match ($tm, closed => 1);
    ....
} # destructor called and index detaches automatically, but only in theory

{
    my $idx2 = new TM::Index::Match ($tm, closed => 1);
    ....
    $idx2->detach; # better do things yourself
}

Methods

attach

$idx->attach

This method attaches the index to the configured map. Normally you will not call this as the attachment is implicitly done at constructor time.

detach

$idx->detach

Makes the index detach safely from the map. The map is not harmed in this process. The index itself is not destroyed either; it is just deactivated and no longer used together with the map.

discard

$idx->discard

This throws away the index content.

is_cached

$bool = $idx->is_cached ($key)

Given a key parameter, the cache is consulted whether it already has a result for this key. If it has no result, a closed index will return a reference to the empty list, whereas an open index will give back undef.

do_cache

$idx->do_cache ($key, $list_ref)

Given a key and a list reference, this stores the list reference in the cache under that key.
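How is_cached and do_cache work together can be sketched as a check-then-store wrapper around an expensive query. The class SketchIndex and the functions slow_match and cached_match below are hypothetical stand-ins written for illustration; they follow the behaviour described above but are not the actual TM::Index code.

```perl
use strict;
use warnings;

package SketchIndex;   # hypothetical stand-in, NOT the real TM::Index

sub new {
    my ($class, %opts) = @_;
    my $self = { closed => $opts{closed} || 0, cache => $opts{cache} || {} };
    return bless $self, $class;
}

sub is_cached {
    my ($self, $key) = @_;
    return $self->{cache}{$key} if exists $self->{cache}{$key};
    return $self->{closed} ? [] : undef;    # closed: empty answer, open: unknown
}

sub do_cache {
    my ($self, $key, $list_ref) = @_;
    return $self->{cache}{$key} = $list_ref;
}

package main;

my $calls = 0;
sub slow_match { $calls++; return [ "assertion-for-$_[0]" ] }   # pretend map query

# Ask the index first; only on "unknown" run the query and remember the result.
sub cached_match {
    my ($idx, $key) = @_;
    my $hit = $idx->is_cached($key);
    return $hit if defined $hit;
    return $idx->do_cache($key, slow_match($key));
}

my $idx = SketchIndex->new;                 # open by default
cached_match($idx, 'k1') for 1 .. 3;
print "slow_match ran $calls time(s)\n";
```

In this sketch the open index learns the result on the first call, so the two later calls are served from the cache.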

SEE ALSO

TM, TM::Index::Match

COPYRIGHT AND LICENSE

Copyright 200[6] by Robert Barta, <drrho@cpan.org>

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.