NAME
DiaColloDB::Relation - diachronic collocation db, relation API (abstract & utilities)
SYNOPSIS
##========================================================================
## PRELIMINARIES
use DiaColloDB::Relation;
##========================================================================
## Constructors etc.
$rel = $CLASS_OR_OBJECT->new(%args);
##========================================================================
## Relation API: create
$rel = $CLASS_OR_OBJECT->create($coldb, $tokdat_file, %opts);
##========================================================================
## Relation API: union
$rel = $CLASS_OR_OBJECT->union($coldb, \@pairs, %opts);
##========================================================================
## Relation API: profile
$mprf = $rel->profile($coldb, %opts);
##========================================================================
## Relation API: comparison (diff)
$mpdiff = $rel->compare($coldb, %opts);
$mpdiff = $rel->diff($coldb, %opts);
##========================================================================
## Relation API: default
$prf = $rel->subprofile1(\@xids, %opts);
\%slice2prf = $rel->subprofile2(\%slice2prf, %opts);
##========================================================================
## Relation API: default: qinfo()
\%qinfo = $rel->qinfo($coldb, %opts);
(\@q1strs,\@q2strs,\@qxstrs,\@fstrs) = $rel->qinfoData($coldb,%opts);
DESCRIPTION
DiaColloDB::Relation is a base class for low-level indices capable of returning raw frequency data suitable for constructing DiaColloDB::Profile::Multi objects. In addition to the API specification, the DiaColloDB::Relation package also provides several common utility methods used by native DiaColloDB index types.
Globals & Constants
- Variable: @ISA
-
DiaColloDB::Relation inherits from DiaColloDB::Persistent.
Constructors etc.
- new
-
$rel = $CLASS_OR_OBJECT->new(%args);
%args, object structure: nothing here, see subclass documentation for details.
Relation API: create
- create
-
$rel = $CLASS_OR_OBJECT->create($coldb, $tokdat_file, %opts);
Populates the relation database from $tokdat_file, a tt-style text file containing one token-id per line, with optional blank lines. %opts: clobber %$rel.
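To make the input format concrete, the following is a minimal sketch of reading a file laid out as described above (one token-id per line, blank lines allowed). It is not the module's own reader: the in-memory sample data, the variable names, and the decision to simply skip blank lines are all assumptions made for illustration.

```perl
use strict;
use warnings;

## Hypothetical in-memory stand-in for $tokdat_file:
## one token-id per line, with optional blank lines.
my $tokdat = "3\n7\n7\n\n2\n3\n\n\n5\n";

open(my $fh, '<', \$tokdat) or die "open failed: $!";
my %freq;                      ## token-id => occurrence count
my ($ntok, $nblank) = (0, 0);
while (defined(my $line = <$fh>)) {
  chomp $line;
  if ($line =~ /^\s*$/) {      ## blank line: counted, then skipped (assumption)
    ++$nblank;
    next;
  }
  die "unexpected line '$line'" if $line !~ /^[0-9]+$/;
  ++$freq{$line};
  ++$ntok;
}
close($fh);

print "tokens=$ntok blanks=$nblank\n";
print "f($_)=$freq{$_}\n" for sort { $a <=> $b } keys %freq;
```

A real subclass would typically do more with the token stream than count unigrams; see the subclass documentation for what each index actually stores.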
Relation API: union
- union
-
$rel = $CLASS_OR_OBJECT->union($coldb, \@pairs, %opts);
Merges multiple co-frequency indices into a new object.
@pairs: array of pairs ([$argrel,\@xi2u],...), each consisting of a relation object $argrel and a tuple-id map \@xi2u which maps tuple-ids of $argrel to tuple-ids of $rel.
%opts: clobber %$rel.
Implicitly flushes the new index.
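The role of the tuple-id maps can be illustrated with a toy model. In the sketch below each argument "relation" is reduced to a plain array of frequencies indexed by its local tuple-id, and each map is read as $xi2u->[$local_id] = $union_id. Both the data layout and the summing strategy are assumptions for illustration only, not the storage format or algorithm of any DiaColloDB subclass.

```perl
use strict;
use warnings;

## Toy stand-ins (hypothetical): frequencies indexed by local tuple-id
my @rel_a = (4, 1, 2);   ## local ids 0..2
my @rel_b = (5, 3);      ## local ids 0..1

## pairs ([$argrel, \@xi2u], ...): $xi2u->[$local_id] = $union_id
my @pairs = (
  [\@rel_a, [0, 2, 1]],
  [\@rel_b, [2, 3]],
);

my @union;               ## frequencies indexed by union tuple-id
foreach my $pair (@pairs) {
  my ($argrel, $xi2u) = @$pair;
  foreach my $xi (0 .. $#$argrel) {
    ## map the local id to its union id and accumulate the count
    $union[ $xi2u->[$xi] ] += $argrel->[$xi];
  }
}
$_ //= 0 for @union;     ## union ids that received no counts

print join(' ', @union), "\n";
```

Here local id 1 of the first toy relation and local id 0 of the second both map to union id 2, so their counts are added together.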
Relation API: profile
- profile
-
$mprf = $rel->profile($coldb, %opts);
Get a relation-specific profile for selected items as a DiaColloDB::Profile::Multi object; called by DiaColloDB::profile().
%opts:
##-- selection parameters
query => $query,     ##-- target request ATTR:REQ...
date => $date1,      ##-- string or array or range "MIN-MAX" (inclusive) : default=all
##
##-- aggregation parameters
slice => $slice,     ##-- date slice (default=1, 0 for global profile)
groupby => $groupby, ##-- string or array "ATTR1[:HAVING1] ...": default=$coldb->attrs; see groupby() method
##
##-- scoring and trimming parameters
eps => $eps,         ##-- smoothing constant (default=0)
score => $func,      ##-- scoring function (f|fm|lf|lfm|mi|ld) : default="f"
kbest => $k,         ##-- return only $k best collocates per date (slice) : default=-1:all
cutoff => $cutoff,   ##-- minimum score
global => $bool,     ##-- trim profiles globally (vs. locally for each date-slice?) (default=0)
##
##-- profiling and debugging parameters
strings => $bool,    ##-- do/don't stringify (default=do)
fill => $bool,       ##-- if true, returned multi-profile will have null profiles inserted for missing slices
onepass => $bool,    ##-- if true, use fast but incorrect 1-pass method (default=0; Cofreqs subclass only)
The default implementation calls $rel->subprofile1() for every requested date-slice, then calls $rel->subprofile2() to compute item2 frequencies, and finally collects the result in a DiaColloDB::Profile::Multi object.
Default values for %opts should be set by a higher-level call, e.g. DiaColloDB::profile().
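The two-pass control flow described above can be sketched with plain Perl data structures. Everything in this example is a toy stand-in: the %F12 and %F2 tables, the free-standing subs, and the hash layout of each profile are assumptions, and the real methods take different arguments (see the signatures in this document) and return DiaColloDB profile objects.

```perl
use strict;
use warnings;

## Toy data (hypothetical): joint counts f12 and independent counts f2,
## keyed by date-slice and then by collocate.
my %F12 = (1900 => {haus => 3, hof => 1}, 1910 => {haus => 2});
my %F2  = (1900 => {haus => 10, hof => 4}, 1910 => {haus => 7});

## pass 1: joint frequencies (f12) for a single slice
sub subprofile1 {
  my ($slice) = @_;
  return { f12 => { %{ $F12{$slice} || {} } }, f2 => {} };
}

## pass 2: independent frequencies (f2) for every collocate found in pass 1
sub subprofile2 {
  my ($slice2prf) = @_;
  foreach my $slice (keys %$slice2prf) {
    my $prf = $slice2prf->{$slice};
    $prf->{f2}{$_} = $F2{$slice}{$_} for keys %{ $prf->{f12} };
  }
  return $slice2prf;
}

## driver: pass 1 once per requested slice, then pass 2 over all slices
sub profile {
  my (@slices) = @_;
  my %slice2prf;
  $slice2prf{$_} = subprofile1($_) for @slices;
  return subprofile2(\%slice2prf);
}

my $mprf = profile(1900, 1910, 1920);
foreach my $slice (sort { $a <=> $b } keys %$mprf) {
  my $prf = $mprf->{$slice};
  my @items = map { "$_ f12=$prf->{f12}{$_} f2=$prf->{f2}{$_}" } sort keys %{ $prf->{f12} };
  print "$slice: @items\n";
}
```

Splitting the work this way means that f2 only has to be looked up for collocates that actually co-occur with the target in pass 1; the 1920 slice has no data in this toy example and therefore yields an empty profile.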
Relation API: comparison (diff)
- compare
-
$mpdiff = $rel->compare($coldb, %opts);
Get a relation-specific comparison profile for selected items as a DiaColloDB::Profile::MultiDiff object.
%opts:
##-- selection parameters
(a|b)?query => $query,  ##-- target query as for parseRequest()
(a|b)?date => $date1,   ##-- string or array or range "MIN-MAX" (inclusive) : default=all
##
##-- aggregation parameters
groupby => $groupby,    ##-- string or array "ATTR1[:HAVING1] ...": default=$coldb->attrs; see groupby() method
(a|b)?slice => $slice,  ##-- date slice (default=1, 0 for global profile)
##
##-- scoring and trimming parameters
eps => $eps,            ##-- smoothing constant (default=0)
score => $func,         ##-- scoring function (f|fm|lf|lfm|mi|ld) : default="f"
kbest => $k,            ##-- return only $k best collocates per date (slice) : default=-1:all
cutoff => $cutoff,      ##-- minimum score
global => $bool,        ##-- trim profiles globally (vs. locally for each date-slice?) (default=0)
diff => $diff,          ##-- low-level score-diff operation (diff|adiff|sum|min|max|avg|havg); default='adiff'
##
##-- profiling and debugging parameters
strings => $bool,       ##-- do/don't stringify (default=do)
onepass => $bool,       ##-- if true, use fast but incorrect 1-pass profiling method (default=0)
##
##-- subclass abstraction parameters
_gbparse => $bool,      ##-- if true (default), 'groupby' clause will be parsed only once, using $coldb->groupby() method
_abkeys => \@abkeys,    ##-- additional key-suffixes KEY s.t. (KEY=>VAL) gets passed to profile() calls if e.g. (aKEY=>VAL) is in %opts
The default implementation just wraps the profile() method; default values for %opts should be set by a higher-level call, e.g. DiaColloDB::compare().
- diff
-
$mpdiff = $rel->diff($coldb, %opts);
alias for compare()
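Since compare() wraps profile(), the (a|b)?KEY options have to be resolved into one plain option set per operand before each underlying profile() call. The sketch below shows one way this could be done. The helper name operand_opts(), the placeholder query strings, and the precedence rule (a prefixed key such as aquery overrides the bare key query) are assumptions for illustration and are not taken from the module's source.

```perl
use strict;
use warnings;

## Hypothetical helper: derive the option set for one operand ('a' or 'b').
## Assumption: a prefixed key (e.g. 'aquery') overrides the bare key ('query').
sub operand_opts {
  my ($prefix, $abkeys, %opts) = @_;
  my %out = %opts;
  foreach my $key (@$abkeys) {
    delete @out{"a$key", "b$key"};   ## drop all prefixed variants
    $out{$key} = $opts{"$prefix$key"} if exists $opts{"$prefix$key"};
  }
  return %out;
}

my @abkeys = qw(query date slice);
my %opts = (
  aquery => 'QUERY_A',               ## placeholder query strings
  bquery => 'QUERY_B',
  date   => '1900-1999',             ## shared by both operands
  bslice => 0,                       ## only operand b uses a global profile
  score  => 'ld',
);

my %aopts = operand_opts('a', \@abkeys, %opts);
my %bopts = operand_opts('b', \@abkeys, %opts);

print "a: ", join(', ', map { "$_=$aopts{$_}" } sort keys %aopts), "\n";
print "b: ", join(', ', map { "$_=$bopts{$_}" } sort keys %bopts), "\n";
```

Options without an a/b variant, such as score, pass through unchanged to both operands.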
Relation API: default
- subprofile1
-
$prf = $rel->subprofile1(\@xids, %opts)
Native index API low-level pass-1 profiling function for joint frequency acquisition (f12); default implementation just throws an error.
- subprofile2
-
\%slice2prf = $rel->subprofile2(\%slice2prf, %opts)
Native index API low-level pass-2 profiling function for independent frequency acquisition (f2); default implementation just returns \%slice2prf, which is appropriate for relations which use a 1-pass strategy to populate $prf->{f2} in subprofile1().
Relation API: default: qinfo()
- qinfo
-
\%qinfo = $rel->qinfo($coldb, %opts);
Gets the query-info hash for profile administrivia (DDC KWIC links). %opts: as for profile(), additionally:
qreqs => \@areqs,   ##-- as returned by $coldb->parseRequest($opts{query})
gbreq => \%groupby, ##-- as returned by $coldb->groupby($opts{groupby})
- qinfoData
-
(\@q1strs,\@q2strs,\@qxstrs,\@fstrs) = $rel->qinfoData($coldb,%opts);
Parses @opts{qw(qreqs gbreq)} into conditions on w1, w2 and metadata filters (for DDC linkup). Call this from subclass qinfo() methods.
AUTHOR
Bryan Jurish <moocow@cpan.org>
COPYRIGHT AND LICENSE
Copyright (C) 2015-2016 by Bryan Jurish
This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.14.2 or, at your option, any later version of Perl 5 you may have available.
SEE ALSO
DiaColloDB::Persistent(3pm), DiaColloDB::Relation::Cofreqs(3pm), DiaColloDB::Relation::Unigrams(3pm), DiaColloDB::Relation::TDF(3pm), DiaColloDB::Relation::DDC(3pm), DiaColloDB(3pm), perl(1), ...