NAME
DiaColloDB::Relation::Cofreqs - diachronic collocation db, profiling relation: native fixed-window co-frequency index
ALIASES
SYNOPSIS
##========================================================================
## PRELIMINARIES
use DiaColloDB::Relation::Cofreqs;
##========================================================================
## Constructors etc.
$cof = $CLASS_OR_OBJECT->new(%args);
##========================================================================
## I/O: open/close
$cof_or_undef = $cof->open($base,$flags);
$cof_or_undef = $cof->close();
$bool = $cof->opened();
##========================================================================
## I/O: header
@keys = $cof->headerKeys();
$bool = $cof->loadHeaderData($hdr);
##========================================================================
## I/O: text
$cof  = $cof->loadTextFh($fh,%opts);
$bool = $cof->saveTextFh($fh,%opts);
##========================================================================
## Relation API: creation
$cof = $CLASS_OR_OBJECT->create($coldb,$tokdat_file,%opts);
$cof = $CLASS_OR_OBJECT->union($coldb, \@pairs, %opts);
##========================================================================
## Relation API: default
\%slice2prf = $cof->subprofile1(\@tids, \%opts);
\%slice2prf = $cof->subprofile2(\%slice2prf, \%opts);
\%slice2prf = $cof->subextend(\%slice2prf, \%opts);
\%qinfo = $cof->qinfo($coldb, %opts);
DESCRIPTION
DiaColloDB::Relation::Cofreqs is a DiaColloDB::Relation subclass for native indices over collocation frequencies within a fixed-length window of context words using a pair of DiaColloDB::PackedFile objects for low-level index data.
Only simple queries expressed as a disjunction of single-term conditions (i.e. those queries which evaluate to a set of term-tuples) are supported. Likewise, only groupby conditions over literal indexed term-attributes are supported.
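The fixed-window counting described above can be illustrated with a short pure-Perl sketch. It is not the module's implementation: count_cofreqs() is a hypothetical helper, the sample data is made up, and it assumes a symmetric window of $dmax tokens on either side, with a blank line acting as a hard boundary as in the tt-style input read by create().

```perl
use strict;
use warnings;

## hypothetical helper, not part of DiaColloDB:
## counts triples ($t1,$d1,$t2) for tokens at most $dmax positions apart,
## never pairing tokens across a blank-line (EOS) boundary
sub count_cofreqs {
  my ($lines, $dmax) = @_;
  my (%f12, @sent);
  my $flush = sub {
    my $last = $#sent;
    for my $i (0..$last) {
      my ($t1, $d1) = @{$sent[$i]};
      my $lo = ($i - $dmax < 0)     ? 0     : $i - $dmax;
      my $hi = ($i + $dmax > $last) ? $last : $i + $dmax;
      for my $j ($lo..$hi) {
        next if $j == $i;                        ## a token is not its own collocate
        ++$f12{join(' ', $t1, $d1, $sent[$j][0])};
      }
    }
    @sent = ();
  };
  for my $line (@$lines) {
    (my $l = $line) =~ s/\R\z//;                 ## strip trailing newline
    if ($l eq '') { $flush->(); next; }          ## blank line: sentence boundary
    push @sent, [split(' ', $l)];                ## [TID, DATE]
  }
  $flush->();                                    ## flush final sentence
  return \%f12;
}

## made-up sample: two "sentences" separated by a blank line
my @tokdat = ("1\t1900\n", "2\t1900\n", "3\t1900\n", "\n", "1\t1900\n", "3\t1900\n");
my $f_d1 = count_cofreqs(\@tokdat, 1);
my $f_d2 = count_cofreqs(\@tokdat, 2);
print "dmax=1: $_\t$f_d1->{$_}\n" for sort keys %$f_d1;
print "dmax=2: $_\t$f_d2->{$_}\n" for sort keys %$f_d2;
```

Widening the window from 1 to 2 adds the pair (1,3) from the first sentence, while the blank line keeps the last token of one sentence from pairing with the first token of the next.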
Globals & Constants
- Variable: @ISA
- 
DiaColloDB::Relation::Cofreqs inherits from DiaColloDB::Relation. 
- Variable: $WANT_XS
- 
Whether to attempt to use optimized DiaColloDB::XS::CofUtils subroutines. Default: undef (use XS if available).
Constructors etc.
- new
- 
$cof = $CLASS_OR_OBJECT->new(%args);

%args, object structure:

 ##-- user options
 class     => $class,     ##-- optional, useful for debugging from header file
 base      => $basename,  ##-- file basename (default=undef:none); use files "${base}.dba1", "${base}.dba2", "${base}.dba3", "${base}.hdr"
 flags     => $flags,     ##-- fcntl flags or open-mode (default='r')
 perms     => $perms,     ##-- creation permissions (default=(0666 &~umask))
 dmax      => $dmax,      ##-- maximum distance for co-occurrences (default=5)
 fmin      => $fmin,      ##-- minimum pair frequency (default=0)
 pack_i    => $pack_i,    ##-- pack-template for IDs (default='N')
 pack_f    => $pack_f,    ##-- pack-template for frequencies (default='N')
 pack_d    => $pack_d,    ##-- pack-template for dates (default='n')
 keeptmp   => $bool,      ##-- keep temporary files? (default=false)
 logCompat => $level,     ##-- log-level for compatibility warnings (default='warn')
 logXS     => $level,     ##-- log-level for XS/PP dispatch (default='trace')
 ##
 ##-- size info (after open() or load())
 size1 => $size1,         ##-- == $r1->size()
 size2 => $size2,         ##-- == $r2->size()
 size3 => $size3,         ##-- == $r3->size()
 ##
 ##-- low-level data
 r1 => $r1,               ##-- pf: [$end2] @ $i1 : constant (logical index)
 r2 => $r2,               ##-- pf: [$end3,$d1,$f1]* @ end2($i1-1)..(end2($i1+1)-1) : sorted by $d1 for each $i1
 r3 => $r3,               ##-- pf: [$i2,$f12]* @ end3($d1-1)..(end3($d1+1)-1) : sorted by $i2 for each ($i1,$d1)
 N  => $N,                ##-- sum($f1)
 version => $version,     ##-- file version, for compatibility checks
- DESTROY
- 
Destructor implicitly calls close(). 
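The pack_i, pack_f, and pack_d options are ordinary Perl pack() templates. As a rough illustration of what the defaults imply, the following sketch packs and unpacks records shaped like the r2 and r3 entries listed for new(). It assumes each record is a plain concatenation of the templates and that $end3 is packed with pack_i; the actual on-disk layout is handled by DiaColloDB::PackedFile and may differ. pack_pair() is a hypothetical helper.

```perl
use strict;
use warnings;

## defaults from new(): 'N' = 32-bit big-endian unsigned, 'n' = 16-bit big-endian unsigned
my ($pack_i, $pack_f, $pack_d) = ('N', 'N', 'n');

## hypothetical helper: pack one [$i2,$f12] pair (r3-style record, layout assumed)
sub pack_pair {
  my ($i2, $f12) = @_;
  return pack($pack_i . $pack_f, $i2, $f12);
}

my $rec        = pack_pair(42, 7);
my ($i2, $f12) = unpack($pack_i . $pack_f, $rec);

## record sizes implied by the default templates (layout assumed, see above)
my $len_r3 = length($rec);                                        ## [$i2,$f12]
my $len_r2 = length(pack($pack_i . $pack_d . $pack_f, 0, 0, 0));  ## [$end3,$d1,$f1]

print "r3 record: $len_r3 bytes (i2=$i2, f12=$f12)\n";
print "r2 record: $len_r2 bytes\n";
```

With these defaults, IDs and frequencies are limited to 4294967295 and dates to 65535; wider templates trade larger index files for more headroom.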
I/O: open/close
- open
- 
$cof_or_undef = $cof->open($base,$flags);
$cof_or_undef = $cof->open($base);
$cof_or_undef = $cof->open();

Opens underlying index files.
- close
- 
$cof_or_undef = $cof->close();

Closes underlying index files. Implicitly calls flush() if index is opened for writing.
- opened
- 
$bool = $cof->opened();

Returns true iff index is opened.
I/O: header
See also DiaColloDB::Persistent.
- headerKeys
- 
@keys = $cof->headerKeys();

Keys to save as header.
- loadHeaderData
- 
$bool = $cof->loadHeaderData($hdr);

Instantiates header data from $hdr; overrides DiaColloDB::Persistent implementation.
I/O: text
- loadTextFh
- 
$cof = $cof->loadTextFh($fh,%opts);

Loads from text file as saved by saveTextFh(); lines of the form:

 N                         ##-- 1 field : N
 FREQ ID1 DATE             ##-- 3 fields: un-collocated portion of f(ID1,DATE)
 FREQ ID1 DATE ID2         ##-- 4 fields: co-frequency pair (ID2 >= 0)
 FREQ ID1 DATE ID2 DATE2   ##-- 5 fields: redundant date (used by create(); DATE2 is ignored)

- supports semi-sorted input: input fh must be sorted numerically by ($i1,$d1), and all $i2 for each ($i1,$d1)-pair must be adjacent (i.e. no intervening $j1 != $i1)
- supports multiple lines for collocation-triples ($i1,$d1,$i2) provided the above conditions hold
- supports loading of $cof->{N} from single-value lines
- uses optimized DiaColloDB::XS::CofUtils::loadTextFhXS if available, otherwise pure-perl fallback loadTextFhPP() in this package.
- %opts: clobber %$cof
- loadTextFile_create
- 
$cof = $cof->loadTextFile_create($fh,%opts);

Backwards-compatible alias for loadTextFh().
- saveTextFh
- 
$bool = $cof->saveTextFh($fh,%opts);

Saves to text filehandle with lines of the form:

 N                   ##-- 1 field : N
 FREQ ID1 DATE       ##-- 3 fields: un-collocated portion of f(ID1,DATE)
 FREQ ID1 DATE ID2   ##-- 4 fields: co-frequency pair (ID2 >= 0)

%opts:

 i2s  => \&CODE,   ##-- code-ref for formatting indices; called as $s=CODE($i)
 i2s1 => \&CODE,   ##-- code-ref for formatting item1 indices (overrides 'i2s')
 i2s2 => \&CODE,   ##-- code-ref for formatting item2 indices (overrides 'i2s')
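The text format used by loadTextFh() and saveTextFh() distinguishes record types by field count. The sketch below shows one way such lines could be dispatched in pure Perl. It is not loadTextFhPP(): parse_cof_line() is a hypothetical helper, the sample lines are made up, and it assumes that repeated ($i1,$d1,$i2) triples are summed.

```perl
use strict;
use warnings;

## hypothetical helper: classify one text line by its number of fields
sub parse_cof_line {
  my ($line) = @_;
  chomp($line);
  my @f = split(' ', $line);
  if    (@f == 1) { return { type=>'N',  N=>$f[0] }; }
  elsif (@f == 3) { return { type=>'f1', f=>$f[0], i1=>$f[1], d1=>$f[2] }; }
  elsif (@f == 4 || @f == 5) {
    ## 5th field (DATE2) is redundant and ignored
    return { type=>'f12', f=>$f[0], i1=>$f[1], d1=>$f[2], i2=>$f[3] };
  }
  return;  ## unrecognized line
}

## made-up sample input
my @lines = ("100\n", "3 0 1900\n", "2 0 1900 5\n", "1 0 1900 5 1900\n", "4 1 1900 0\n");

my ($N, %f12);
for my $line (@lines) {
  my $rec = parse_cof_line($line) or next;
  if    ($rec->{type} eq 'N')   { $N = $rec->{N}; }
  elsif ($rec->{type} eq 'f12') {
    ## assumption: multiple lines for the same triple are summed
    $f12{join(' ', @$rec{qw(i1 d1 i2)})} += $rec->{f};
  }
}

print "N=$N\n";
print "$_ => $f12{$_}\n" for sort keys %f12;
```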
Relation API: creation
- create
- 
$cof = $CLASS_OR_OBJECT->create($coldb,$tokdat_file,%opts);

Populates co-frequency index from $tokdat_file, a tt-style text file with lines of the form:

 TID DATE   ##-- single token
 "\n"       ##-- blank line ~ EOS (hard co-occurrence boundary)

%opts: clobber %$cof.
- union
- 
$cof = $CLASS_OR_OBJECT->union($coldb, \@pairs, %opts);

Merges multiple co-frequency indices into a new object. @pairs is an array of pairs ([$argcof,\@ti2u],...) of co-frequency relations $argcof and tuple-id maps \@ti2u for $argcof. Implicitly flushes the new index.

%opts: clobber %$cof
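The role of the tuple-id maps passed to union() can be illustrated with plain hashes standing in for index objects. This sketch assumes that \@ti2u maps each tuple-ID of an argument index to its ID in the merged index and that frequencies of identical mapped triples are added; union_cofreqs() and the sample data are hypothetical.

```perl
use strict;
use warnings;

## hypothetical stand-in: each "index" is a plain hash { "$i1 $d1 $i2" => $f12 }
sub union_cofreqs {
  my (@pairs) = @_;   ## ([\%argcof, \@ti2u], ...)
  my %union;
  for my $pair (@pairs) {
    my ($argcof, $ti2u) = @$pair;
    for my $key (keys %$argcof) {
      my ($i1, $d1, $i2) = split(' ', $key);
      ## translate argument-local IDs to union IDs; dates are left unchanged
      my $ukey = join(' ', $ti2u->[$i1], $d1, $ti2u->[$i2]);
      $union{$ukey} += $argcof->{$key};
    }
  }
  return \%union;
}

## made-up sample: two small indices with different local ID assignments
my %cof_a = ('0 1900 1' => 2);
my %cof_b = ('0 1900 1' => 3, '1 1900 0' => 1);
my @a2u   = (10, 11);   ## index A: local 0 -> 10, local 1 -> 11
my @b2u   = (11, 10);   ## index B: local 0 -> 11, local 1 -> 10

my $u = union_cofreqs([\%cof_a, \@a2u], [\%cof_b, \@b2u]);
print "$_ => $u->{$_}\n" for sort keys %$u;
```

Here the triple (0,1900,1) of the first index and (1,1900,0) of the second both map to (10,1900,11), so their frequencies are combined in the result.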
Relation API: default
- subprofile1
- 
\%slice2prf = $cof->subprofile1(\@tids,\%opts);

Gets slice-wise co-frequency profile(s) for tuple-IDs @tids. $cof must be opened. %opts: as for DiaColloDB::Relation::subprofile1().
- subprofile2
- 
\%slice2prf = $cof->subprofile2(\%slice2prf, \%opts);

Populates independent collocate frequencies in %slice2prf values. $cof must be opened. %opts: as for DiaColloDB::Relation::subprofile2().
- subextend
- 
\%slice2prf = $cof->subextend(\%slice2prf,\%opts);

Populates independent collocate frequencies in %slice2prf values; wraps subprofile2().
- qinfo
- 
\%qinfo = $cof->qinfo($coldb, %opts);

Gets query-info hash for profile administrivia (ddc hit links). %opts: as for DiaColloDB::Relation::profile(), additionally:

 qreqs => \@qreqs,     ##-- as returned by $coldb->parseRequest($opts{query})
 gbreq => \%groupby,   ##-- as returned by $coldb->groupby($opts{groupby})
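To make the notion of a slice-wise profile more concrete, the following sketch groups co-frequency records into date slices for a set of target tuple-IDs. It is only an approximation under stated assumptions: slice_profiles() is hypothetical, the slice width $slice and the rounding rule int($date/$slice)*$slice are assumed rather than taken from this document, and the returned values are plain hashes, not the profile objects the library uses.

```perl
use strict;
use warnings;

## hypothetical helper: group [$i1,$d1,$i2,$f12] records into date slices
sub slice_profiles {
  my ($records, $tids, $slice) = @_;
  my %want = map { ($_ => 1) } @$tids;
  my %slice2prf;
  for my $rec (@$records) {
    my ($i1, $d1, $i2, $f12) = @$rec;
    next unless $want{$i1};                              ## keep only requested tuple-IDs
    my $s = $slice ? int($d1 / $slice) * $slice : 0;     ## assumed slice rule
    $slice2prf{$s}{$i2} += $f12;
  }
  return \%slice2prf;
}

## made-up sample records: [$i1, $d1, $i2, $f12]
my @records = ([0,1901,5,2], [0,1909,5,1], [0,1912,6,4], [1,1901,5,9]);
my $prf = slice_profiles(\@records, [0], 10);

for my $s (sort { $a <=> $b } keys %$prf) {
  print "$s: $_ => $prf->{$s}{$_}\n" for sort { $a <=> $b } keys %{$prf->{$s}};
}
```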
AUTHOR
Bryan Jurish <moocow@cpan.org>
COPYRIGHT AND LICENSE
Copyright (C) 2015-2020 by Bryan Jurish
This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.14.2 or, at your option, any later version of Perl 5 you may have available.
SEE ALSO
DiaColloDB::Relation(3pm), DiaColloDB::Relation::Unigrams(3pm), DiaColloDB::Relation::TDF(3pm), DiaColloDB::Relation::DDC(3pm), DiaColloDB(3pm), perl(1), ...