NAME

DiaColloDB::Relation::Unigrams - diachronic collocation db, profiling relation: native unigram index

ALIASES

DiaColloDB::Relation::Unigrams
DiaColloDB::Unigrams

SYNOPSIS

##========================================================================
## PRELIMINARIES

use DiaColloDB::Relation::Unigrams;

##========================================================================
## Constructors etc.

$ug = $CLASS_OR_OBJECT->new(%args);

##========================================================================
## API: disk usage

@files = $obj->diskFiles();

##========================================================================
## Relation API: create

$ug = $CLASS_OR_OBJECT->create($coldb,$tokdat_file,%opts);

##========================================================================
## Relation API: union

$ug = $CLASS_OR_OBJECT->union($coldb, \@pairs, %opts);

##========================================================================
## Relation API: default: profiling

$prf = $ug->subprofile(\@xids, %opts);

##========================================================================
## Relation API: default: query info

\%qinfo = $rel->qinfo($coldb, %opts);

DESCRIPTION

DiaColloDB::Relation::Unigrams is a DiaColloDB::Relation subclass for native indices over attribute-tuple unigrams. It uses the DiaColloDB::PackedFile API for low-level index data.

Globals & Constants

Variable: @ISA

DiaColloDB::Relation::Unigrams inherits from DiaColloDB::Relation and DiaColloDB::PackedFile.

Constructors etc.

new
$ug = $CLASS_OR_OBJECT->new(%args);

%args, object structure:

##-- PackedFile: user options
file     => $filename,   ##-- default: undef (none)
flags    => $flags,      ##-- fcntl flags or open-mode (default='r')
perms    => $perms,      ##-- creation permissions (default=(0666 &~umask))
reclen   => $reclen,     ##-- record-length in bytes: (default: guess from pack format if available)
packas   => $packas,     ##-- pack-format or array; see DiaColloDB::Utils::packFilterStore();  ##-- OVERRIDE default='N'
##
##-- PackedFile: filters
filter_fetch => $filter, ##-- DB_File-style filter for fetch
filter_store => $filter, ##-- DB_File-style filter for store
##
##-- PackedFile: low-level data
fh       => $fh,         ##-- underlying filehandle
##
##-- Unigrams: high-level data
N        => $N,          ##-- total frequency
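
As a rough illustration of the packas and reclen options, the following sketch uses only core Perl (no DiaColloDB calls) to show how a record length can be guessed from a fixed-width pack format such as the default 'N', and how a record can then be located by its index. The helpers guess_reclen() and fetch_record() are hypothetical names introduced for this example; they are not part of the DiaColloDB::PackedFile API, and the real implementation may differ.

```perl
use strict;
use warnings;

## hypothetical helper: guess record length (in bytes) from a fixed-width pack format
sub guess_reclen {
  my $packas = shift;
  return length(pack($packas, 0));
}

## hypothetical helper: fetch record number $i from a buffer of packed records
sub fetch_record {
  my ($buf, $i, $packas) = @_;
  my $reclen = guess_reclen($packas);
  return unpack($packas, substr($buf, $i*$reclen, $reclen));
}

my $packas = 'N';                           ##-- 32-bit unsigned integer, big-endian
my $buf    = pack("$packas*", 10, 20, 30);  ##-- three consecutive records
my $reclen = guess_reclen($packas);         ##-- 4 bytes for 'N'
my $second = fetch_record($buf, 1, $packas);
print "reclen=$reclen second=$second\n";    ##-- prints "reclen=4 second=20"
```

Fixed-width records are what make lookup by integer id a simple seek to offset $i*$reclen.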

API: disk usage

diskFiles
@files = $obj->diskFiles();

returns disk storage files, used by du() and timestamp()

Relation API: create

create
$ug = $CLASS_OR_OBJECT->create($coldb,$tokdat_file,%opts);

populates current database from $tokdat_file, a tt-style text file containing one token-id per line, with optional blank lines.

%opts: clobber %$ug, also:

size=>$size,  ##-- set initial size
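
The input format described above can be illustrated with a minimal sketch in core Perl. It is not the create() implementation: it only counts token-ids from a tt-style stream, skips blank lines, and accumulates a total frequency corresponding to the N attribute. The helper count_unigrams() and the sample data are assumptions made for this example.

```perl
use strict;
use warnings;

## hypothetical helper: count token-ids read from a tt-style filehandle
##  + returns (\@freqs, $N), where $freqs[$id] is the frequency of token-id $id
sub count_unigrams {
  my $fh = shift;
  my @freqs;
  my $N = 0;
  while (defined(my $line = <$fh>)) {
    chomp($line);
    next unless $line =~ /^(\d+)$/;  ##-- skip blank (and non-numeric) lines
    ++$freqs[$1];
    ++$N;
  }
  $_ //= 0 foreach (@freqs);         ##-- unseen ids get frequency 0
  return (\@freqs, $N);
}

##-- sample data: one token-id per line, one blank line
my $data = "3\n1\n\n3\n0\n";
open(my $fh, '<', \$data) or die "open failed: $!";
my ($freqs, $N) = count_unigrams($fh);
close($fh);
print "N=$N freqs=@$freqs\n";        ##-- prints "N=4 freqs=1 1 0 2"
```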

Relation API: union

union
$ug = $CLASS_OR_OBJECT->union($coldb, \@pairs, %opts);

merges multiple unigram indices into a new object. @pairs is an array of pairs ([$ug,\@xi2u],...), each consisting of a unigram-object $ug and a tuple-id map \@xi2u for $ug. implicitly flushes the new index.

%opts: clobber %$ug
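
The merge can be pictured with the following core-Perl sketch. It is a simplification under stated assumptions: each unigram object is replaced by a plain array of frequencies, and each map @xi2u is assumed to translate a local tuple-id into the corresponding id of the merged index. The helper merge_unigrams() is hypothetical and does not reflect how union() is actually implemented.

```perl
use strict;
use warnings;

## hypothetical helper: merge per-index frequency arrays into one union array
##  + each pair is [\@freqs, \@xi2u]; $xi2u[$local_id] is assumed to be the union id
##  + returns (\@union_freqs, $N)
sub merge_unigrams {
  my @pairs = @_;
  my @union;
  my $N = 0;
  foreach my $pair (@pairs) {
    my ($freqs, $xi2u) = @$pair;
    foreach my $local_id (0..$#$freqs) {
      my $union_id = $xi2u->[$local_id];
      next if !defined($union_id);       ##-- unmapped ids are ignored
      my $f = $freqs->[$local_id] // 0;
      $union[$union_id] += $f;
      $N += $f;
    }
  }
  $_ //= 0 foreach (@union);             ##-- ids never seen get frequency 0
  return (\@union, $N);
}

##-- two toy indices with their id maps
my @pairs = (
  [[5, 2],    [0, 2]],      ##-- local ids 0,1   -> union ids 0,2
  [[1, 4, 3], [1, 2, 3]],   ##-- local ids 0,1,2 -> union ids 1,2,3
);
my ($union, $N) = merge_unigrams(@pairs);
print "N=$N union=@$union\n";            ##-- prints "N=15 union=5 1 6 3"
```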

Relation API: default: profiling

subprofile
$prf = $ug->subprofile(\@xids, %opts);

get frequency profile for @xids (index must be opened). %opts:

groupby => \&gbsub,  ##-- key-extractor $key2_or_undef = $gbsub->($i2)
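
To show how a groupby key-extractor shapes the resulting profile, here is a minimal sketch in core Perl. It assumes frequencies are available as a plain array indexed by tuple-id and that an undefined key means "skip this item"; the helper group_frequencies() and the sample data are hypothetical, and the real subprofile() returns a profile object rather than a hash.

```perl
use strict;
use warnings;

## hypothetical helper: sum frequencies of @$ids by the key returned from $gbsub
##  + ids for which $gbsub returns undef are skipped
sub group_frequencies {
  my ($freqs, $ids, $gbsub) = @_;
  my %prf;
  foreach my $id (@$ids) {
    my $key = $gbsub ? $gbsub->($id) : $id;
    next if !defined($key);
    $prf{$key} += ($freqs->[$id] // 0);
  }
  return \%prf;
}

my @freqs = (4, 0, 7, 1, 2);            ##-- toy frequencies indexed by id
my $gbsub = sub {
  my $id = shift;
  return undef if $id == 3;             ##-- drop id 3 from the profile
  return ($id % 2) ? 'odd' : 'even';
};
my $prf = group_frequencies(\@freqs, [0, 2, 3, 4], $gbsub);
print join(' ', map {"$_=$prf->{$_}"} sort keys %$prf), "\n";  ##-- prints "even=13"
```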

Relation API: default: query info

qinfo
\%qinfo = $rel->qinfo($coldb, %opts);

get query-info hash for profile administrivia (ddc hit links). %opts: as for profile(), additionally:

qreqs => \@qreqs,      ##-- as returned by $coldb->parseRequest($opts{query})
gbreq => \%groupby,    ##-- as returned by $coldb->groupby($opts{groupby})

AUTHOR

Bryan Jurish <moocow@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2015-2016 by Bryan Jurish

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.14.2 or, at your option, any later version of Perl 5 you may have available.

SEE ALSO

DiaColloDB::Relation(3pm), DiaColloDB::Relation::Cofreqs(3pm), DiaColloDB::Relation::TDF(3pm), DiaColloDB::Relation::DDC(3pm), DiaColloDB(3pm), perl(1), ...