NAME

DiaColloDB::Profile::Diff - diachronic collocation db, diff profiles

SYNOPSIS

##========================================================================
## PRELIMINARIES

use DiaColloDB::Profile::Diff;

##========================================================================
## Constructors etc.

$prf   = $CLASS_OR_OBJECT->new(%args);
$dprf2 = $dprf->clone();

##========================================================================
## Basic Access

($prf1,$prf2) = $dprf->operands();
$bool = $dprf->empty();

##========================================================================
## I/O: JSON

$obj = $CLASS_OR_OBJECT->loadJsonData( $data,%opts);

##========================================================================
## I/O: Text

undef = $CLASS_OR_OBJECT->saveTextHeader($fh, hlabel=>$hlabel, titles=>\@titles);
$bool = $prf->saveTextFh($fh, %opts);

##========================================================================
## I/O: HTML

$bool = $prf->saveHtmlFile($filename_or_handle, %opts);

##========================================================================
## Compilation

$dprf = $dprf->populate();
$dprf = $dprf->compile($func,%opts);
$dprf = $dprf->uncompile();

##========================================================================
## Trimming

\@keys = $dprf->which(%opts);
$dprf  = $dprf->trim(%opts);

##========================================================================
## Stringification

$dprf = $dprf->stringify( $obj);

##========================================================================
## Binary operations

$dprf = $dprf->_add($dprf2,%opts);

DESCRIPTION

DiaColloDB::Profile::Diff is a DiaColloDB::Profile subclass for representing low-level collocate frequency comparison data for a single date-slice, as arising from the comparison of two DiaColloDB::Profile objects.

Globals & Constants

Variable: @ISA

DiaColloDB::Profile::Diff inherits from DiaColloDB::Profile.

Constructors etc.

new
$prf = $CLASS_OR_OBJECT->new(%args);
$prf = $CLASS_OR_OBJECT->new($prf1,$prf2,%args)

%args, object structure:

##-- DiaColloDB::Profile::Diff
prf1 => $prf1,     ##-- 1st operand
prf2 => $prf2,     ##-- 2nd operand
##-- DiaColloDB::Profile keys
label => $label,   ##-- string label (used by Multi; undef for none(default))
#N   => $N,         ##-- OVERRIDE:unused: total marginal relation frequency
#f1  => $f1,        ##-- OVERRIDE:unused: total marginal frequency of target word(s)
#f2  => \%f2,       ##-- OVERRIDE:unused: total marginal frequency of collocates: ($i2=>$f2, ...)
#f12 => \%f12,      ##-- OVERRIDE:unused: collocation frequencies, %f12 = ($i2=>$f12, ...)
##
eps => $eps,       ##-- smoothing constant (default=undef: no smoothing)
score => $func,    ##-- selected scoring function ('f12', 'mi', or 'ld')
mi => \%mi12,      ##-- DIFFERENCE: score: mutual information * logFreq a la Wortprofil; requires compile_mi()
ld => \%ld12,      ##-- DIFFERENCE: score: log-dice a la Wortprofil; requires compile_ld()
fm => \%fm12,      ##-- DIFFERENCE: score: frequency per million; requires compile_fm()

clone
$dprf2 = $dprf->clone();
$dprf2 = $dprf->clone($keep_compiled);

clones %$dprf; if $keep_compiled is true, compiled data is cloned too.

Basic Access

operands
($prf1,$prf2) = $dprf->operands();

get operand profiles.

empty
$bool = $dprf->empty();

returns true iff both operands are empty

I/O: JSON

loadJsonData
$obj = $CLASS_OR_OBJECT->loadJsonData( $data,%opts);

guts for loadJsonString(), loadJsonFile()

I/O: Text

See also DiaColloDB::Persistent.

saveTextHeader
undef = $CLASS_OR_OBJECT->saveTextHeader($fh, hlabel=>$hlabel, titles=>\@titles);

print column title header for text output.

saveTextFh
$bool = $prf->saveTextFh($fh, %opts);

save flat TAB-separated text, format:

Na Nb F1a F1b F2a F2b F12a F12b SCOREa SCOREb SCOREdiff LABEL ITEM2...

%opts:

label => $label,   ##-- override $prf->{label} (used by Profile::Multi), no tab-separators required
format => $fmt,    ##-- printf score formatting (default="%.4f")
header => $bool,   ##-- include header-row? (default=1)
hlabel => $hlabel, ##-- prefix header item-cells with $hlabel (used by Profile::MultiDiff)
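
To make the column layout above concrete, here is a minimal sketch in plain Perl that assembles one TAB-separated row. It does not call DiaColloDB: format_diff_row() and the flat %$row layout are hypothetical names invented for illustration, and emitting the LABEL cell only when a label is given is an assumption of the sketch rather than documented behavior.

```perl
use strict;
use warnings;

## format_diff_row(\%row, %opts): hypothetical helper, NOT part of DiaColloDB.
## Builds one row: Na Nb F1a F1b F2a F2b F12a F12b SCOREa SCOREb SCOREdiff LABEL ITEM2...
sub format_diff_row {
  my ($row, %opts) = @_;
  my $fmt = $opts{format} // '%.4f';   ##-- same default as the 'format' option above
  my @cells = (
    @$row{qw(Na Nb F1a F1b F2a F2b F12a F12b)},
    (map { sprintf($fmt, $_) }
       $row->{SCOREa}, $row->{SCOREb}, $row->{SCOREa} - $row->{SCOREb}),
  );
  push(@cells, $opts{label}) if defined $opts{label};  ##-- assumption: LABEL cell is optional
  push(@cells, @{$row->{items}});                      ##-- ITEM2...: collocate item cell(s)
  return join("\t", @cells) . "\n";
}

my %row = (
  Na => 100, Nb => 200, F1a => 10, F1b => 20, F2a => 5, F2b => 6,
  F12a => 2, F12b => 3, SCOREa => 1.5, SCOREb => 0.25, items => ['Haus'],
);
print format_diff_row(\%row, label => '1990');
```

The sample frequencies and the label '1990' are made up; only the column order and the "%.4f" default are taken from the description above.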

I/O: HTML

saveHtmlFile
$bool = $prf->saveHtmlFile($filename_or_handle, %opts);

Save flat HTML table data with rows of the form

SCOREa SCOREb DIFF PREFIX? ITEM2...

%opts:

table  => $bool,     ##-- include <table>..</table> ? (default=1)
body   => $bool,     ##-- include <html><body>..</body></html> ? (default=1)
header => $bool,     ##-- include header-row? (default=1)
hlabel => $hlabel,   ##-- prefix header item-cells with $hlabel (used by Profile::Multi), no '<th>..</th>' required
label => $label,     ##-- prefix item-cells with $label (used by Profile::Multi), no '<td>..</td>' required
format => $fmt,      ##-- printf score formatting (default="%.4f")

Compilation

populate
$dprf = $dprf->populate();
$dprf = $dprf->populate($prf1,$prf2);

populates diff-profile by subtracting $prf2 scores from $prf1.
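
The subtraction described above can be sketched on plain score hashes. This is not the module's implementation: diff_scores() is a hypothetical helper, and treating a key missing from one operand as a score of 0 is an assumption made only for this sketch.

```perl
use strict;
use warnings;

## diff_scores(\%a, \%b): hypothetical helper, NOT part of DiaColloDB.
## Maps every key seen in either operand to (a - b).
## Assumption: a key missing from one operand counts as score 0.
sub diff_scores {
  my ($sa, $sb) = @_;
  my %diff;
  foreach my $key (keys(%$sa), keys(%$sb)) {
    $diff{$key} = ($sa->{$key} // 0) - ($sb->{$key} // 0);
  }
  return \%diff;
}

## made-up scores for two operand profiles
my %ld1 = (Haus => 8.5, Baum => 3.0);
my %ld2 = (Haus => 6.0, Auto => 2.5);
my $diff = diff_scores(\%ld1, \%ld2);
printf("%s\t%.4f\n", $_, $diff->{$_}) foreach sort keys %$diff;
```

A positive difference marks a collocate that scores higher in the first operand, a negative one a collocate that scores higher in the second.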

compile
$dprf = $dprf->compile($func,%opts);

compile for score-function $func, one of qw(f fm mi ld); default='f'.

uncompile
$dprf = $dprf->uncompile();

un-compiles all scores for $dprf

Trimming

trim
$dprf = $dprf->trim(%opts);

trims profile and operands; %opts:

kbest => $kbest,    ##-- retain only $kbest items (by score value)
kbesta => $kbesta,  ##-- retain only $kbesta items (by score absolute value)
cutoff => $cutoff,  ##-- retain only items with $prf->{$prf->{score}}{$item} >= $cutoff
keep => $keep,      ##-- retain keys @$keep (ARRAY) or keys(%$keep) (HASH)
drop => $drop,      ##-- drop keys @$drop (ARRAY) or keys(%$drop) (HASH)
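
For a diff profile the distinction between kbest and kbesta matters, because scores can be negative. The following sketch illustrates the two selection modes on a plain hash; kbest_keys() is a hypothetical helper, and the alphabetical tie-breaking is an assumption of the sketch, not a statement about how trim() orders items internally.

```perl
use strict;
use warnings;

## kbest_keys(\%score, $k, $abs): hypothetical helper, NOT part of DiaColloDB.
## Returns the $k keys with the highest score, or with the highest
## absolute score if $abs is true. Ties are broken alphabetically (assumption).
sub kbest_keys {
  my ($score, $k, $abs) = @_;
  my $val = $abs ? sub { abs($score->{$_[0]}) } : sub { $score->{$_[0]} };
  my @sorted = sort { $val->($b) <=> $val->($a) || $a cmp $b } keys %$score;
  $k = scalar(@sorted) if $k > @sorted;   ##-- never ask for more keys than exist
  return [ @sorted[0 .. $k-1] ];
}

## made-up difference scores
my %score = (A => 2.5, B => -4.0, C => 0.5, D => -1.0);
print "kbest=2:  ", join(' ', @{kbest_keys(\%score, 2, 0)}), "\n";
print "kbesta=2: ", join(' ', @{kbest_keys(\%score, 2, 1)}), "\n";
```

With these values, selection by raw score keeps A and C, while selection by absolute value keeps B and A, so the strongly negative item B survives only in the second mode.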

Stringification

stringify
$dprf = $dprf->stringify( $obj);
$dprf = $dprf->stringify(\@key2str)
$dprf = $dprf->stringify(\&key2str)
$dprf = $dprf->stringify(\%key2str)

stringifies profile and operands (destructive) via $obj->i2s($key2), $key2str->($i2) or $key2str->{$i2}.
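
The four accepted argument types can be pictured as a dispatch on the reference type. The sketch below works on plain hashes and is not the module's code: make_key2str() and stringify_keys() are hypothetical names, and unlike stringify() the sketch returns a new hash instead of modifying its argument.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed);

## make_key2str($spec): hypothetical helper, NOT part of DiaColloDB.
## Turns an ARRAY, HASH, CODE ref, or an object with an i2s() method
## into a uniform code-ref mapping a key to its string.
sub make_key2str {
  my $spec = shift;
  my $type = ref($spec);
  return $spec                     if $type eq 'CODE';
  return sub { $spec->{$_[0]} }    if $type eq 'HASH';
  return sub { $spec->[$_[0]] }    if $type eq 'ARRAY';
  return sub { $spec->i2s($_[0]) } if blessed($spec) && $spec->can('i2s');
  die "unsupported key-to-string mapping";
}

## stringify_keys(\%scores, $spec): returns a NEW hash with stringified keys
## (non-destructive, unlike the method documented above).
sub stringify_keys {
  my ($scores, $spec) = @_;
  my $k2s = make_key2str($spec);
  my %out;
  $out{ $k2s->($_) } = $scores->{$_} foreach keys %$scores;
  return \%out;
}

## made-up integer-keyed scores and an ARRAY mapping
my %mi = (0 => 1.5, 1 => -2.0);
my $str = stringify_keys(\%mi, ['Haus', 'Baum']);
printf("%s\t%.4f\n", $_, $str->{$_}) foreach sort keys %$str;
```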

Binary operations

_add
$dprf = $dprf->_add($dprf2,%opts);

adds $dprf2 operand frequency data to $dprf operands (destructive); implicitly un-compiles $dprf. %opts:

N  => $bool, ##-- whether to add N values (default:true)
f1 => $bool, ##-- whether to add f1 values (default:true)
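
As a rough illustration of what adding operand frequency data involves, the sketch below sums the frequency fields of two plain hashes laid out with the DiaColloDB::Profile keys listed under new() (N, f1, f2, f12). add_freqs() is a hypothetical helper, and the assumption that f2 and f12 are always summed regardless of the options is specific to this sketch.

```perl
use strict;
use warnings;

## add_freqs(\%p, \%q, %opts): hypothetical helper, NOT part of DiaColloDB.
## Destructively adds the frequency data of %$q to %$p.
## Options N and f1 default to true, mirroring the %opts described above.
sub add_freqs {
  my ($p, $q, %opts) = @_;
  my $addN  = $opts{N}  // 1;
  my $addF1 = $opts{f1} // 1;
  $p->{N}  += $q->{N}  if $addN;
  $p->{f1} += $q->{f1} if $addF1;
  ##-- assumption: per-collocate frequencies are always summed
  $p->{f2}{$_}  += $q->{f2}{$_}  foreach keys %{$q->{f2}};
  $p->{f12}{$_} += $q->{f12}{$_} foreach keys %{$q->{f12}};
  return $p;
}

## made-up frequency data
my $p = { N => 1000, f1 => 20, f2 => { Haus => 7 },            f12 => { Haus => 3 } };
my $q = { N =>  500, f1 => 10, f2 => { Haus => 2, Baum => 4 }, f12 => { Baum => 1 } };
add_freqs($p, $q, f1 => 0);   ##-- add N but leave f1 untouched
print join("\t", $p->{N}, $p->{f1}, map { "$_=$p->{f2}{$_}" } sort keys %{$p->{f2}}), "\n";
```

Because the sums change the underlying frequencies, any previously compiled scores would no longer be valid, which is consistent with _add() implicitly un-compiling the profile.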

AUTHOR

Bryan Jurish <moocow@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2015 by Bryan Jurish

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.14.2 or, at your option, any later version of Perl 5 you may have available.

SEE ALSO

dcdb-create.perl(1), dcdb-query.perl(1), dcdb-info.perl(1), dcdb-export.perl(1), dcdb-dump.perl(1), DiaColloDB(3pm), perl(1), ...