NAME

DiaColloDB::Document - diachronic collocation db, source document (base class)

SYNOPSIS

##========================================================================
## PRELIMINARIES

use DiaColloDB::Document;

##========================================================================
## Constructors etc.

$doc = CLASS_OR_OBJECT->new(%args);

##========================================================================
## API: I/O

$bool = $doc->fromFile($filename_or_fh);
$label = $doc->label();

DESCRIPTION

DiaColloDB::Document provides an abstract base class for corpus documents from which a DiaColloDB database can be created. Support for alternative corpus formats can be added by implementing a DiaColloDB::Document subclass for each required format.

Globals & Constants

Variable: @ISA

DiaColloDB::Document inherits from DiaColloDB::Logger.

Constructors etc.

new
$doc = CLASS_OR_OBJECT->new(%args);

%args, object structure:

label  => $label,    ##-- document label (e.g. filename; optional)
date   => $date,     ##-- year
tokens => \@tokens,  ##-- tokens, including undef for eos
meta   => \%meta,    ##-- document metadata (e.g. author, title, collection, ...)

Each token in @tokens is a HASH-ref {w=>$word,p=>$pos,l=>$lemma,...}, or undef for EOS (end of sentence).
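The argument structure above can be illustrated with a small sketch in plain Perl. All values (label, date, tags, metadata) are made up for illustration, and the constructor call is left as a comment so that the snippet runs without DiaColloDB installed.

```perl
use strict;
use warnings;

## tokens for two sentences; undef marks each end-of-sentence (EOS)
my @tokens = (
  {w=>'Dogs', p=>'NNS', l=>'dog'},
  {w=>'bark', p=>'VBP', l=>'bark'},
  undef,
  {w=>'Cats', p=>'NNS', l=>'cat'},
  {w=>'purr', p=>'VBP', l=>'purr'},
  undef,
);

## constructor arguments in the documented shape (illustrative values only)
my %args = (
  label  => 'example.txt',
  date   => 2015,
  tokens => \@tokens,
  meta   => {author=>'Anonymous', title=>'Example'},
);

## count EOS markers and real tokens (grep in scalar context returns a count)
my $nsents = grep { !defined($_) } @{$args{tokens}};
my $ntoks  = grep {  defined($_) } @{$args{tokens}};
print "sentences=$nsents tokens=$ntoks\n";

## with the module installed, the document would be created as:
##   my $doc = DiaColloDB::Document->new(%args);
```

Keeping EOS as undef entries in a single flat list means sentence boundaries can be recovered with one pass over @tokens, without nesting arrays.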

API: I/O

fromFile
$bool = $doc->fromFile($filename_or_fh);

parse tokens from $filename_or_fh (a filename or an open filehandle)
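As a rough sketch of what a format-specific subclass might do inside its own fromFile(), the helper below parses a hypothetical tab-separated format (one "WORD, POS, LEMMA" token per line, blank line between sentences) into the documented token structure. The function name parse_tsv_tokens and the input format are assumptions for illustration and are not part of DiaColloDB.

```perl
use strict;
use warnings;

## parse_tsv_tokens($filename_or_fh): hypothetical helper, not part of DiaColloDB
##  + reads one token per line as "WORD\tPOS\tLEMMA"; a blank line ends a sentence
##  + returns an ARRAY-ref of token HASH-refs, with undef for each EOS
sub parse_tsv_tokens {
  my $src = shift;
  my $fh  = $src;
  if (!ref($src)) {
    ## plain string: treat it as a filename
    open($fh, '<', $src) or die "open failed for '$src': $!";
  }
  my @tokens;
  while (defined(my $line = <$fh>)) {
    chomp($line);
    if ($line =~ /^\s*$/) {
      ## blank line: mark EOS, skipping leading or repeated blank lines
      push(@tokens, undef) if (@tokens && defined($tokens[-1]));
      next;
    }
    my ($w,$p,$l) = split(/\t/, $line);
    push(@tokens, {w=>$w, p=>$p, l=>$l});
  }
  ## close the final sentence if the input did not end with a blank line
  push(@tokens, undef) if (@tokens && defined($tokens[-1]));
  close($fh) if (!ref($src));
  return \@tokens;
}

## demo on an in-memory filehandle
my $data = "Dogs\tNNS\tdog\nbark\tVBP\tbark\n\nCats\tNNS\tcat\npurr\tVBP\tpurr\n";
open(my $fh, '<', \$data) or die "in-memory open failed: $!";
my $tokens = parse_tsv_tokens($fh);
close($fh);
print scalar(@$tokens), " entries\n";
```

A subclass could call such a helper from fromFile(), store the result under the tokens key of the object, and return a true value on success; that wiring is an assumption based on the object structure documented above.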

label
$label = $doc->label();

return a string label for $doc; default just returns "$doc".
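To show what a stringified object looks like as a label, and how a subclass might prefer a stored label, here is a minimal sketch using stand-in classes. My::Doc and My::LabeledDoc are hypothetical and only mimic the documented behavior; they are not the DiaColloDB classes.

```perl
use strict;
use warnings;

package My::Doc;  ## hypothetical stand-in, not DiaColloDB::Document itself
sub new   { my ($that,%args) = @_; return bless({%args}, ref($that)||$that); }
sub label { return "$_[0]"; }  ## stringified object, as described for the default

package My::LabeledDoc;  ## hypothetical subclass with a friendlier label
our @ISA = ('My::Doc');
sub label { return $_[0]{label} // "$_[0]"; }  ## stored label, else stringified object

package main;
my $plain   = My::Doc->new();
my $labeled = My::LabeledDoc->new(label=>'example.txt');
print $plain->label, "\n";    ## something like "My::Doc=HASH(0x...)"
print $labeled->label, "\n";  ## the stored label
```

Stringified objects are unique per instance but not stable across runs, so a stored label such as a filename is usually easier to read in log messages.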

AUTHOR

Bryan Jurish <moocow@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2015 by Bryan Jurish

This package is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.14.2 or, at your option, any later version of Perl 5 you may have available.

SEE ALSO

dcdb-create.perl(1), dcdb-query.perl(1), dcdb-info.perl(1), dcdb-export.perl(1), dcdb-dump.perl(1), DiaColloDB(3pm), perl(1), ...