NAME
Treex::Tool::Parser::MSTperl::ModelAdditional
VERSION
version 0.09407
DESCRIPTION
A model containing edge PMI (pointwise mutual information), i.e. PMI[c,p] = log( #[c,p] / ( #[c,*] * #[*,p] ) ), where c = child, p = parent, and # denotes a count.
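As an illustration of the formula above, the following minimal sketch computes the PMI from raw edge counts. The toy counts, the variable names, and the `pmi` helper are hypothetical and not part of this module; the natural logarithm is also an assumption, since the base is not stated here.

```perl
use strict;
use warnings;

# Hypothetical toy counts #[c,p] of (child, parent) edges.
my %edge_count = (
    dog => { barks => 8, runs => 2 },
    cat => { runs  => 6 },
);

# Marginal counts #[c,*] and #[*,p].
my ( %child_count, %parent_count );
for my $child ( keys %edge_count ) {
    for my $parent ( keys %{ $edge_count{$child} } ) {
        $child_count{$child}   += $edge_count{$child}{$parent};
        $parent_count{$parent} += $edge_count{$child}{$parent};
    }
}

# PMI[c,p] = log( #[c,p] / ( #[c,*] * #[*,p] ) ); '?' if the edge was never seen.
sub pmi {
    my ( $child, $parent ) = @_;
    return '?'
        unless exists $edge_count{$child}
        && exists $edge_count{$child}{$parent};
    return log( $edge_count{$child}{$parent}
            / ( $child_count{$child} * $parent_count{$parent} ) );
}

printf "%.4f\n", pmi( 'dog', 'barks' );    # log(8 / (10 * 8)) = -2.3026
print pmi( 'bird', 'sings' ), "\n";        # ?
```

Because the denominator is a product of two counts that are each at least as large as the joint count, the ratio never exceeds 1, which is consistent with the PMI values in the model being negative.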
FIELDS
Public Fields
- model_file
-
The file containing the model, i.e. a TSV file in the format child[tab]parent[tab]PMI
- model_format
-
Currently only tsv is supported. TODO: support tsv.gz, and probably also a Data Dumper model.
- buckets
-
(A reference to) an array of buckets that the PMI is bucketed into (negative integers; they do not have to be sorted). The PMI is first ceiled and then assigned to the nearest bucket that is lower than or equal to the ceiled value; if there is no such bucket, it is assigned to the lowest one.
Internal Fields
- model
-
In-memory representation of the model file, in the format model->{child}->{parent} = PMI.
- minBucket
-
The lowest bucket (a bin for all PMIs lower than that).
- maxBucket
-
The highest bucket (a bin for all PMIs higher than that).
- value2bucket
-
Provides fast conversion of ceiled PMIs that are between minBucket and maxBucket to buckets.
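One way such internal fields could be derived from the buckets is sketched below. This is an assumption about the general technique (a precomputed lookup table), not the module's actual code; `@buckets`, `$min_bucket`, `$max_bucket`, `%value2bucket`, and `lookup` are hypothetical names.

```perl
use strict;
use warnings;

# Hypothetical bucket list; the order does not matter.
my @buckets = sort { $a <=> $b } ( -1, -10, -5, -20 );
my ( $min_bucket, $max_bucket ) = @buckets[ 0, -1 ];

# Precompute, for every integer between the lowest and the highest bucket,
# the nearest bucket that is lower than or equal to it.
my %is_bucket = map { $_ => 1 } @buckets;
my %value2bucket;
my $current = $min_bucket;
for my $value ( $min_bucket .. $max_bucket ) {
    $current = $value if $is_bucket{$value};
    $value2bucket{$value} = $current;
}

# Map an already ceiled PMI to its bucket.
sub lookup {
    my ($ceiled) = @_;
    return $min_bucket if $ceiled < $min_bucket;    # below all buckets
    return $max_bucket if $ceiled > $max_bucket;    # above all buckets
    return $value2bucket{ int $ceiled };
}

print lookup(-7),  "\n";    # -10
print lookup(-25), "\n";    # -20
print lookup(0),   "\n";    # -1
```

The table trades a small amount of memory (one entry per integer in the bucket range) for a constant-time lookup instead of a scan over the buckets on every query.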
METHODS
- load
- get_value($child, $parent)
-
Returns the real PMI, i.e. a negative float (there are hundreds of thousands of possible values).
Returns '?' if PMI is unknown.
- get_rounded_value($child, $parent)
-
Returns the ceiled PMI, i.e. the integer part of the real PMI (for negative values, ceiling and truncation coincide); there are about 30 possible values.
Returns '?' if PMI is unknown.
- get_bucketed_value($child, $parent)
-
Returns the nearest bucket that is lower than or equal to the ceiled value of the PMI, or the lowest existing bucket if the ceiled value is lower than all buckets.
Returns '?' if PMI is unknown.
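The behaviour of the three getters described above can be imitated with a small standalone sketch. It does not use or reproduce this module; the `toy_*` functions, the inline TSV data, and the bucket list are hypothetical and only follow the rules stated in this documentation.

```perl
use strict;
use warnings;
use POSIX qw(ceil);
use List::Util qw(max min);

# Hypothetical model data in the child[tab]parent[tab]PMI format.
my $tsv     = "dog\tbarks\t-2.3\ncat\truns\t-7.8\nfish\tswims\t-25.1\n";
my @buckets = ( -1, -5, -10, -20 );

# In-memory representation: $model{child}{parent} = PMI.
my %model;
for my $line ( split /\n/, $tsv ) {
    my ( $child, $parent, $pmi ) = split /\t/, $line;
    $model{$child}{$parent} = $pmi;
}

# Real PMI, or '?' if the (child, parent) pair is unknown.
sub toy_get_value {
    my ( $child, $parent ) = @_;
    return '?'
        unless exists $model{$child} && exists $model{$child}{$parent};
    return $model{$child}{$parent};
}

# Ceiled PMI, or '?'.
sub toy_get_rounded_value {
    my $value = toy_get_value(@_);
    return $value eq '?' ? '?' : ceil($value);
}

# Nearest bucket lower than or equal to the ceiled PMI,
# or the lowest bucket if there is none; '?' if unknown.
sub toy_get_bucketed_value {
    my $rounded = toy_get_rounded_value(@_);
    return '?' if $rounded eq '?';
    my @candidates = grep { $_ <= $rounded } @buckets;
    return @candidates ? max(@candidates) : min(@buckets);
}

print toy_get_rounded_value( 'dog', 'barks' ),   "\n";    # -2
print toy_get_bucketed_value( 'dog', 'barks' ),  "\n";    # -5
print toy_get_bucketed_value( 'fish', 'swims' ), "\n";    # -20
print toy_get_bucketed_value( 'bird', 'sings' ), "\n";    # ?
```

Since '?' is returned as a plain string, callers that do arithmetic on the result need to check for it first.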
AUTHORS
Rudolf Rosa <rosa@ufal.mff.cuni.cz>
COPYRIGHT AND LICENSE
Copyright © 2012 by Institute of Formal and Applied Linguistics, Charles University in Prague
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.