NAME
Algorithm::AM::Result - Store results of an AM classification
VERSION
version 3.12
SYNOPSIS
use Algorithm::AM;
my $am = Algorithm::AM->new('finnverb', -commas => 'no');
my ($result) = $am->classify;
print @{ $result->winners };
print ${ $result->statistical_summary };
DESCRIPTION
This package encapsulates all of the classification information generated by "classify" in Algorithm::AM, including the assigned class, score to each class, gang effects, analogical sets, and timing information. It also provides several methods for generating printable reports with this information.
Note that the words 'score' and 'point' are used here to represent whatever count is assigned by analogical modeling during classification. This can be either pointers or occurrences. For an explanation of this, see Algorithm::AM::algorithm.
All of the scores returned by the methods here are scalars with special PV and NV values. You should exercise caution when doing calculations with them. See Algorithm::AM::BigInt for more information.
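As a rough illustration of why caution is needed, the sketch below builds a scalar with separate string and numeric values using dualvar from Scalar::Util. It assumes that a score behaves like such a dual-valued scalar, with the exact digits in the string slot and an approximation in the numeric slot; the value is invented, and Algorithm::AM::BigInt remains the authority on how real scores are built.

```perl
use strict;
use warnings;
use Scalar::Util qw(dualvar);

# Hypothetical score (assumption, not taken from Algorithm::AM): the numeric
# slot (NV) holds a floating-point approximation, while the string slot (PV)
# holds the exact digits.
my $score = dualvar(1.23456789012346e+20, '123456789012345678901');

my $as_string = "$score";     # string context reads the PV
my $as_number = $score + 0;   # numeric context reads the NV

print "exact digits:  $as_string\n";
print "numeric value: $as_number\n";
```

Arithmetic on such a scalar works with the numeric slot only, so the exact digits are available only through the string value.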
REPORT METHODS
The methods below return human-readable reports about the classification. Each return value is a reference to a string, so it must be dereferenced for printing, like so:
print ${ $result->statistical_summary };
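The dereferencing step can be tried without running a classification. In this minimal sketch, fake_report is a hypothetical stand-in for a report method and is not part of Algorithm::AM; it only mimics the documented behavior of returning a reference to a string.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a report method (not part of Algorithm::AM):
# it returns a reference to a string rather than the string itself.
sub fake_report {
    my $text = "Statistical Summary\n";
    return \$text;
}

my $report_ref  = fake_report();
my $report_text = ${$report_ref};   # dereference to get the string

print ref($report_ref), "\n";       # prints the reference type, SCALAR
print $report_text;                 # prints the report text
```

Printing the reference itself, without the ${ } wrapper, would show something like SCALAR(0x...) instead of the report.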
config_info
Returns a scalar reference (string) containing information about the configuration at the time of classification. Information from the following accessors is included:
exclude_nulls
given_excluded
cardinality
test_in_train
test_item
count_method
statistical_summary
Returns a scalar reference (string) containing a statistical summary of the classification results. The summary includes all possible predicted classes with their scores and percentage scores, and the total score for all classes. If the test item had a known class, the summary also states whether the predicted class was correct, incorrect, or a tie of some sort.
analogical_set_summary
Returns a scalar reference (string) containing the analogical set, meaning all items that contributed to the predicted class, along with the amount contributed by each item (score and percentage overall). Items are ordered by appearance in the data set.
gang_summary
Returns a scalar reference (string) containing the gang effects on the final class prediction.
A single boolean parameter can be provided to turn on list printing, meaning that gang items are printed. This is false (off) by default.
CONFIGURATION INFORMATION
The following methods provide information about the configuration of AM at the time of classification.
exclude_nulls
Set to the value given by the same method of Algorithm::AM at the time of classification.
given_excluded
Set to the value given by the same method of Algorithm::AM at the time of classification.
cardinality
The number of features used during classification. If there were null feature values and "exclude_nulls" was set to true, then this number will be lower than the cardinality of the utilized data sets.
test_in_train
True if the test item was present among the training items.
test_item
Returns the item which was classified.
count_method
Returns either "linear" or "squared", indicating the setting used for computing analogical sets. See "linear" in Algorithm::AM.
training_set
Returns the data set which was the source of classification data.
RESULT DETAILS
The following methods provide information about the results of the classification.
result
If the class of the test item was known before classification, this returns "tie", "correct", or "incorrect", depending on the label assigned by the classification. Otherwise this returns undef.
gang_effects
Returns a hash describing gang effects. Gang effects are similar to analogical sets, but the total effects of entire subcontexts and supracontexts are also calculated and printed.
TODO: details, details! Maybe make a gang class to hold this structure.
analogical_set
The analogical set is the set of items from the training set that had some effect on the item classification. The analogical effect of an item in the analogical set is the score it contributed towards a classification matching its own class label.
This method returns the items in the analogical set along with their analogical effects, in the following structure:
{ 'item_id' => {'item' => item, 'score' => score} }
item above is the actual item object. The item_id is used so that the analogical effect of a particular item can be found quickly:
my $set = $result->analogical_set;
print "the item's analogical effect was "
. $set->{$item->id}->{score};
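To show how the structure can be traversed, here is a minimal sketch over hand-built data in the documented shape. The ids, the placeholder strings standing in for item objects, and the plain integer scores are all assumptions made for illustration; a real analogical set holds item objects and the special score scalars described in DESCRIPTION.

```perl
use strict;
use warnings;

# Hypothetical data shaped like the documented return value. Real 'item'
# values are item objects and real scores are special scalars.
my $set = {
    item_a => { item => 'placeholder for item a', score => 4 },
    item_b => { item => 'placeholder for item b', score => 9 },
};

# Look up one entry directly by its id.
my $effect = $set->{item_b}{score};

# List every entry, largest analogical effect first.
my @ids_by_effect =
    sort { $set->{$b}{score} <=> $set->{$a}{score} } keys %$set;
for my $id (@ids_by_effect) {
    printf "%s: %s\n", $id, $set->{$id}{score};
}
```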
high_score
Returns the highest score assigned to any of the class labels.
scores
Returns a hash mapping all predicted classes to their scores.
scores_normalized
Returns a hash mapping all predicted classes to their score, divided by the total score for all classes. For example, if the "scores" method returns the following:
{'e' => 4, 'r' => 9}
then this method would return the following (values below are rounded):
{'e' => 0.3076923, 'r' => 0.6923077}
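The arithmetic behind this example can be reproduced with a short sketch. It uses plain Perl numbers rather than the special score scalars, so it illustrates only the division described above and is not the module's own implementation.

```perl
use strict;
use warnings;
use List::Util qw(sum);

# The scores from the example above, as plain numbers (an assumption;
# real scores are special scalars).
my %scores = (e => 4, r => 9);

# Divide each score by the total score for all classes.
my $total      = sum(values %scores);    # 4 + 9 = 13
my %normalized = map { $_ => $scores{$_} / $total } keys %scores;

# Prints "e => 0.3076923" and "r => 0.6923077".
printf "%s => %.7f\n", $_, $normalized{$_} for sort keys %normalized;
```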
winners
Returns an array ref containing the classes which had the highest score. There is more than one only if there is a tie for the highest score.
is_tie
Returns true if more than one class was assigned the high score.
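The relationship between "high_score", "winners", and "is_tie" can be sketched from a scores hash. The class labels and plain integer scores below are hypothetical, and this is an illustration of the definitions given here rather than the module's own code.

```perl
use strict;
use warnings;
use List::Util qw(max);

# Hypothetical scores in which two classes share the highest score.
my %scores = (e => 9, r => 9, x => 2);

# The highest score assigned to any class.
my $high_score = max(values %scores);

# Every class holding that score is a winner.
my @winners = grep { $scores{$_} == $high_score } sort keys %scores;

# A tie means more than one winner.
my $is_tie = @winners > 1 ? 1 : 0;

print "high score: $high_score\n";
print "winners: @winners\n";
print "tie: ", ($is_tie ? 'yes' : 'no'), "\n";
```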
total_points
Returns the total number of points assigned as a score across all contexts.
start_time
Returns the start time of the classification.
end_time
Returns the end time of the classification.
AUTHOR
Theron Stanford <shixilun@yahoo.com>, Nathan Glenn <garfieldnate@gmail.com>
COPYRIGHT AND LICENSE
This software is copyright (c) 2021 by Royal Skousen.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.