NAME
AI::Classifier::Text - A convenient class for text classification
VERSION
version 0.03
SYNOPSIS
    my $cl = AI::Classifier::Text->new(classifier => AI::NaiveBayes->new(...));
    my $res = $cl->classify("do cats eat bats?");
    $res = $cl->classify("do cats eat bats?", { new_user => 1 });
    $cl->store('some-file');
    # later
    my $cl = AI::Classifier::Text->load('some-file');
    my $res = $cl->classify("do cats eat bats?");
DESCRIPTION
AI::Classifier::Text combines a lexical analyzer (AI::Classifier::Text::Analyzer by default) and a classifier (such as AI::NaiveBayes) to perform text classification.
This is partially based on AI::TextCategorizer.
ATTRIBUTES
classifier - An object that performs classification of the supplied feature vectors. It must define a classify() method that accepts a hash reference. The return value of AI::Classifier::Text->classify() is the return value of the classifier's classify() method. This attribute must be supplied to the new() method during object creation.
analyzer - The class performing lexical analysis of the text in order to produce a feature vector. This defaults to AI::Classifier::Text::Analyzer.
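The classifier contract described above (an object with a classify() method that accepts a hash reference) can be illustrated with a small stand-in class. This is only a sketch: KeywordClassifier, its keywords argument, and the assumption that a feature vector maps feature names to numeric weights are all hypothetical and not part of this distribution.

```perl
use strict;
use warnings;

# Hypothetical stand-in classifier, not part of AI::Classifier::Text.
package KeywordClassifier;

sub new {
    my ($class, %args) = @_;
    # keywords: hash ref mapping a feature name to the label it votes for
    return bless { keywords => $args{keywords} || {} }, $class;
}

# The one method the classifier attribute requires. It takes a hash
# reference of features (assumed here to map feature name => weight).
sub classify {
    my ($self, $features) = @_;
    my %score;
    for my $name (keys %$features) {
        my $label = $self->{keywords}{$name} or next;
        $score{$label} += $features->{$name};
    }
    # Return the best-scoring label, or undef when nothing matched.
    my ($best) = sort { $score{$b} <=> $score{$a} || $a cmp $b } keys %score;
    return $best;
}

package main;

my $classifier = KeywordClassifier->new(
    keywords => { cats => 'animals', bats => 'animals', rust => 'programming' },
);

# Call the classifier directly with a hand-built feature vector.
my $label = $classifier->classify({ cats => 1, bats => 1, eat => 1 });
print defined $label ? "$label\n" : "no match\n";
```

An object of this kind could then be passed as the classifier argument to new(). Whatever classify() returns here is what AI::Classifier::Text's own classify() would hand back to the caller.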
METHODS
new(classifier => $foo) - Creates a new AI::Classifier::Text object. The classifier argument is mandatory.
classify($document, $features) - Categorize the given document. A lexical analyzer will be used to extract features from $document, and in addition to that the features from the $features hash reference will be added. The return value comes directly from the classifier object's classify method.
SEE ALSO
AI::NaiveBayes(3), AI::Categorizer(3)
AUTHOR
Zbigniew Lukasiak <zlukasiak@opera.com>, Tadeusz Sośnierz <tsosnierz@opera.com>
COPYRIGHT AND LICENSE
This software is copyright (c) 2012 by Opera Software ASA.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.