NAME

AI::Classifier::Text::Analyzer - computing feature vectors from documents

VERSION

version 0.03

SYNOPSIS

use AI::Classifier::Text::Analyzer;

my $analyzer = AI::Classifier::Text::Analyzer->new();

my $features = $analyzer->analyze( 'aaaa http://www.example.com/bbb?xx=yy&bb=cc;dd=ff' );

DESCRIPTION

Computes feature vectors of text using some heuristics and adds word counts (using Text::WordCounter by default).

The object is immutable, but some methods take a second parameter that serves as an accumulator for the features found in the given text.

It uses some specific values and methods that work for our case but are not guaranteed to give good results universally; see the source for details.
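The accumulator convention can be illustrated with a small stand-alone sketch. The add_word_counts function below is hypothetical and not part of the module; it only mirrors the pattern of taking an optional hash reference as the second argument, adding to it, and returning it.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of AI::Classifier::Text::Analyzer):
# counts lowercased words in $text and adds them to the $features hash.
sub add_word_counts {
    my ( $text, $features ) = @_;
    $features ||= {};    # start a fresh vector if none was passed
    $features->{ lc $1 }++ while $text =~ /(\w+)/g;
    return $features;
}

# Accumulate features from two documents into the same hash reference.
my $features = add_word_counts('spam spam eggs');
add_word_counts( 'Eggs and ham', $features );

printf "%s => %d\n", $_, $features->{$_} for sort keys %$features;
```

With the module installed, the same pattern would apply to calls of the form $analyzer->analyze( $document, $features ), where $features carries over what earlier calls found.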

ATTRIBUTES

word_counter: Object with a word_count method that calculates the frequency of words in a text document. Defaults to a Text::WordCounter object.
global_feature_weight: The weight assigned to the computed features of the text document. Defaults to 2.
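Since word_counter only needs to be an object with a word_count method, a custom counter can be swapped in. The following is a sketch under stated assumptions: the package name My::SimpleCounter is hypothetical, and the word_count( $text, $features ) calling convention (text first, accumulator hash reference second) is assumed rather than confirmed here, so check the source before relying on it. The analyzer is only constructed if the module is installed.

```perl
use strict;
use warnings;

# Hypothetical replacement counter; the (text, features) signature is an assumption.
package My::SimpleCounter;

sub new { return bless {}, shift }

sub word_count {
    my ( $self, $text, $features ) = @_;
    $features ||= {};
    # Count only words of at least three characters, case-insensitively.
    $features->{ lc $1 }++ while $text =~ /(\w{3,})/g;
    return $features;
}

package main;

my $counter = My::SimpleCounter->new;

if ( eval { require AI::Classifier::Text::Analyzer; 1 } ) {
    # Plug the counter into the analyzer when the module is available.
    my $analyzer = AI::Classifier::Text::Analyzer->new(
        word_counter          => $counter,
        global_feature_weight => 3,
    );
    my $features = $analyzer->analyze('some example text');
    print join( ', ', sort keys %$features ), "\n";
}
else {
    # Otherwise exercise the counter on its own.
    my $features = $counter->word_count('some example text');
    print join( ', ', sort keys %$features ), "\n";
}
```

Both constructor arguments are optional, so either one can be left out to keep its default.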

METHODS

new(word_counter => $foo, global_feature_weight => 3): Creates a new AI::Classifier::Text::Analyzer object. Both arguments are optional.
analyze($document, $features): Computes the feature vector of the given document and adds it to the initial vector passed in $features, if any.
analyze_urls($document, $features): Computes a vector of special URL-related features of the given text. Currently the NO_URLS, MANY_URLS and REPEATED_URLS features are used.
filter($document): Removes HTML-related parts from the text.
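To make the URL-related features more concrete, here is a rough stand-alone sketch of how flags of this kind could be computed. It is not the module's implementation: the url_flags function, the simple URL regex, and both thresholds (many_per_chars and repeat_limit) are assumptions chosen for illustration, and the values actually used are in the source.

```perl
use strict;
use warnings;

# Hypothetical illustration only; the regex and thresholds are assumptions,
# not the values used by AI::Classifier::Text::Analyzer.
sub url_flags {
    my ( $text, $features, %opt ) = @_;
    $features ||= {};
    my $weight         = $opt{weight}         // 2;      # same default as global_feature_weight
    my $many_per_chars = $opt{many_per_chars} // 100;    # assumed: over one URL per 100 characters
    my $repeat_limit   = $opt{repeat_limit}   // 3;      # assumed: same URL more than 3 times

    # Collect everything that looks like an http(s) URL.
    my @urls = $text =~ m{(https?://[^\s<>"']+)}g;

    $features->{NO_URLS}   = $weight if !@urls;
    $features->{MANY_URLS} = $weight if @urls > length($text) / $many_per_chars;

    # Flag the text if any single URL occurs too often.
    my %seen;
    $seen{$_}++ for @urls;
    $features->{REPEATED_URLS} = $weight
        if grep { $_ > $repeat_limit } values %seen;

    return $features;
}

my $flags = url_flags('plain text without any links');
print join( ', ', sort keys %$flags ), "\n";
```

Each flag is stored with the same weight, which is the role global_feature_weight plays for the analyzer's computed features.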

AUTHOR

Zbigniew Lukasiak <zlukasiak@opera.com>, Tadeusz Sośnierz <tsosnierz@opera.com>

COPYRIGHT AND LICENSE

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.


