NAME
Lingua::JA::TFWebIDF - TF*WebIDF calculator
SYNOPSIS
use Lingua::JA::TFWebIDF;
use feature qw/say/;
use Data::Dumper;
my $tfidf = Lingua::JA::TFWebIDF->new(
appid => $appid,
fetch_df => 1,
Furl_HTTP => { timeout => 3 },
);
say $tfidf->idf($word);
say $tfidf->df($word);
my %tf = (
'自然言語処理' => 9,
'自然言語' => 6,
'自然言語理解' => 4,
'処理' => 5,
'解析' => 4,
);
$tfidf->ng_word(\@ng_words);
say Dumper $tfidf->tfidf($text)->dump;
say Dumper $tfidf->tfidf(\%tf)->dump;
say Dumper $tfidf->tf($text)->dump;
for my $result (@{ $tfidf->tfidf(\%tf)->list(5) })
{
my ($word, $score) = each %{$result};
say "$word: $score";
}
for my $result (@{ $tfidf->tf($text)->list(5) })
{
my ($word, $frequency) = each %{$result};
say "$word: $frequency";
}
DESCRIPTION
Lingua::JA::TFWebIDF calculates TF*WebIDF scores.
Compared with Lingua::JA::TFIDF, this module has the following advantages.
- supports Tokyo Cabinet, Bing API, idf_type option, expires_in option and so on.
- tfidf function accepts \%tf. (This eases the use of other morphological analyzers.)
METHODS
new(%config)
See Lingua::JA::WebIDF.
tfidf( $text || \%tf )
Calculates TF*WebIDF score. If scalar value is set, MeCab separates the value into appropriate morphemes. If you want to use other morphological analyzers, you have to set a hash reference which contains terms and their TF scores.
tf($text)
Calculates TF score.
ng_word(\@ng_words)
Sets NG words.
idf($word)
See Lingua::JA::WebIDF.
df($word)
See Lingua::JA::WebIDF.
AUTHOR
pawa <pawapawa@cpan.org>
SEE ALSO
LICENSE
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.