NAME

Text::Statistics::Latin - performs statistical analyses of corpora

SYNOPSIS

use Text::Statistics::Latin;
&Text::Statistics::Latin::LATIN();

DESCRIPTION

Text::Statistics::Latin writes a seven-column CSV file with one line per token per text. The input is a corpus whose file names follow the pattern '1 (1).txt', '1 (2).txt', ..., '1 (n).txt', that is, 1 \(([1-9]|[1-9][0-9]+)\)\.txt

The columns store the following statistical information:

(1) number of word forms in document d;
(2) number of tokens in d;
(3) Id number of d, i.e., n;
(4) frequency of term t in d;
(5) corpus frequency of t;
(6) document frequency of t (number of documents where t occurs at least once);
(7) t itself, the UTF-8 encoded Latin token string.
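The seven columns can be illustrated on a toy corpus. The following is a minimal sketch, not the module's own implementation: the sub name corpus_rows, the in-memory %corpus hash, the lowercasing, the whitespace tokenisation and the comma separator are all assumptions made for illustration, and "one line per token" is read here as one line per distinct token per document.

```perl
use strict;
use warnings;

# Toy corpus: document id => text (hypothetical data, for illustration only).
my %corpus = (
    1 => 'arma virumque cano',
    2 => 'arma arma cano',
);

# Returns one row [forms, tokens, id, tf, cf, df, term] per distinct term
# per document, mirroring columns (1) to (7) described above.
sub corpus_rows {
    my (%docs) = @_;
    my (%tf, %cf, %df, %n_tokens);

    for my $id (keys %docs) {
        # Assumption: naive lowercasing and whitespace tokenisation.
        my @tokens = split ' ', lc $docs{$id};
        $n_tokens{$id} = @tokens;            # column (2)
        $tf{$id}{$_}++ for @tokens;          # column (4)
        for my $t (keys %{ $tf{$id} }) {
            $cf{$t} += $tf{$id}{$t};         # column (5)
            $df{$t}++;                       # column (6)
        }
    }

    my @rows;
    for my $id (sort { $a <=> $b } keys %docs) {
        my $forms = keys %{ $tf{$id} };      # column (1): distinct word forms
        for my $t (sort keys %{ $tf{$id} }) {
            push @rows, [ $forms, $n_tokens{$id}, $id,
                          $tf{$id}{$t}, $cf{$t}, $df{$t}, $t ];
        }
    }
    return @rows;
}

print join(',', @$_), "\n" for corpus_rows(%corpus);
```

For the toy data, the term "arma" occurs once in document 1 and twice in document 2, so its corpus frequency is 3 and its document frequency is 2.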

The main output file is named '1 (n + 5).txt' and is stored in the same directory as the corpus itself, together with residual files for each input file, with .txu and .txv extensions.

This code was written under CAPES BEX-09323-5.

Methods

Example:

#!/usr/bin/perl
use strict;
use Text::Statistics::Latin;

&Text::Statistics::Latin::LATIN("5"); # 4 files (5 - 1) are analysed.
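Before calling LATIN, it may help to check which files follow the naming pattern given in the DESCRIPTION. The sketch below is not part of the module: the sub name corpus_ids and the sample listing are hypothetical, and the pattern is anchored here so that only whole file names match.

```perl
use strict;
use warnings;

# Naming pattern quoted in the DESCRIPTION, anchored to the whole name.
my $name_re = qr/\A1 \(([1-9]|[1-9][0-9]+)\)\.txt\z/;

# Returns the document numbers of the names that follow the pattern,
# in ascending numeric order.
sub corpus_ids {
    my @names = @_;
    my @ids;
    for my $name (@names) {
        push @ids, $1 if $name =~ $name_re;
    }
    return sort { $a <=> $b } @ids;
}

# Hypothetical directory listing; real names could come from readdir.
my @listing = ('1 (1).txt', '1 (2).txt', '1 (10).txt',
               '1 (0).txt', 'notes.txt', '1 (3).txu');
print join(' ', corpus_ids(@listing)), "\n";
```

With this listing, only '1 (1).txt', '1 (2).txt' and '1 (10).txt' are accepted: the pattern rejects a document number of 0, and the .txu residual file is not a corpus file.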
