NAME
Text::Statistics::GreekAndCoptic - Performs statistical corpora analysis
VERSION
Version 0.05
SYNOPSIS
Text::Statistics::GreekAndCoptic writes a seven-column CSV file with one line for each token in each text. The input is a corpus whose file names follow the pattern '1 (1).txt', '1 (2).txt', ..., '1 (n).txt', that is, 1 \(([1-9]|[1-9][0-9]+)\)\.txt
The columns store the following statistical information:
(1) number of word forms in document d;
(2) number of tokens in d;
(3) ID number of d, i.e., the n in its file name;
(4) frequency of term t in d;
(5) corpus frequency of t;
(6) document frequency of t (number of documents where t occurs at least once);
(7) t, the UTF8 latin coded token string.
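The file-name pattern above can be checked before running the module. The following is a minimal sketch, not part of the module itself: the helper name corpus_file_id is hypothetical, and the \A and \z anchors are an added assumption that the whole file name must match.

```perl
use strict;
use warnings;

# Pattern copied from the description above: '1 (k).txt', k >= 1, no leading zero.
# The \A and \z anchors are an assumption added for this sketch.
my $corpus_name = qr/\A1 \(([1-9]|[1-9][0-9]+)\)\.txt\z/;

# Hypothetical helper: returns the document number k, or undef if the
# name does not follow the corpus naming scheme.
sub corpus_file_id {
    my ($name) = @_;
    return $name =~ $corpus_name ? $1 : undef;
}

for my $name ('1 (1).txt', '1 (12).txt', '1 (0).txt', '2 (1).txt') {
    my $id = corpus_file_id($name);
    printf "%-12s %s\n", $name,
        defined $id ? "document $id" : "not a corpus file";
}
```

A check like this may help catch misnamed files, such as names with a leading zero or a different prefix, before the analysis starts.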
The main output file is named '1 (n + 5).txt' and is stored in the same directory as the corpus itself, together with residual files for each input file, with .txu and .txv extensions.
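Once the output file exists, its rows can be read back into named fields. The sketch below is illustrative only: it assumes the fields are comma-separated and unquoted, which this documentation does not confirm. The key names and the sample row are made up for the example.

```perl
use strict;
use warnings;

# Column order as listed in the SYNOPSIS; the key names are hypothetical.
my @columns = qw(word_forms tokens doc_id term_freq corpus_freq doc_freq token);

# Assumption: fields are comma-separated and unquoted. The limit of 7
# keeps any comma inside the final token field intact.
sub parse_row {
    my ($line) = @_;
    chomp $line;
    my @fields = split /,/, $line, 7;
    return unless @fields == 7;    # skip malformed lines
    my %row;
    @row{@columns} = @fields;
    return \%row;
}

# Made-up sample row, not real module output.
my $row = parse_row("120,450,2,3,7,2,logos\n");
printf "%s: %d of %d tokens in document %d (relative frequency %.4f)\n",
    $row->{token}, $row->{term_freq}, $row->{tokens}, $row->{doc_id},
    $row->{term_freq} / $row->{tokens};
```

If the real output uses a different separator or quoting, a CSV parser such as Text::CSV would be a safer choice than split.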
Example:
use Text::Statistics::GreekAndCoptic;
&greekandcoptic("4"); # 3 (4-1) texts will be analysed.
EXPORT
&greekandcoptic();
AUTHOR
Rodrigo Panchiniak Fernandes, <fernandes at cpan.org>
BUGS
Please report any bugs or feature requests to bug-text-statistics-latin at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=Text-Statistics-GreekAndCoptic. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Text::Statistics::GreekAndCoptic
You can also look for information at:
AnnoCPAN: Annotated CPAN documentation
CPAN Ratings
http://cpanratings.perl.org/d/Text-Statistics-GreekAndCoptic
RT: CPAN's request tracker
http://rt.cpan.org/NoAuth/Bugs.html?Dist=Text-Statistics-GreekAndCoptic
Search CPAN
ACKNOWLEDGEMENTS
Alberto Manuel Brandão Simões
COPYRIGHT & LICENSE
Copyright 2007 Rodrigo Panchiniak Fernandes, all rights reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
This code was written under CAPES BEX-09323-5