NAME

Statistics::RankCorrelation - Compute the rank correlation between two vectors

SYNOPSIS

use Statistics::RankCorrelation;

$c = Statistics::RankCorrelation->new(\@u, \@v);

$n = $c->spearman;
$n = $c->csim;

DESCRIPTION

This module computes the rank correlation coefficient between two sample vectors.

For example, rank correlation is used in the study of musical contour similarity and "sample agreement".

A definition is in order:

Statistical rank: The ordinal number of a value in a list arranged in a specified order (usually decreasing).

PUBLIC METHODS

spearman

$n = $c->spearman;

Spearman rank-order correlation is a nonparametric measure of association based on the rank of the data values.

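The Spearman correlation is a special case of the Pearson product-moment correlation: it is the Pearson coefficient computed on the ranks of the values rather than on the values themselves.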
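As an illustration only, the textbook form of the coefficient for untied ranks, rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), can be sketched as below. The helper name spearman_from_ranks is hypothetical and is not part of this module; the sketch assumes both inputs are already ranks, have the same length, and contain no ties, and it may differ from what the spearman method does internally.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module's API.
# Assumes both array refs hold ranks of equal length n > 1 with no ties.
sub spearman_from_ranks {
    my ($x, $y) = @_;
    my $n = @$x;
    die "vectors must have equal length\n" unless $n == @$y;
    die "need at least two observations\n" unless $n > 1;

    # Sum of squared differences between paired ranks.
    my $sum_d2 = 0;
    for my $i (0 .. $n - 1) {
        my $d = $x->[$i] - $y->[$i];
        $sum_d2 += $d * $d;
    }

    # rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))
    return 1 - (6 * $sum_d2) / ($n * ($n**2 - 1));
}

my $rho = spearman_from_ranks([1, 2, 3, 4], [4, 3, 2, 1]);
print "rho = $rho\n";
```

Identical rankings give 1 and fully reversed rankings give -1 under this formula.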

csim

$n = $c->csim;

Return the "contour similarity index measure", which is a one-dimensional measure of the similarity between two vectors.

This returns a measure in the range [-1..1] and is computed using matrices of binary data representing "higher or lower" values in the original vectors.

Please consult the csim item under the SEE ALSO section.

PRIVATE FUNCTIONS

_rank

$u_ranks = _rank(\@u);

Return an array reference of the ordinal ranks of the given data.

In the case of a tie in the data (identical values) the rank numbers are averaged. An example will help:

data  = [1.0, 2.1, 3.2, 3.2, 3.2, 4.3]
ranks = [1, 2, 12/3, 12/3, 12/3, 6]
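A minimal sketch of tie averaging follows, assuming ranks are assigned in ascending order of value and that tied values share the mean of the rank positions they occupy. The name average_ranks is a hypothetical stand-in; the module's own _rank may order or handle ties differently.

```perl
use strict;
use warnings;

# Hypothetical stand-in for _rank; ascending order is an assumption.
sub average_ranks {
    my ($data) = @_;

    # Indices of the data, sorted by ascending value.
    my @order = sort { $data->[$a] <=> $data->[$b] } 0 .. $#$data;

    my @ranks;
    my $i = 0;
    while ($i < @order) {
        # Extend $j to the end of the run of values tied with position $i.
        my $j = $i;
        $j++ while $j < $#order
            && $data->[ $order[$j + 1] ] == $data->[ $order[$i] ];

        # Sorted positions $i..$j occupy ranks $i+1..$j+1; assign their mean.
        my $mean = ($i + 1 + $j + 1) / 2;
        $ranks[ $order[$_] ] = $mean for $i .. $j;

        $i = $j + 1;
    }
    return \@ranks;
}

my $ranks = average_ranks([10, 20, 20, 30]);
print join(', ', @$ranks), "\n";
```

Here the two tied values occupy rank positions 2 and 3, so each receives 2.5.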

_pad_vectors

($u, $v) = _pad_vectors($u, $v);

Pad the tail of the shorter input vector with zeros until both vectors have the same length. If the vectors are already the same length, neither is changed.
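The padding step can be sketched as follows. The name pad_vectors is a hypothetical stand-in, and working on copies rather than modifying the inputs in place is an assumption of this sketch, not a documented property of _pad_vectors.

```perl
use strict;
use warnings;

# Hypothetical stand-in for _pad_vectors; operates on copies of the inputs.
sub pad_vectors {
    my ($u, $v) = @_;
    my @u = @$u;
    my @v = @$v;

    # Append zeros to whichever vector is shorter; at most one loop runs.
    push @u, 0 while @u < @v;
    push @v, 0 while @v < @u;

    return (\@u, \@v);
}

my ($p, $q) = pad_vectors([3, 1, 2], [5]);
print "@$p | @$q\n";
```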

_correlation_matrix

$matrix = _correlation_matrix($u);

Return the correlation matrix for a single vector.

This function builds a square, binary matrix that records, for each pair of elements, which is the "higher or lower" value within the vector itself.
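One plausible way to build such a matrix is sketched below. The name binary_comparison_matrix is hypothetical, and the cell convention (1 when element j is greater than element i, 0 otherwise, including ties and the diagonal) is an assumption; the module's _correlation_matrix may use a different orientation.

```perl
use strict;
use warnings;

# Hypothetical stand-in for _correlation_matrix.
# Assumed convention: cell [i][j] is 1 when element j is greater than
# element i, and 0 otherwise (ties and the diagonal are 0).
sub binary_comparison_matrix {
    my ($u) = @_;
    my @matrix;
    for my $i (0 .. $#$u) {
        for my $j (0 .. $#$u) {
            $matrix[$i][$j] = $u->[$j] > $u->[$i] ? 1 : 0;
        }
    }
    return \@matrix;
}

my $m = binary_comparison_matrix([2, 1, 3]);
print join(' ', @$_), "\n" for @$m;
```

Because each cell depends only on the relative order of two elements, vectors with the same contour produce the same matrix regardless of their actual magnitudes.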

SEE ALSO

For the csim method:

http://www2.mdanderson.org/app/ilya/Publications/JNMRcontour.pdf

For the other functions:

http://mathworld.wolfram.com/SpearmanRankCorrelationCoefficient.html

http://faculty.vassar.edu/lowry/ch3b.html

http://www.pinkmonkey.com/studyguides/subjects/stats/chap6/s0606801.asp

http://fonsg3.let.uva.nl/Service/Statistics/RankCorrelation_coefficient.html

http://www.statsoftinc.com/textbook/stnonpar.html#correlations

http://software.biostat.washington.edu/~rossini/courses/intro-nonpar/text/Tied_Data.html#SECTION00427000000000000000

TO DO

Implement the tie averaging done in Spearman's R.

Make a comprehensive test suite with a data file for all functions to use.

Implement other rank correlation measures. Here is a nice survey:

http://jeff-lab.queensu.ca/stat/sas/sasman/sashtml/proc/zompmeth.htm

AUTHOR

Gene Boggs <gene@cpan.org>

COPYRIGHT AND LICENSE

Copyright 2003, Gene Boggs

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.