NAME
PDLDM::Rank - Calculates tied ranks of a PDL data matrix
SYNOPSIS
use PDL;
use PDLDM::Rank qw(TiedRank EstimateTiedRank);
my $training_pdl = pdl ([[1,2,3,3,4,4,4,5,6,6], [1,1,1,2,2,4,4,5,6,6]]);
print "training data $training_pdl";
my ($ranked_training_pdl,$duplicates_training_pdl) = TiedRank($training_pdl);
print "ranked training data $ranked_training_pdl";
print "duplicate count in the training data $duplicates_training_pdl";
my $test_pdl = pdl ([[0.5,4,4.5,6.5], [0.2,1,2,2.5]]);
print "test data $test_pdl";
my ($ranked_test_pdl,$unique_test_pdl) = EstimateTiedRank($test_pdl,$training_pdl,$ranked_training_pdl);
print "ranked test data $ranked_test_pdl";
print "is the value unique? $unique_test_pdl";
DESCRIPTION
PDLDM::Rank finds the tied rank values of a given PDL. In the data PDL, the rows should represent the data instances and the columns should represent the attributes.
TiedRank
This returns two PDLs, each the same size as the input PDL. The first contains the tied rank values. The second contains, for each value, the number of instances that share that value. The TiedRank function should produce the same results as the MATLAB tiedrank function.
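The two outputs can be illustrated with a minimal pure-Perl sketch that handles a single attribute as a plain list. It assumes the MATLAB tiedrank convention, where tied values share the average of the ranks they would otherwise occupy. The helper name tied_rank_list is hypothetical and is not part of PDLDM::Rank, and this is not the module's implementation.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of PDLDM::Rank: tied ranks and tie counts
# for one attribute held in a plain Perl list.
sub tied_rank_list {
    my @values = @_;
    # Indices of @values ordered by ascending value.
    my @order = sort { $values[$a] <=> $values[$b] } 0 .. $#values;
    my (@rank, @count);
    my $i = 0;
    while ($i < @order) {
        # Extend $j to the end of the run of equal values starting at $i.
        my $j = $i;
        $j++ while $j < $#order && $values[$order[$j + 1]] == $values[$order[$i]];
        # 1-based positions $i+1 .. $j+1 share their average rank.
        my $average = ($i + $j + 2) / 2;
        for my $k ($i .. $j) {
            $rank[$order[$k]]  = $average;
            $count[$order[$k]] = $j - $i + 1;
        }
        $i = $j + 1;
    }
    return (\@rank, \@count);
}

# First row of the training data from the SYNOPSIS.
my ($rank, $count) = tied_rank_list(1, 2, 3, 3, 4, 4, 4, 5, 6, 6);
print "ranks:  @$rank\n";    # ranks:  1 2 3.5 3.5 6 6 6 8 9.5 9.5
print "counts: @$count\n";   # counts: 1 1 2 2 3 3 3 1 2 2
```

The two values equal to 3 would occupy ranks 3 and 4, so both receive 3.5, and each has a count of 2.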
EstimateTiedRank
In some cases the data are divided into two parts, training and testing (or evaluation). Tied ranks are first evaluated for the training data. It may be inefficient to re-evaluate the tied ranks of the training and testing data together.
EstimateTiedRank finds the lowest nearest rank for the test data. It needs three PDL inputs, in this order: the test data, the training data and the tied ranks of the training data. The tied ranks of the training data are the first value returned by the TiedRank function.
EstimateTiedRank returns two PDL variables, each the same size as the test data PDL. The first variable contains the lowest nearest ranks taken from the tied ranks of the training data. The second variable indicates whether each value is unique, i.e. whether it is absent from the corresponding attribute of the training dataset.
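The idea can be sketched in pure Perl for a single attribute. The text does not define "lowest nearest rank" precisely, so this sketch makes an assumption: the estimate is the tied rank of the largest training value that is less than or equal to the test value, and 0 when the test value is smaller than every training value. The helper name estimate_rank_list is hypothetical, and the module's actual results may differ from this reading.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of PDLDM::Rank.
# Assumption: the "lowest nearest rank" is the rank of the largest training
# value <= the test value, or 0 if no training value is that small.
sub estimate_rank_list {
    my ($test, $train, $train_rank) = @_;
    my (@estimate, @unique);
    for my $value (@$test) {
        my ($best, $rank, $seen) = (undef, 0, 0);
        for my $k (0 .. $#$train) {
            # Record whether the test value occurs in the training data.
            $seen = 1 if $train->[$k] == $value;
            next if $train->[$k] > $value;
            # Keep the largest training value that does not exceed $value.
            if (!defined $best || $train->[$k] > $best) {
                ($best, $rank) = ($train->[$k], $train_rank->[$k]);
            }
        }
        push @estimate, $rank;
        push @unique, $seen ? 0 : 1;
    }
    return (\@estimate, \@unique);
}

# First rows of the training and test data from the SYNOPSIS.
my @train      = (1, 2, 3, 3, 4, 4, 4, 5, 6, 6);
my @train_rank = (1, 2, 3.5, 3.5, 6, 6, 6, 8, 9.5, 9.5);
my ($estimate, $unique) =
    estimate_rank_list([0.5, 4, 4.5, 6.5], \@train, \@train_rank);
print "estimated ranks: @$estimate\n";
print "unique flags:    @$unique\n";
```

Under this assumption, the test value 4.5 takes the rank 6 of the training value 4 and is flagged as unique, while the test value 4 takes the same rank but is not unique because 4 occurs in the training data.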
DEPENDENCIES
This module requires these other modules and libraries:
PDL
SEE ALSO
Please refer to http://pdl.perl.org/ for PDL. PDL is very efficient in terms of memory and execution time.
AUTHOR
Muthuthanthiri B Thilak L Fernando, <thilaklaksiri@yahoo.co.uk>
COPYRIGHT AND LICENSE
Copyright (C) 2015 by Muthuthanthiri B Thilak L Fernando
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.18.2 or, at your option, any later version of Perl 5 you may have available.