NAME
Math::Vector::Real::kdTree - kd-Tree implementation on top of Math::Vector::Real
SYNOPSIS
use Math::Vector::Real::kdTree;
use Math::Vector::Real;
use Math::Vector::Real::Random;
my @v = map Math::Vector::Real->random_normal(4), 1..1000;
my $tree = Math::Vector::Real::kdTree->new(@v);
my $ix = $tree->find_nearest_vector(V(0, 0, 0, 0));
say "nearest vector is $ix, $v[$ix]";
DESCRIPTION
This module implements a kd-Tree data structure in Perl and common algorithms on top of it.
Methods
The following methods are provided:
- $t = Math::Vector::Real::kdTree->new(@points)
-
Creates a new kd-Tree containing the given points.
- $t2 = $t->clone
-
Creates a duplicate of the tree. The two trees will share internal read only data so this method is more efficient in terms of memory usage than others performing a deep copy.
- my $ix = $t->insert($p0, $p1, ...)
-
Inserts the given points into the kd-Tree.
Returns the index assigned to the first point inserted.
- $s = $t->size
-
Returns the number of points inside the tree.
- $p = $t->at($ix)
-
Returns the point at the given index inside the tree.
- $t->move($ix, $p)
-
Moves the point at index
$ix
to the new given position readjusting the tree structure accordingly. - ($ix, $d) = $t->find_nearest_vector($p, $max_d, @but_ix)
- ($ix, $d) = $t->find_nearest_vector($p, $max_d, \%but_ix)
-
Find the nearest vector for the given point
$p
and returns its index and the distance between the two points (in scalar context the index is returned).If
$max_d
is defined, the search is limited to the points within that distanceOptionally, a list of point indexes to be excluded from the search can be passed or, alternatively, a reference to a hash containing the indexes of the points to be excluded.
- @ix = $t->find_nearest_vector_all_internal
-
Returns the index of the nearest vector from the tree.
It is equivalent to the following code (though, it uses a better algorithm):
@ix = map { scalar $t->nearest_vector($t->at($_), undef, $_) } 0..($t->size - 1);
- $ix = $t->find_farthest_vector($p, $min_d, @but_ix)
-
Find the point from the tree farthest from the given
$p
.The optional argument
$min_d
specifies a minimal distance. Undef is returned when not point farthest that it is found.@but_ix
specifies points that should not be considered when looking for the farthest point. - $ix = $t->find_farthest_vector_internal($ix, $min_d, @but_ix)
-
Given the index of a point on the tree this method returns the index of the farthest vector also from the tree.
- @k = $t->k_means_start($n)
-
This method uses the internal tree structure to generate a set of point that can be used as seeds for other
k_means
methods.There isn't any guarantee on the quality of the generated seeds, but the used algorithm seems to perform well in practice.
- @k = $t->k_means_step(@k)
-
Performs a step of the Lloyd's algorithm for k-means calculation.
- @k = $t->k_means_loop(@k)
-
Iterates until the Lloyd's algorithm converges and returns the final means.
- @ix = $t->k_means_assign(@k)
-
Returns for every point in the three the index of the cluster it belongs to.
- @ix = $t->find_in_ball($z, $d, $but)
- $n = $t->find_in_ball($z, $d, $but)
-
Finds the points inside the tree contained in the hypersphere with center
$z
and radius$d
.In scalar context returns the number of points found. In list context returns the indexes of the points.
If the extra argument
$but
is provided. The point with that index is ignored. - @ix = $t->ordered_by_proximity
-
Returns the indexes of the points in an ordered where is likely that the indexes of near vectors are also in near positions in the list.
k-means
The module can be used to calculate the k-means of a set of vectors as follows:
# inputs
my @v = ...; my $k = ...;
# k-mean calculation
my $t = Math::Vector::Real::kdTree->new(@v);
my @means = $t->k_means_start($k);
@means = $t->k_means_loop(@means);
@assign = $t->k_means_assign(@means);
my @cluster = map [], 1..$k;
for (0..$#assign) {
my $cluster_ix = $assign[$_];
my $cluster = $cluster[$cluster_ix];
push @$cluster, $t->at($_);
}
use Data::Dumper;
print Dumper \@cluster;
SEE ALSO
COPYRIGHT AND LICENSE
Copyright (C) 2011-2014 by Salvador Fandiño <sfandino@yahoo.com>
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.12.3 or, at your option, any later version of Perl 5 you may have available.