NAME
PDL::NDBin::Actions_PP - XS actions for PDL::NDBin
DESCRIPTION
This module contains the implementation of the core loop of the action classes in the PDL::NDBin::Action namespace. The subroutines are not intended to be called by the user.
It's numerically well-behaved: out() is always of the order of magnitude of the values themselves, unlike the sum of the values, which grows very large as the number of elements grows large.
The subtraction in() - out() guarantees that the computation will be done in the correct type (i.e., double instead of the type of the input).
Requires only one temporary.
Requires only one pass over the data.
I used to give the output array type float+, but that creates more problems than it solves. So now, averages are always computed in type double. Besides being the default type in PDL and the 'natural' floating-point type in C, it also makes the implementation easier.], Code => ' loop(n) %{ PDL_IF_BAD(if ($ISBAD(idx()) || $ISBAD(in())) continue;,) register PDL_Indx j = $idx(); $out(m => j) += ( $in() - $out(m => j) ) / ++( $count(m => j) ); %} ', );
# _istddev_loop # pp_def ( '_istddev_loop', Pars => "in(n); indx idx(n); double [o] out(m); indx [o] count(m); double [o] avg(m)", OtherPars => "PDL_Indx msize => m", HandleBad => 1, Doc => q[Compute the standard deviation of the elements in each bin.
Note that we compute the sample standard deviation, not an estimate of the population standard deviation (which differs by a factor).
Credit for the algorithm goes to ashawley on commandlinefu.com;
awk '{ delta = $1 - avg; avg += delta / NR; mean2 += delta * ($1 - avg) } END { print sqrt(mean2 / NR) }'
This is a wonderful solution solving many of the problems with more naive implementations:
It's numerically well-behaved.
The subtractions guarantee that the computations will be done in the correct type (i.e., double instead of the type of the input).
Requires only two temporaries (!).
Requires only one pass over the data.
I used to give the output array type float+, but that creates more problems than it solves. So now, standard deviations are always computed in type double. Besides being the default type in PDL and the 'natural' floating-point type in C, it also makes the implementation easier.], Code => ' loop(n) %{ PDL_IF_BAD(if ($ISBAD(idx()) || $ISBAD(in())) continue;,) register PDL_Indx j = $idx(); double delta = $in() - $avg(m => j); $avg(m => j) += delta / ++( $count(m => j) ); $out(m => j) += delta * ( $in() - $avg(m => j) ); %} ', );
# _istddev_post # pp_def( '_istddev_post', Pars => "in(); indx count(); [o] out()", HandleBad => 1, Inplace => [ 'in' ], Doc => 'Finalization for _istddev_loop().', Code => ' char anybad = 0; broadcastloop %{ if( ! $count() ) { anybad = 1; $SETBAD(out()); } else { $out() = sqrt( $in() / $count() ); } %} if (anybad) $PDLSTATESETBAD(out); ', );
pp_addpm( <<'EOD' );
AUTHOR
Edward Baudrez <ebaudrez@cpan.org>
COPYRIGHT AND LICENSE
This software is copyright (c) 2014 by Edward Baudrez.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.