NAME
DBD::SQLite::Cookbook - The DBD::SQLite Cookbook
DESCRIPTION
This is the DBD::SQLite cookbook.
It is intended to provide a place to keep a variety of functions and formals for use in callback APIs in DBD::SQLite.
Variance
This is a simple aggregate function which returns a variance. It is adapted from an example implementation in pysqlite.
package variance;
sub new { bless [], shift; }
sub step {
my ( $self, $value ) = @_;
push @$self, $value;
}
sub finalize {
my $self = $_[0];
my $n = @$self;
# Variance is NULL unless there is more than one row
return undef unless $n || $n == 1;
my $mu = 0;
foreach my $v ( @$self ) {
$mu += $v;
}
$mu /= $n;
my $sigma = 0;
foreach my $v ( @$self ) {
$sigma += ($x - $mu)**2;
}
$sigma = $sigma / ($n - 1);
return $sigma;
}
# NOTE: If you use an older DBI (< 1.608),
# use $dbh->func(..., "create_aggregate") instead.
$dbh->sqlite_create_aggregate( "variance", 1, 'variance' );
The function can then be used as:
SELECT group_name, variance(score)
FROM results
GROUP BY group_name;
Variance (Memory Efficient)
A more efficient variance function, optimized for memory usage at the expense of precision:
package variance2;
my $sum = 0;
my $count = 0;
my %hash;
sub new { bless [], shift; }
sub step {
my ( $self, $value ) = @_;
# by truncating and hashing, we can comsume many more data points
$value = int($value); # change depending on need for precision
# use sprintf for arbitrary fp precision
if (defined $hash{$value}) {
$hash{$value}++;
} else {
$hash{$value} = 1;
}
$sum += $value;
$count++;
}
sub finalize {
my $self = $_[0];
# Variance is NULL unless there is more than one row
return undef unless $count > 1;
# calculate avg
my $mu = $sum / $count;
my $sigma = 0;
foreach my $h (keys %hash) {
$sigma += (($h - $mu)**2) * $hash{$h};
}
$sigma = $sigma / ($count - 1);
return $sigma;
}
The function can then be used as:
SELECT group_name, variance2(score)
FROM results
GROUP BY group_name;
Variance (Highly Scalable)
A third variable implementation, designed for arbitrarily large data sets:
package variance;
my $mu = 0;
my $count = 0;
my $S = 0
sub new { bless [], shift; }
sub step {
my ( $self, $value ) = @_;
$count++;
$delta = $value - $mu;
$mu = $mu + $delta/$count
$S = $S + $delta*($value - $mu);
}
sub finalize {
my $self = $_[0];
return $S / ($count - 1);
}
The function can then be used as:
SELECT group_name, variance3(score)
FROM results
GROUP BY group_name;
SUPPORT
Bugs should be reported via the CPAN bug tracker at
http://rt.cpan.org/NoAuth/ReportBug.html?Queue=DBD-SQLite
TO DO
* Add more and varied cookbook recipes, until we have enough to turn them into a seperate CPAN distribution.
* Create a series of tests scripts that validate the cookbook recipies.
AUTHOR
Adam Kennedy <adamk@cpan.org>
COPYRIGHT
Copyright 2009 Adam Kennedy.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
The full text of the license can be found in the LICENSE file included with this module.