NAME
Statistics::Krippendorff - Calculate Krippendorff's alpha
VERSION
Version 0.01
SYNOPSIS
use experimental qw( signatures );
use Statistics::Krippendorff ();
my @units = ({coder1 => 1, coder2 => 1},
             {coder1 => 2, coder2 => 2, coder3 => 1},
             {coder2 => 3, coder3 => 2});
my $sk = 'Statistics::Krippendorff'->new(units => \@units);
my $alpha1 = $sk->alpha;
$sk->delta(\&Statistics::Krippendorff::delta_nominal); # Same as default.
my $alpha2 = $sk->alpha;
my $ski = 'Statistics::Krippendorff'->new(
    units => [[1, 1], [2, 2, 1], [undef, 3, 2]],
    delta => sub ($, $v0, $v1) { ($v0 - $v1) ** 2 });
my $alpha_interval = $ski->alpha;
METHODS
new
my $sk = 'Statistics::Krippendorff'->new(
units => \@units,
delta => \&Statistics::Krippendorff::delta_nominal);
The constructor. It accepts the following named arguments:
units
An array reference of units. All units of analysis must be of the same type, which can be either of the following:
Each unit is a hash reference of the form
{ coder1 => 'value1', coder3 => 'value2', ... }
Each unit is an array reference of the form
['value1', undef, 'value2']
where the coder is encoded by the position in the array and missing data are indicated by undef.
In both cases, there must be at least two values in each unit. If you want to validate this precondition, call is_valid.
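The two unit representations described above carry the same information. The following is a minimal sketch of converting the hash form into the array form; units_to_arrays is a hypothetical helper written for this illustration and is not part of the module.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Statistics::Krippendorff): turn
# hash-style units into array-style units. The order of @$coders decides
# which array position each coder gets.
sub units_to_arrays {
    my ($coders, $units) = @_;
    return [ map {
        my $unit = $_;
        [ map { $unit->{$_} } @$coders ]    # a missing coder becomes undef
    } @$units ];
}

my @units = ({coder1 => 1, coder2 => 1},
             {coder1 => 2, coder2 => 2, coder3 => 1},
             {coder2 => 3, coder3 => 2});

my $arrays = units_to_arrays([qw( coder1 coder2 coder3 )], \@units);
# $arrays is now [[1, 1, undef], [2, 2, 1], [undef, 3, 2]]
```

A trailing undef, as in the first unit here, should not matter, since it only marks a coder who gave no value.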
delta
An optional argument defaulting to delta_nominal. You can specify any function f($self, $v1, $v2) that compares the two values $v1 and $v2 and returns their distance (a number between 0 and 1). Several common methods are predefined:
delta_nominal
Used for nominal data, i.e. labels with no ordering.
delta_ordinal
Used for numeric values that are ordered, but can't be used in mathematical operations, for example the number of stars in a movie rating system (we don't say that the distance from one star to two stars is the same as the distance from three stars to four stars). See the implementation on why $self is needed as a parameter to delta.
delta_interval
Used for numeric values that can be used in mathematical operations.
delta_ratio
Used for non-negative numeric values measured from a true zero (think temperature in kelvins).
delta_jaccard
This can be used when coders can specify more than one value. Join the values with commas; the Jaccard index is then computed as intersection_size / union_size. If you sort the values before joining them, the expected coincidence matrix is smaller and the algorithm runs faster, but the resulting coefficient should be the same.
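To make the metrics above more concrete, here is a rough sketch of the textbook difference functions written as plain subs with the f($self, $v1, $v2) calling convention. The sketch_* names are hypothetical and the bodies are not copied from the module's source. In particular, turning the Jaccard index into a distance as one minus the index is an assumption. The ordinal metric is left out because it depends on value frequencies held in $self.

```perl
use strict;
use warnings;

# Illustrative subs only; the first argument ($self) is ignored here.
sub sketch_nominal  { my (undef, $v1, $v2) = @_; $v1 eq $v2 ? 0 : 1 }
sub sketch_interval { my (undef, $v1, $v2) = @_; ($v1 - $v2) ** 2 }

sub sketch_ratio {
    my (undef, $v1, $v2) = @_;
    return 0 if $v1 == $v2;    # also avoids 0/0 when both values are 0
    return (($v1 - $v2) / ($v1 + $v2)) ** 2;
}

# Assumed Jaccard distance on comma-joined sets:
# 1 - intersection_size / union_size.
sub sketch_jaccard {
    my (undef, $v1, $v2) = @_;
    my %set1  = map { $_ => 1 } split /,/, $v1;
    my %set2  = map { $_ => 1 } split /,/, $v2;
    my %union = (%set1, %set2);
    return 0 unless %union;
    my $common = grep { $set2{$_} } keys %set1;
    return 1 - $common / scalar keys %union;
}

print sketch_ratio(undef, 1, 3), "\n";            # prints 0.25
print sketch_jaccard(undef, 'a,b', 'b,a'), "\n";  # prints 0
```

Because sketch_jaccard splits the strings into sets, 'a,b' and 'b,a' compare as equal. Sorting before joining would additionally make them the same string, which matches the remark above about a smaller matrix.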
alpha
my $alpha = $sk->alpha;
Returns Krippendorff's alpha.
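For readers who want to see what the coefficient measures, the following is a minimal stand-alone sketch of the usual coincidence-matrix formulation, alpha = 1 - D_o / D_e. The function sketch_alpha is hypothetical and is not the module's implementation. It takes hash-style units and a two-argument difference function, which differs from the module's three-argument convention.

```perl
use strict;
use warnings;

# Hypothetical stand-alone function, not the module's code.
# $units: array ref of hash refs (coder => value); $delta: f($v1, $v2).
sub sketch_alpha {
    my ($units, $delta) = @_;
    my (%coincidence, %freq);
    for my $unit (@$units) {
        my @values = grep { defined $_ } values %$unit;
        next if @values < 2;               # unpairable units are skipped
        my $weight = 1 / (@values - 1);
        for my $i (0 .. $#values) {
            for my $j (0 .. $#values) {
                next if $i == $j;          # pair values of different coders
                $coincidence{ $values[$i] }{ $values[$j] } += $weight;
            }
        }
    }
    # Marginal totals per value and the grand total n.
    my $n = 0;
    for my $c (keys %coincidence) {
        $freq{$c} += $_ for values %{ $coincidence{$c} };
        $n += $freq{$c};
    }
    my ($observed, $expected) = (0, 0);
    for my $c (keys %freq) {
        for my $k (keys %freq) {
            my $d = $delta->($c, $k);
            $observed += ($coincidence{$c}{$k} // 0) * $d;
            $expected += $freq{$c} * $freq{$k} * $d;
        }
    }
    return unless $expected;   # alpha is undefined without any variation
    # 1 - D_o / D_e, with D_o = observed / n, D_e = expected / (n * (n - 1)).
    return 1 - ($n - 1) * $observed / $expected;
}

my @units = ({coder1 => 1, coder2 => 1},
             {coder1 => 2, coder2 => 2, coder3 => 1},
             {coder2 => 3, coder3 => 2});
my $nominal  = sketch_alpha(\@units, sub { $_[0] eq $_[1] ? 0 : 1 });
my $interval = sketch_alpha(\@units, sub { ($_[0] - $_[1]) ** 2 });
printf "%.3f %.3f\n", $nominal, $interval;    # prints "0.200 0.500"
```

The numbers in the last comment are what this sketch yields for the data from the SYNOPSIS; they are not taken from the module's test suite.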
delta
$sk->delta(sub($self, $v1, $v2) {});
The difference function used to calculate the alpha. You can specify it in the constructor (see above), but you can also change it to something else later.
is_valid
print "OK" if $sk->is_valid;
Checks that each unit has at least two responses. If you use the hash representation of a unit, its values must always be defined.
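The precondition can be pictured with a short stand-alone check. This is only a sketch of one reading of the rule above: sketch_is_valid is a hypothetical function, and it assumes that an undefined value inside a hash unit makes the data invalid, while an undef inside an array unit just marks a missing response.

```perl
use strict;
use warnings;

# Hypothetical stand-alone check, not the module's own is_valid.
sub sketch_is_valid {
    my ($units) = @_;
    for my $unit (@$units) {
        my $is_hash = ref($unit) eq 'HASH';
        my @values  = $is_hash ? values %$unit : @$unit;
        my $defined = grep { defined $_ } @values;
        # Assumption: hash units may not contain undefined values at all.
        return 0 if $is_hash && $defined != @values;
        # Every unit needs at least two actual responses.
        return 0 if $defined < 2;
    }
    return 1;
}

print "OK\n" if sketch_is_valid([[1, 1], [2, 2, 1], [undef, 3, 2]]);
```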
frequency
my $freq = $sk->frequency('val1');
Returns the frequency of the given value.
pairable_values
Returns the total number of all pairable values (i.e. the sum of all frequencies).
vals
Returns a sorted list of all the possible values.
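The three accessors above can be illustrated by counting values by hand. The snippet below is a sketch that does not call the module; it assumes that a value's frequency is simply the number of times it occurs in units with at least two responses, and it uses Perl's default string sort, which may differ from the ordering the module applies.

```perl
use strict;
use warnings;

my @units = ([1, 1], [2, 2, 1], [undef, 3, 2]);

# Count each value, looking only at units with at least two responses.
my %frequency;
for my $unit (@units) {
    my @values = grep { defined $_ } @$unit;
    next if @values < 2;
    $frequency{$_}++ for @values;
}

# The number of pairable values is the sum of all frequencies.
my $pairable = 0;
$pairable += $_ for values %frequency;

# All distinct values, in default (string) sort order.
my @vals = sort keys %frequency;

print "$frequency{1} $frequency{2} $frequency{3}\n";   # prints "3 3 1"
print "$pairable\n";                                   # prints 7
print "@vals\n";                                       # prints "1 2 3"
```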
AUTHOR
E. Choroba, <choroba at cpan.org>
BUGS
Please report any bugs or feature requests to https://github.com/choroba/statistics-krippendorff/issues, via e-mail to bug-statistics-krippendorff at rt.cpan.org, or through the web interface at https://rt.cpan.org/NoAuth/ReportBug.html?Queue=Statistics-Krippendorff. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
SUPPORT
You can find documentation for this module with the perldoc command.
perldoc Statistics::Krippendorff
You can also look for information at:
GitHub (report bugs here)
CPAN Ratings
Search CPAN
RT: CPAN's request tracker (you can report bugs here, too)
https://rt.cpan.org/NoAuth/Bugs.html?Dist=Statistics-Krippendorff
ACKNOWLEDGEMENTS
Implementation inspired by Wikipedia; additional tests taken from https://www.infoamerica.org/documentos_pdf/kripen.pdf.
LICENSE AND COPYRIGHT
This software is Copyright (c) 2025 by E. Choroba.
This is free software, licensed under:
The Artistic License 2.0 (GPL Compatible)