NAME
Statistics::SDT - Signal detection theory (SDT) measures of sensitivity and response-bias
SYNOPSIS
The following is based on example data from Stanislaw & Todorov (1999) and Alexander (2006), with which the module's results agree.
use Statistics::SDT 0.033;
my $sdt = Statistics::SDT->new(
correction => 1,
precision_s => 2,
);
$sdt->init(
hits => 50,
signal_trials => 50, # or misses => 0,
false_alarms => 17,
noise_trials => 25, # or correct_rejections => 8
); # or pass these to new(), and/or update any of them via a hashref given as the 2nd argument to the methods below
printf("Hit rate = %s\n", $sdt->rate('h') ); # .99
printf("False-alarm rate = %s\n", $sdt->rate('f') ); # .68
printf("Miss rate = %s\n", $sdt->rate('m') ); # .00
printf("Correct-rej'n rate = %s\n", $sdt->rate('c') ); # .32
printf("Sensitivity d' = %s\n", $sdt->sens('d') ); # 1.86
printf("Sensitivity Ad' = %s\n", $sdt->sens('Ad') ); # 0.91
printf("Sensitivity A' = %s\n", $sdt->sens('A') ); # 0.82
printf("Bias beta = %s\n", $sdt->bias('b') ); # 0.07
printf("Bias logbeta = %s\n", $sdt->bias('log') ); # -2.60
printf("Bias c = %s\n", $sdt->bias('c') ); # -1.40
printf("Bias Griers B'' = %s\n", $sdt->bias('g') ); # -0.91
printf("Criterion k = %s\n", $sdt->crit() ); # -0.47
printf("Hit rate via d & c = %s\n", $sdt->dc2hr() ); # .99
printf("FAR via d & c = %s\n", $sdt->dc2far() ); # .68
printf("LogBeta via d & c = %s\n", $sdt->dc2logbeta() ); # -2.60
# If the number of alternatives is greater than 2, there are two method options:
printf("JAlex. d_fc = %.2f\n", $sdt->sensitivity('f' => {hr => .866, states => 3, correction => 0, method => 'alexander'})); # 2.00
printf("JSmith d_fc = %.2f\n", $sdt->sensitivity('f' => {hr => .866, states => 3, correction => 0, method => 'smith'})); # 2.05
DESCRIPTION
Signal Detection Theory (SDT) measures of sensitivity and response-bias, e.g., d', A', c.
KEY NAMED PARAMS
The following named parameters must be given as a hash or hash-reference, either to the new constructor, to init, or to each measure-function. To calculate the hit-rate, supply (i) the counts of hits and signal_trials, or (ii) the counts of hits and misses, or (iii) the counts of signal_trials and misses. To calculate the false-alarm-rate, supply (i) the counts of false_alarms and noise_trials, or (ii) the counts of false_alarms and correct_rejections, or (iii) the counts of noise_trials and correct_rejections. Alternatively, supply the hit-rate and false-alarm-rate directly (as hr and far). Or see dc2hr and dc2far if you already have the measures and want to get back to the rates.
- hits
-
The number of hits.
- false_alarms
-
The number of false alarms.
- signal_trials
-
The number of signal trials. The hit-rate is derived by dividing the number of hits by the number of signal trials.
- noise_trials
-
The number of noise trials. The false-alarm-rate is derived by dividing the number of false-alarms by the number of noise trials.
- states
-
The number of response states, or "alternatives", "options", etc. Default = 2 (for the classic signal-detection situation of discriminating between signal+noise and noise-only). If the number of alternatives is greater than 2, then, when calling sensitivity, d' is estimated by Smith's (1982) method by default, or by Alexander's method if method is set to alexander; see forced_choice.
- correction
-
Supply a rich-boolean to indicate whether or not to correct the number of hits and false-alarms when the hit-rate or false-alarm-rate equals 0 or 1 (due, e.g., to strong inducements against false-alarms, or easy discrimination between signals and noise). This is relevant to all functions that make use of the inverse phi function (all except the aprime option with sensitivity and the griers option with bias). As ndtri dies with an error if it is given 0 or 1, a correction is applied by default.
If correction is set to zero, no correction is made in calculating the rates. This should only be used when (1) you are using the parametric measures and are absolutely sure that the rates are not at the extremes of 0 and 1; or (2) you will only use the nonparametric algorithms (aprime and griers).
If correction is set to 1, extreme rates (of 0 and 1) are replaced by values based on the number of signal/noise trials, moderated by a value of 0.5. Specifically, where n = number of signal or noise trials: 0 is replaced with 0.5 / n; 1 is replaced with (n - 0.5) / n. This is the most common method of handling extreme rates (Stanislaw and Todorov, 1999), but it might bias sensitivity measures and not be as satisfactory as the loglinear transformation applied to all hits and false-alarms, as follows.
If correction is set to greater than 1, the loglinear transformation is applied, i.e., 0.5 is added to both the number of hits and false-alarms, and 1 is added to the number of signal and noise trials. This adjustment is made irrespective of the extremity of the rates themselves.
If no value is defined for correction, any rates that equal 0 or 1 are corrected by the correction = 1 method, so as to avoid errors thrown by the ndtri function.
- precision_s
-
Precision (n decimal places) of any of the statistics. Default = 0, which means no rounding: values are returned at full precision.
- method
-
Method for estimating d' when number of states/alternatives is greater than 2. Default value is smith; otherwise alexander; see forced_choice for application and description of these methods.
- hr
-
The hit-rate. Instead of passing the number of hits and signal trials, give the hit-rate directly - but, if doing so, ensure the rate does not equal zero or 1 in order to avoid errors thrown by the inverse-phi function (which will be given as "ndtri domain error").
- far
-
This is the false-alarm-rate. Instead of passing the number of false alarms and noise trials, give the false-alarm-rate directly - but, if doing so, ensure the rate does not equal zero or 1 in order to avoid errors thrown by the inverse-phi function (which will be given as "ndtri domain error").
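The three correction levels described for the correction parameter can be sketched in plain Perl as follows. This is only an illustration of the documented rules, not the module's own code; the function name corrected_rate is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Statistics::SDT): derive a rate from a
# count and a trial total, applying the documented correction levels.
sub corrected_rate {
    my ($count, $n, $correction) = @_;
    $correction = 1 unless defined $correction;    # documented default

    # correction > 1: loglinear transformation, applied to every rate
    return ($count + 0.5) / ($n + 1) if $correction > 1;

    my $rate = $count / $n;
    return $rate if !$correction;                  # 0: no correction at all

    # correction == 1: replace only the extreme rates of 0 and 1
    return 0.5 / $n        if $rate == 0;
    return ($n - 0.5) / $n if $rate == 1;
    return $rate;
}

printf "%.2f\n", corrected_rate(50, 50, 1);    # 0.99
printf "%.2f\n", corrected_rate(17, 25, 1);    # 0.68
printf "%.4f\n", corrected_rate(50, 50, 2);    # 0.9902
```

With the SYNOPSIS data (50 hits in 50 signal trials), the default correction gives the hit-rate of .99 shown there, rather than an uncorrected rate of 1 that would make ndtri fail.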
METHODS
new
Creates the class object that holds the values of the parameters, as above, and accesses the following methods, without having to resubmit all the values.
As well as holding the values of the parameters submitted to it, the class-object returned by new will hold two further attributes: hr, the hit-rate, and far, the false-alarm-rate. You can supply the hit-rate and false-alarm-rate themselves, but ensure that they do not equal zero or 1 in order to avoid errors thrown by the inverse-phi function. The calculation of the hit-rate and false-alarm-rate by the module corrects for this limitation; see the notes on the correction parameter, above.
init
$sdt->init(
hits => integer,
misses => ?integer,
false_alarms => integer,
correct_rejections => ?integer,
signal_trials => integer (>= hits), # or will be calculated from hits and misses
noise_trials => integer (>= false_alarms), # or will be calculated from false_alarms and correct_rejections
hr => probability 0 - 1,
far => probability 0 - 1,
correction => 0|1|2 (default = 1),
states => integer >= 2 (default = 2),
precision_s => integer (default = 0),
method => undef|smith|alexander (default = undef)
)
Instead of sending the number of hits, signal-trials, etc., with every call to the measure-functions, or creating a new class object for every set of data, initialise the class object with these values, as named parameters (key => value pairs). This method is called by new if you pass the values to it in construction. The hit-rate and false-alarm-rate are always calculated anew from the hits and signal trials, and the false-alarms and noise trials, respectively, unless you send a value for one or the other, or both (as hr and far), in a call to init.
Each init replaces the values only of those attributes that you pass to it; any values set in previous calls to init are retained for those attributes that you do not set. If this is not what you want, and you actually want everything reset, first use clear.
Optionally, the method also initialises any values you give it for states, correction, precision_s and method. If you have already set these values, and you do not set them again in another call to init, the previous values will be retained.
clear
$sdt->clear()
Sets all attributes to undef: hits, false_alarms, signal_trials, noise_trials, hr, far, states, correction, and method.
rate
$sdt->rate('hr|far|mr|crr') # scalar string to return the indicated rate
$sdt->rate(hr => 'prob.', far => 'prob.', mr => 'prob.', crr => 'prob.') # one or more key => value pairs to set the rate
$sdt->rate('h' => {signal_trials => integer, hits => integer}) # or misses instead of hits
$sdt->rate('f' => {noise_trials => integer, false_alarms => integer}) # or correct_rejections instead of false_alarms
$sdt->rate('m' => {signal_trials => integer, misses => integer}) # or hits instead of misses
$sdt->rate('c' => {noise_trials => integer, correct_rejections => integer}) # or false_alarms instead of correct_rejections
Generic method to get or set any rate.
To get a rate, pass only a string that indicates the rate: hit, false-alarm, miss or correct-rejection. Only the first letter is checked, so any abbreviation will do. The rate is returned to the precision indicated by the present value of precision_s, if any.
To set a rate, either give the actual probability as key => value pairs, or send a hashref giving sufficient info to calculate the rate (if this has not already been sent to init or one of the measure-methods).
Also performs any required or requested corrections, depending on the present value of correction.
Unless the values of the rates are directly given, they will be calculated from the presently sent counts and trial-numbers, or whatever has been cached of these values. For the hit-rate, there must be a value for hits and signal_trials, and for the false-alarm-rate, there must be a value for false_alarms and noise_trials. If these values are not sent, they will be taken from any prior value, unless this has been cleared or never existed, in which case expect a croak.
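As a rough illustration of how a rate can be formed from any two of the three related counts, here is a minimal sketch in plain Perl. The function rate_from_counts and its keys (count, complement, total) are hypothetical and are not part of the module's interface; for the hit-rate they stand for hits, misses and signal_trials, and for the false-alarm-rate they stand for false_alarms, correct_rejections and noise_trials.

```perl
use strict;
use warnings;

# Hypothetical helper: a rate from any two of count, complement and total,
# where total = count + complement.
sub rate_from_counts {
    my (%arg) = @_;
    my ($count, $comp, $total) = @arg{qw(count complement total)};

    # Fill in whichever of total or count is missing from the other two
    $total = $count + $comp if !defined $total && defined $count && defined $comp;
    $count = $total - $comp if !defined $count && defined $total && defined $comp;

    die "Need two of: count, complement, total\n"
        unless defined $count && defined $total;
    die "total must be positive\n" unless $total > 0;
    return $count / $total;
}

# False-alarm-rate from the SYNOPSIS data, three equivalent ways
printf "%.2f\n", rate_from_counts(count => 17, total      => 25);    # 0.68
printf "%.2f\n", rate_from_counts(count => 17, complement => 8);     # 0.68
printf "%.2f\n", rate_from_counts(total => 25, complement => 8);     # 0.68
```

The sketch dies when fewer than two of the values are available, which loosely mirrors the croak described above.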
sensitivity
$s = $sdt->sensitivity('dprime|dforcedchoice|darea|aprime') # based on values of the measure variables already inited or otherwise set
$s = $sdt->sensitivity('dprime' => { signal_trials => integer}) # update any of the measure variables
Aliases: sens, discriminability
Get one of the sensitivity measures, as indicated by the first argument string, optionally updating any of the measure variables and options with a subsequent hashref. The measures are as follows, accessed by giving the name (or at least its first two letters) as the first argument.
- dprime
-
Returns the index of sensitivity, or discrimination, d' (d prime), found by subtracting the z-score that corresponds to the false-alarm rate (far) from the z-score that corresponds to the hit rate (hr):
d' = phi^-1(hr) - phi^-1(far)
In this way, sensitivity is measured in standard deviation units, larger positive values indicating greater sensitivity. If both the hit-rate and false-alarm-rate are either 0 or 1, then sensitivity returns 0. A value of 0 indicates no sensitivity to the presence of the signal, i.e., it cannot be discriminated from noise. Values less than 0 indicate a lack of sensitivity that might result from a consistent, state-specific "mix-up" or inhibition of responses.
If there are more than two states (not only signal-plus-noise and noise-only), then d' will be estimated by the following.
- forced_choice
-
An estimate of d' based on the percent correct in a forced-choice task with any number of alternatives. This method is automatically called via sensitivity if the value of states is greater than 2. Only for this condition is it not necessary to calculate the false-alarm rate; the hit-rate is formed, as usual, as the count of hits divided by signal_trials.
At least two methods are available to estimate d' when states > 2; accordingly, there is the option method, set either in init or sensitivity or otherwise. Its default value is smith (this is the method cited by Stanislaw & Todorov (1999)); otherwise, you can use the more generally applicable alexander method.
Smith (1982) method: satisfies "the 2% bound for all M [states] and all percentiles and, except for M = 3 or 4, satisfies a 1% error bound". Unlike the alexander method, the specific algorithm is dependent on the size of states.
For n states < 12:
d' = K_M * log( ( (n - 1) * hr ) / ( 1 - hr ) )
where
K_M = .86 - .085 * log(n - 1)
and log is the natural logarithm.
If n >= 12,
d' = A + B * phi^-1(hr)
where
A = (-4 + sqrt(16 + 25 * log(n - 1))) / 3
and
B = sqrt( (log(n - 1) + 2) / (log(n - 1) + 1) )
Alexander (2006/1990) method: "gives values of d' with an error of less than 2% (mostly less than 1%) from those obtained by integration for the range d' = 0 (or 1% correct for n [states] > 1000) to 75% correct and an error of less than 4% up to 95% correct for n up to at least 10000, and slightly greater maximum errors for n = 100000. This approximation is comparable to the accuracy of Elliott's table (0.02 in proportion correct) but can be used for any n." (Elliott's table being that in Swets, 1964, pp. 682-683). The estimate is given by:
d' = ( phi^-1(hr) - phi^-1(1/n) ) / A_n
where n is the number of states (or "alternatives", size of the "alphabet", etc.), and A_n is estimated by:
A_n = 1 - 1 / (1.93 + 4.75 * log10(n) + .63 * [log10(n)]^2)
- aprime
-
Returns the nonparametric index of sensitivity, A'.
Ranges from 0 to 1. Values greater than 0.5 indicate positive discrimination (1 = perfect performance); values less than 0.5 indicate a failure of discrimination (perhaps due to consistent "mix-up" or inhibition of state-specific responses); and a value of 0.5 indicates no sensitivity to the presence of the signal, i.e., it cannot be discriminated from noise.
- adprime
-
Returns Ad', the area under the receiver-operator-characteristic (ROC) curve, equalling the proportion of correct responses for the task as a two-alternative forced-choice task.
If both the hit-rate and false-alarm-rate are either 0 or 1, then sensitivity with this argument returns 0.5.
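The sensitivity measures above can be sketched in plain Perl as follows. This is a hedged illustration rather than the module's code, and it rests on several assumptions. The function names are hypothetical. phi and inv_phi are stand-ins for the ndtr and ndtri functions of Math::Cephes, built from POSIX erfc (exported from Perl 5.22 onward) and a simple bisection. The A' and Ad' formulas are taken from Stanislaw & Todorov (1999), since they are not spelled out in this document. A_n for the Alexander method is written as 1 - 1 / (...), the form that reproduces the SYNOPSIS value of 2.00.

```perl
use strict;
use warnings;
use POSIX qw(erfc);    # assumption: Perl 5.22 or later

# Standard normal CDF and its inverse (bisection); stand-ins for
# Math::Cephes ndtr and ndtri so the sketch needs no CPAN modules.
sub phi { return 0.5 * erfc(-$_[0] / sqrt(2)) }

sub inv_phi {
    my ($p) = @_;
    die "inv_phi domain error\n" if $p <= 0 || $p >= 1;
    my ($lo, $hi) = (-10, 10);
    for (1 .. 100) {
        my $mid = ($lo + $hi) / 2;
        if (phi($mid) < $p) { $lo = $mid } else { $hi = $mid }
    }
    return ($lo + $hi) / 2;
}

# d' = phi^-1(hr) - phi^-1(far)
sub dprime { my ($hr, $far) = @_; return inv_phi($hr) - inv_phi($far) }

# Ad' = phi(d' / sqrt(2)), per Stanislaw & Todorov (1999)
sub adprime { return phi(dprime(@_) / sqrt(2)) }

# A', per Stanislaw & Todorov (1999)
sub aprime {
    my ($hr, $far) = @_;
    return 0.5 if $hr == $far;
    my $diff = $hr - $far;
    my $max  = $hr > $far ? $hr : $far;
    my $sign = $diff > 0 ? 1 : -1;
    return 0.5 + $sign * ($diff**2 + abs($diff)) / (4 * $max - 4 * $hr * $far);
}

# Smith (1982): natural logarithms throughout
sub d_smith {
    my ($hr, $n) = @_;
    my $ln = log($n - 1);
    return (0.86 - 0.085 * $ln) * log(($n - 1) * $hr / (1 - $hr)) if $n < 12;
    my $A = (-4 + sqrt(16 + 25 * $ln)) / 3;
    my $B = sqrt(($ln + 2) / ($ln + 1));
    return $A + $B * inv_phi($hr);
}

# Alexander (2006): base-10 logarithms in A_n
sub d_alexander {
    my ($hr, $n) = @_;
    my $l10 = log($n) / log(10);
    my $An  = 1 - 1 / (1.93 + 4.75 * $l10 + 0.63 * $l10**2);
    return (inv_phi($hr) - inv_phi(1 / $n)) / $An;
}

printf "d' = %.2f\n",   dprime(0.99, 0.68);       # 1.86
printf "Ad' = %.2f\n",  adprime(0.99, 0.68);      # 0.91
printf "A' = %.2f\n",   aprime(0.99, 0.68);       # 0.82
printf "Smith = %.2f\n", d_smith(0.866, 3);       # 2.05
printf "Alex. = %.2f\n", d_alexander(0.866, 3);   # 2.00
```

The inputs are the rates and forced-choice values from the SYNOPSIS, so the printed values can be compared directly with those listed there.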
bias
$b = $sdt->bias('likelihood|loglikelihood|decision|griers') # based on values of the measure variables already inited or otherwise set
$b = $sdt->bias('likelihood' => { signal_trials => integer}) # update any of the measure variables
Get one of the decision/response-bias measures, as indicated below, by the first argument string, optionally updating any of the measure variables and options with a subsequent hashref (as given by example for signal_trials, above).
With a yes response indicating that the decision variable exceeds the criterion, and a no response indicating that the decision variable is less than the criterion, the measures indicate if there is a bias toward the yes response, and so a liberal (or low) criterion, or a bias toward the no response, and so a conservative (or high) criterion.
The measures are as follows, accessed by giving the name (or at least its first two letters) as the first argument to bias.
- beta (or) likelihood_bias
-
Returns the beta measure of response bias, based on the ratio of the likelihood the decision variable obtains a certain value on signal trials, to the likelihood that it obtains the value on noise trials.
Values less than 1 indicate a bias toward the yes response, values greater than 1 indicate a bias toward the no response, and the value of 1 indicates no bias toward yes or no.
- log_likelihood_bias
-
Returns the natural logarithm of the likelihood bias, beta.
Is unbounded, with values less than 0 indicating a bias toward the yes response, values greater than 0 indicating a bias toward the no response, and a value of 0 indicating no response bias.
- c (or) decision_bias
-
Implements the c parametric measure of response bias. The value is unbounded: deviations from zero, measured in standard deviation units, indicate the position of the decision criterion with respect to the neutral point where the signal and noise distributions cross over, there is no response bias, and c = 0.
Values less than 0 indicate a bias toward the yes response; values greater than 0 indicate a bias toward the no response; and a value of 0 indicates no response bias.
- griers_bias
-
Implements Grier's B'' nonparametric measure of response bias.
Ranges from -1 to +1, with values less than 0 indicating a bias toward the yes response, values greater than 0 indicating a bias toward the no response, and a value of 0 indicating no response bias.
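The four bias measures can be sketched in plain Perl as below. The formulas for c, log(beta) and B'' are those given by Stanislaw & Todorov (1999); they are not spelled out in this document, so treating them as what the module computes is an assumption, although they do reproduce the SYNOPSIS values. The function names are hypothetical, and phi and inv_phi are stand-ins for Math::Cephes ndtr and ndtri (POSIX erfc requires Perl 5.22 or later).

```perl
use strict;
use warnings;
use POSIX qw(erfc);    # assumption: Perl 5.22 or later

# Stand-ins for Math::Cephes ndtr / ndtri (inverse found by bisection)
sub phi { return 0.5 * erfc(-$_[0] / sqrt(2)) }

sub inv_phi {
    my ($p) = @_;
    die "inv_phi domain error\n" if $p <= 0 || $p >= 1;
    my ($lo, $hi) = (-10, 10);
    for (1 .. 100) {
        my $mid = ($lo + $hi) / 2;
        if (phi($mid) < $p) { $lo = $mid } else { $hi = $mid }
    }
    return ($lo + $hi) / 2;
}

# c = -( z(hr) + z(far) ) / 2
sub bias_c {
    my ($hr, $far) = @_;
    return -(inv_phi($hr) + inv_phi($far)) / 2;
}

# log(beta) = ( z(far)^2 - z(hr)^2 ) / 2, and beta is its exponential
sub bias_logbeta {
    my ($hr, $far) = @_;
    return (inv_phi($far)**2 - inv_phi($hr)**2) / 2;
}

sub bias_beta { return exp(bias_logbeta(@_)) }

# Grier's B'': nonparametric, so no inverse phi is needed
sub bias_griers {
    my ($hr, $far) = @_;
    my $h_var = $hr * (1 - $hr);
    my $f_var = $far * (1 - $far);
    return 0 if $h_var + $f_var == 0;    # both rates at an extreme
    my $sign = $hr >= $far ? 1 : -1;
    return $sign * ($h_var - $f_var) / ($h_var + $f_var);
}

printf "beta    = %.2f\n", bias_beta(0.99, 0.68);       # 0.07
printf "logbeta = %.2f\n", bias_logbeta(0.99, 0.68);    # -2.60
printf "c       = %.2f\n", bias_c(0.99, 0.68);          # -1.40
printf "B''     = %.2f\n", bias_griers(0.99, 0.68);     # -0.91
```

All four values are negative or below 1 for the SYNOPSIS data, consistently indicating a bias toward the yes response.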
criterion
$sdt->criterion() # assume d' and c can be calculated from already inited param values
$sdt->criterion(d => float, c => float)
Aliases: dc2k, crit
Returns the value of the criterion for given values of sensitivity d' and bias c, viz.: k = d' / 2 + c.
dc2hr
$sdt->dc2hr() # assume d' and c can be calculated from already inited param values
$sdt->dc2hr(d => float, c => float)
Returns the hit-rate estimated from given values of sensitivity d' and bias c, viz.: hr = phi(d' / 2 - c).
dc2far
$sdt->dc2far() # assume d' and c can be calculated from already inited param values
$sdt->dc2far(d => float, c => float)
Returns the false-alarm-rate estimated from given values of sensitivity d' and bias c, viz.: far = phi(-d' / 2 - c).
dc2logbeta
$sdt->dc2logbeta() # assume d' and c can be calculated from already inited param values
$sdt->dc2logbeta(d => float, c => float)
Returns the log-likelihood (log beta) bias estimated from given values of sensitivity d' and bias c, viz.: log(beta) = d' * c.
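The four conversions from d' and c given above (criterion, dc2hr, dc2far and dc2logbeta) amount to the following, sketched in plain Perl. The function names are hypothetical, and phi is a stand-in for Math::Cephes ndtr built on POSIX erfc (which assumes Perl 5.22 or later). The inputs are the rounded d' and c values from the SYNOPSIS.

```perl
use strict;
use warnings;
use POSIX qw(erfc);    # assumption: Perl 5.22 or later

# Standard normal CDF; stand-in for Math::Cephes ndtr
sub phi { return 0.5 * erfc(-$_[0] / sqrt(2)) }

sub k_from_dc       { my ($d, $c) = @_; return $d / 2 + $c }          # k = d'/2 + c
sub hr_from_dc      { my ($d, $c) = @_; return phi($d / 2 - $c) }     # hr = phi(d'/2 - c)
sub far_from_dc     { my ($d, $c) = @_; return phi(-$d / 2 - $c) }    # far = phi(-d'/2 - c)
sub logbeta_from_dc { my ($d, $c) = @_; return $d * $c }              # log(beta) = d' * c

my ($d, $c) = (1.86, -1.40);
printf "k       = %.2f\n", k_from_dc($d, $c);          # -0.47
printf "hr      = %.2f\n", hr_from_dc($d, $c);         # 0.99
printf "far     = %.2f\n", far_from_dc($d, $c);        # 0.68
printf "logbeta = %.2f\n", logbeta_from_dc($d, $c);    # -2.60
```

The recovered rates match the hit-rate and false-alarm-rate from which d' and c were computed in the SYNOPSIS.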
REFERENCES
Alexander, J. R. M. (2006). An approximation to d' for n-alternative forced choice. From http://eprints.utas.edu.au/475/.
Lee, M. D. (2008). BayesSDT: Software for Bayesian inference with signal detection theory. Behavior Research Methods, 40, 450-456.
Smith, J. E. K. (1982). Simple algorithms for M-alternative forced-choice calculations. Perception and Psychophysics, 31, 95-96.
Stanislaw, H., & Todorov, N. (1999). Calculation of signal detection theory measures. Behavior Research Methods, Instruments, and Computers, 31, 137-149.
Swets, J. A. (1964). Signal detection and recognition by human observers. New York, NY, US: Wiley.
SEE ALSO
Math::Cephes : The present module imports/depends upon the ndtr (phi) and ndtri (inverse phi) functions from this package.
Statistics::ROC : Receiver-operator characteristic curves.
LIMITATIONS/TODO
Expects descriptive counts, not raw observations or confidence ratings; this limits the measures that can be implemented. The methods load and unload are reserved to implement handling of data lists.
Perl's params modules do not seem to effect the required validation of parameters needed for each measure; the present work-around is obsessive-compulsive, while not exhaustive of all wayward possibilities, and requires optimisation and extension. It is presently quite possible to suffer an inelegant death should anything too unusual, or impoverished of details, be attempted in the life of the module.
REVISION HISTORY
See Changes file in installation dist.
AUTHOR/LICENSE
- Copyright (c) 2006-2009 Roderick Garton
-
rgarton AT cpan DOT org
This program is free software. It may be used, redistributed and/or modified under the same terms as Perl-5.6.1 (or later) (see http://www.perl.com/perl/misc/Artistic.html).
- Disclaimer
-
To the maximum extent permitted by applicable law, the author of this module disclaims all warranties, either express or implied, including but not limited to implied warranties of merchantability and fitness for a particular purpose, with regard to the software and the accompanying documentation.