cdflib/doc/README - metacpan.org












                                    DCDFLIB

               Library of C Routines for Cumulative Distribution
                 Functions, Inverses, and Other Parameters

                                  Version 1.1

                                (November, 1997)






                    Summary Documentation of Each Routine








                            Compiled and Written by:

                                 Barry W. Brown
                                  James Lovato
                                  Kathy Russell







                     Department of Biomathematics, Box 237
                     The University of Texas, M.D. Anderson Cancer Center
                     1515 Holcombe Boulevard
                     Houston, TX      77030


 This work was supported by grant CA-16672 from the National Cancer Institute.


                          SUMMARY OF DCDFLIB

This  library  contains routines  to compute  cumulative  distribution
functions, inverses, and    parameters  of the  distribution  for  the
following set of statistical distributions:

    (1) Beta
    (2) Binomial
    (3) Chi-square
    (4) Noncentral Chi-square
    (5) F
    (6) Noncentral F
    (7) Gamma
    (8) Negative Binomial
    (9) Normal
    (10) Poisson
    (11) Student's t
    (12) Noncentral t

Given values of all but one parameter of a distribution, the other is
computed. These calculations are done with C pointers to Doubles.

          -------------------- WARNINGS --------------------

The F and  Noncentral F distribution are  not necessarily monotone  in
either degree  of  freedom argument.  Consequently,  there  may be two
degree of freedom arguments that satisfy the specified condition.  An
arbitrary one of these will be found by the cdf routines.

The  amount of computation  required for  the noncentral chisquare and
noncentral F  distribution    is proportional  to  the  value  of  the
noncentrality   parameter.  Very large values  of   this parameter can
require  immense   numbers of   computation.  Consequently,  when  the
noncentrality parameter is to  be calculated, the upper limit searched
is 10,000.  For the noncentral t, the computation time is proportional
to the noncentrality parameter so the upper limit searched is 10000.

        -------------------- END WARNINGS --------------------


                 COMMENTS ON THE C VERSION OF DCDFLIB

The C version was obtained by converting the original  Fortran DCDFLIB
to C using PROMULA.FORTRAN  and performing  some hand  crafting of the
result.  Information on PROMULA.FORTRAN can be obtained from

                   PROMULA Development Corporation
                    3620 N. High Street, Suite 301
                         Columbus, Ohio 43214
                            (614) 263-5454

DCDFLIB.C was tested  using the xlc  compiler under AIX  3.1 on an IBM
RS/6000.  The code  was  also examined  with lint  on the same system.
DCDFLIB  was also successfully tested run using the gcc compiler  (see
below) on a Solbourne.

DCDFLIB.C  can  be obtained by anonymous  ftp  to odin.mda.uth.tmc.edu
(129.106.3.17) where it is available as
                        /pub/unix/dcdflib.c.tar.Z

The Fortran version of DCDFLIB is available as
                        /pub/unix/dcdflib.f.tar.Z
on the same machine.
^L




                                 CAVEAT

DCDFLIB.C is written in ANSI C and makes heavy use  of prototypes.  It
will not compile under old style (KR) C compilers (such as the default
Sun cc compiler).

I don't  recommend conversion to an  obsolete C dialect.  Instead, get
the  Free  Software Foundation's  excellent  ANSI C compiler,  gcc. It
compiles KR C as well as  ANSI C. A  version of gcc that  runs on many
varieties of Unix is available by anonymous ftp as
                        /pub/gnu/gcc-1.40.tar.Z
at prep.ai.mit.edu  (18.71.0.38).   A Vax version  is also  present on
/pub/gnu.  The compilers  are also available  on tape.  Write the Free
Software Foundation at:

                    Free Software Foundation, Inc.
                       675 Massachusetts Avenue
                         Cambridge, MA  02139
                        Phone: (617) 876-3296

A MSDOS port of gcc, performed by DJ Delorie is also available by ftp.

File location:

    host:      grape.ecs.clarkson.edu
    login:     ftp
    password:  send your e-mail address
    directory: ~ftp/pub/msdos/djgcc

File in .ZIP format - djgpp.zip - one 2.2M file, contains everything.

A version of DCDFLIB which compiles under old style C can be obtained
by anonymous ftp to odin.mda.uth.tmc.edu  (129.106.3.17)  where it is
available as
                        /pub/unix/dcdflib.kr.c.tar.Z


                            DOCUMENTATION

This  file  contains an  overview  of the library   and is the primary
documentation.

Other documentation  is  in  directory 'doc'  on  the  distribution as
character  (ASCII) files.  A summary  of all of the available routines
is contained in dcdflib.chs (chs is an abbreviation of 'cheat sheet').
The  'chs'  file will  probably  be the  primary reference.  The file,
dcdflib.fdoc,  contains  the comments for  each  routine intended  for
direct use.  The file, dcdflib.h, contains prototypes for each routine
intended for direct use.

                             INSTALLATION

Directory src contains the C source.  The files ipmpar.c and dcdflib.c
constitute DCDFLIB.  The file cdflib.h is included in dcdflib.c.

A  few  routines use   machine  dependent  constants.  Lists  of  such
constants for different machines are found in ipmpar.c.  Uncomment the
ones  appropriate to your  machine.  The distributed  version uses the
IEEE arithmetic that is used by  the IBM PC,  Macintosh, and most Unix
workstations.  If you need to change the distribution version you must
comment out the definitions for  IEEE arithmetic as  well as uncomment
the ones appropriate to your machine.

NOTE: dcdflib should be linked to the C math library.

NOTE: Ignore compiler warnings of the type "statement not reached".

                               SOURCES

The following   routines, written  by   others, are  incorporated into
DCDFLIB.

                          Beta Distribution

DiDinato, A.  R. and Morris, A.  H.   Algorithm 708: Significant Digit
Computation of the Incomplete Beta  Function Ratios.  ACM Trans. Math.
Softw. 18 (1993), 360-373.

                 Gamma Distribution and It's Inverse

DiDinato, A. R. and Morris, A.  H. Computation of the Incomplete Gamma
Function  Ratios and  their  Inverse.   ACM  Trans.  Math.   Softw. 12
(1986), 377-393.

                         Normal Distribution

Kennedy and  Gentle, Statistical Computing,  Marcel  Dekker, NY, 1980.
The rational function approximations  from pages 90-95 are used during
the calculation of the inverse normal.

Cody, W.D.  (1993).  "ALGORITHM  715:  SPECFUN  -  A Portabel  FORTRAN
Package   of  Special  Function   Routines   and  Test  Drivers",  acm
Transactions on Mathematical Software. 19, 22-32.  A slightly modified
version of Cody's function  anorm  is used for the cumultive normal.

                             Zero Finder

J.   C. P.   Bus and  T.  J.  Dekker.   Two Efficient  Algorithms with
Guaranteed Convergence  for Finding a  Zero of a Function.  ACM Trans.
Math. Softw. 4 (1975), 330.

We transliterated Algoritm R of this paper from Algol to Fortran.

                          General Reference

Abramowitz,  M. and Stegun,  I. A.  Handbook of Mathematical Functions
With  Formulas, Graphs,  and   Mathematical Tables.   (1964)  National
Bureau of Standards.

This book has been reprinted by Dover and others.


                              LEGALITIES

Code that appeared  in an    ACM  publication  is subject  to    their
algorithms policy:

     Submittal of  an  algorithm    for publication  in   one of   the  ACM
     Transactions implies that unrestricted use  of the algorithm within  a
     computer is permissible.   General permission  to copy and  distribute
     the algorithm without fee is granted provided that the copies  are not
     made  or   distributed for  direct   commercial  advantage.    The ACM
     copyright notice and the title of the publication and its date appear,
     and  notice is given that copying  is by permission of the Association
     for Computing Machinery.  To copy otherwise, or to republish, requires
     a fee and/or specific permission.

     Krogh, F.  Algorithms  Policy.  ACM  Tran.   Math.  Softw.   13(1987),
     183-186.

We place the DCDFLIB code that we have written in the public domain.  

                                 NO WARRANTY
     
     WE PROVIDE ABSOLUTELY  NO WARRANTY  OF ANY  KIND  EITHER  EXPRESSED OR
     IMPLIED,  INCLUDING BUT   NOT LIMITED TO,  THE  IMPLIED  WARRANTIES OF
     MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.  THE ENTIRE RISK
     AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS  WITH YOU.  SHOULD
     THIS PROGRAM PROVE  DEFECTIVE, YOU ASSUME  THE COST  OF  ALL NECESSARY
     SERVICING, REPAIR OR CORRECTION.
     
     IN NO  EVENT  SHALL THE UNIVERSITY  OF TEXAS OR  ANY  OF ITS COMPONENT
     INSTITUTIONS INCLUDING M. D.   ANDERSON HOSPITAL BE LIABLE  TO YOU FOR
     DAMAGES, INCLUDING ANY  LOST PROFITS, LOST MONIES,   OR OTHER SPECIAL,
     INCIDENTAL   OR  CONSEQUENTIAL DAMAGES   ARISING   OUT  OF  THE USE OR
     INABILITY TO USE (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA OR
     ITS ANALYSIS BEING  RENDERED INACCURATE OR  LOSSES SUSTAINED  BY THIRD
     PARTIES) THE PROGRAM.
     
     (Above NO WARRANTY modified from the GNU NO WARRANTY statement.)

                    HOW TO USE THE ROUTINES

The calling sequence for each routine is of the form:

   void cdf<name>(int *which,double *p,double *q,double *x,
               double *<parameters>,int *status,double *bound)

WHICH   and STATUS  are  pointers to  int  ,  all other arguments  are
pointers to double.

<name> is a one to  three character name identifying the distribution.
which  is an input integer value  that identifies what parameter value
is to be calculated from the values of the other parameters.

P is always the cdf evaluated at X, Q is always the compliment of the
cdf evaluated at X, i.e.  1-P, and X is always the value at which the
cdf  is evaluated.   The auxiliary parameters,  <parameters>,  of the
distribution differ by distribution.

If WHICH is 1, P  and Q are to be calculated, i.e., the cdf; if WHICH
is 2, X is to be calculated, i.e., the inverse cdf.  The value of one
auxiliary parameter in <parameters> can also be the value calculated.

STATUS returns 0 if the calculation completes correctly.

           --------------------WARNING--------------------

If STATUS is not 0, no meaningful answer is returned.

        -------------------- END WARNING --------------------

STATUS returns  -I if the I'th  input parameter was  not  in the legal
range (see below).  Parameters are counted  with which being the first
in these return values.

A STATUS  value of 1 indicates that  the desired answer was apparently
lower than the lower bound on the search interval.  A return code of 2
indicates that  the answer was  apparently higher than the upper bound
on the search interval.  A return code of 3 indicates that P and Q did
not sum to 1. Other positive codes are routine specific.

BOUND is not  set if status is returned  as 0.  If  STATUS is -I  then
BOUND is   the bound illegally  exceeded by  input  parameter I, where
WHICH  is  counted as 1,  P as 2,  Q as 3,  X as 4, etc.  If STATUS is 
returned as 1 or 2 then bound  is returned as the lower or upper bound
on the search interval respectively.


                                BOUNDS

Below are  the rules that we used  in determining bounds on quantities
to be  calculated.   Those who don't care   can find a summary  of the
bounds in  dcdflib.chs.   Input bounds  are  checked for  legality  of
input.  The search  range  is  the range   of values searched  for  an
answer.

                             Input Bounds

Bounds on input parameters are  checked by the  cdf* routines.   These
bounds were set according to the following rules.

P: If the  domain of the cdf (X) extends to  -infinity  then P must be
greater than 0 otherwise P must be greater than or equal to 0.  P must
always be less than or equal to 1.

Q: If the  domain of the cdf (X) extends to  +infinity  then Q must be
greater than 0 otherwise Q must be greater than or equal to 0.  Q must
always be less than or equal to 1.

Further, P and Q must sum to 1. The smaller of the two P and Q will be
used in calculations to increase accuracy

X:  If  the  domain is infinite  in   either the positive  or negative
direction, no check  is performed in  that direction.  If the left end
of the domain is 0, then X is checked to assure non-negativity.

DF, SD, etc.:  Some auxiliary parameters must  be positive. The lowest
input values accepted for these parameters is 1E-100.


                                Search Bounds

These are the  ranges searched for an  answer.   If the domain  of the
parameter in the cdf  is closed at  some  finite value, e.g., 0,  then
this value is the same endpoint of the search range.  If the domain is
open  at  some finite   endpoint (which only  occurs   for  0 --  some
parameters must be strictly positive) then  the endpoint is 1E-100. If
the  domain is infinite in either  direction then +/- 1E100 is used as
the endpoint of the search range.

                        HOW THE ROUTINES WORK

The cumulative  distribution   functions are computed  directly.   The
normal, gamma,  and  beta functions use the  code  from the references
cited.  Other  cdfs are calculated  by relating them  to one  of these
distributions.  For example, the  binomial and negative binomial  cdfs
can be converted  to a beta cdf.   This is how fractional observations
are handled.  The  formula from Abramowitz  and Stegun  for converting
the cdfs is cited  in the fdoc file.    (We think the formula  for the
negative binomial in A&S is wrong, but there is a correct one which we
used.)

The inverse normal and gamma are also taken  from the references.  For
all other parameters, a search is made for the value that provides the
desired P.  Initial  values are chosen crudely  for the search  (e.g.,
5).  If the domain  of the cdf for the  parameter being calculated  is
infinite, a step doubling strategy is  used to bound the desired value
then the  zero  finder is  employed  to refine the answer.    The zero
finder attempts to obtain the answer accurately to about eight decimal
places.
	Global
`s`	Focus search bar
`?`	Bring up this help dialog
	GitHub
`g` `p`	Go to pull requests
`g` `i`	go to github issues (only if github is preferred repository)
	POD
`g` `a`	Go to author
`g` `c`	Go to changes
`g` `i`	Go to issues
`g` `d`	Go to dist
`g` `r`	Go to repository/SCM
`g` `s`	Go to source
`g` `b`	Go to file browse
	Search terms
module: (e.g. module:Plugin)
distribution: (e.g. distribution:Dancer auth)
author: (e.g. author:SONGMU Redis)
version: (e.g. version:1.00)