NAME
dbcolpercentile - compute percentiles or ranks for an existing column
SYNOPSIS
dbcolpercentile [-rplhS] column
DESCRIPTION
Compute a percentile or a rank for each value in a column of numbers. The result is added as a new column called percentile or rank. Non-numeric records are handled as in other Fsdb programs: they are ignored unless -a is given.
If the data is pre-sorted and only a rank is requested, no extra storage is required. In all other cases, a full copy of the data is buffered on disk.
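To illustrate why a rank over pre-sorted data needs no extra storage, here is a minimal Python sketch. It is not the program's actual implementation: the function name ranks_presorted and its check parameter are hypothetical, and it assumes input sorted high to low with tied values sharing the rank of the first tied row, as the sample output in SAMPLE USAGE suggests.

```python
from typing import Iterable, Iterator, Tuple


def ranks_presorted(values: Iterable[float], check: bool = True) -> Iterator[Tuple[float, int]]:
    """Yield (value, rank) for values already sorted high to low.

    Only the previous value and the current rank are kept, so memory
    use does not grow with the input. Tied values share the rank of
    the first row in the tie (an assumption taken from the sample output).
    """
    prev = None
    rank = 0
    for row, value in enumerate(values, start=1):
        if check and prev is not None and value > prev:
            # Hypothetical analogue of the check done with a single -S.
            raise ValueError(f"input not sorted at row {row}")
        if prev is None or value != prev:
            rank = row  # a new value takes its row position as its rank
        prev = value
        yield value, rank


if __name__ == "__main__":
    for value, rank in ranks_presorted([90, 90, 80, 70, 70, 65]):
        print(value, rank)
```

A percentile, by contrast, presumably depends on the total number of rows, which a streaming pass does not know until the input ends. That would be consistent with the buffering described above.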
OPTIONS
- -p or --percentile
-
Show percentile (default).
- -P or --rank or --nopercentile
-
Compute ranks instead of percentiles.
- --fraction
-
Show fraction (like a percentage, but between 0 and 1; not a cumulative fraction).
- -a or --include-non-numeric
-
Compute stats over all records (treat non-numeric records as zero rather than just ignoring them).
- -S or --pre-sorted
-
Assume data is already sorted. With one -S, we check and confirm this precondition. When repeated, we skip the check.
- -f FORMAT or --format FORMAT
-
Specify a printf(3)-style format for output statistics. Defaults to %.5g.
- -T TmpDir
-
Where to put temporary files. If -T is not specified, the environment variable TMPDIR is used. Default is /tmp.
Sort specification options (can be interspersed with column names):
- -r or --descending
-
sort in reverse order (high to low)
- -R or --ascending
-
sort in normal order (low to high)
- -n or --numeric
-
sort numerically (default)
- -N or --lexical
-
sort lexicographically
This module also supports the standard fsdb options:
- -d
-
Enable debugging output.
- -i or --input InputSource
-
Read from InputSource, typically a file name, or - for standard input, or (if in Perl) an IO::Handle, Fsdb::IO or Fsdb::BoundedQueue object.
- -o or --output OutputDestination
-
Write to OutputDestination, typically a file name, or - for standard output, or (if in Perl) an IO::Handle, Fsdb::IO or Fsdb::BoundedQueue object.
- --autorun or --noautorun
-
By default, programs process automatically, but Fsdb::Filter objects in Perl do not run until you invoke the run() method. The --(no)autorun option controls that behavior within Perl.
- --help
-
Show help.
- --man
-
Show full manual.
SAMPLE USAGE
Input:
#fsdb name id test1
a 1 80
b 2 70
c 3 65
d 4 90
e 5 70
f 6 90
Command:
cat DATA/grades.fsdb | dbcolpercentile test1
Output:
#fsdb name id test1 percentile
d 4 90 1
f 6 90 1
a 1 80 0.66667
b 2 70 0.5
e 5 70 0.5
c 3 65 0.16667
# | dbsort -n test1
# | dbcolpercentile test1
Command 2:
cat DATA/grades.fsdb | dbcolpercentile --rank test1
Output 2:
#fsdb name id test1 rank
d 4 90 1
f 6 90 1
a 1 80 3
b 2 70 4
e 5 70 4
c 3 65 6
# | dbsort -n test1
# | dbcolpercentile --rank test1
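The two outputs above are consistent with a simple relationship between rank and percentile. The Python sketch below reproduces them under assumptions inferred only from the sample, not from the program's source: rows are ordered high to low, tied values share the lowest rank in the tie, and percentile = (n - rank + 1) / n, where n is the number of rows. The function name rank_and_percentile is hypothetical. The default format string mirrors the documented %.5g default.

```python
def rank_and_percentile(values, fmt="%.5g"):
    """Return (value, rank, percentile_text) tuples, highest value first.

    Assumed formula (inferred from the sample output, not from the
    program's source): percentile = (n - rank + 1) / n.
    """
    ordered = sorted(values, reverse=True)
    n = len(ordered)
    out = []
    rank = 0
    for pos, value in enumerate(ordered, start=1):
        # A value equal to its predecessor keeps the predecessor's rank.
        if pos == 1 or value != ordered[pos - 2]:
            rank = pos
        percentile = (n - rank + 1) / n
        out.append((value, rank, fmt % percentile))
    return out


if __name__ == "__main__":
    # The test1 column from the sample input.
    for value, rank, pct in rank_and_percentile([80, 70, 65, 90, 70, 90]):
        print(value, rank, pct)
```

With the sample's test1 values this yields ranks 1, 1, 3, 4, 4, 6 and percentiles 1, 1, 0.66667, 0.5, 0.5, 0.16667, matching Output and Output 2.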
SEE ALSO
CLASS FUNCTIONS
new
$filter = new Fsdb::Filter::dbcolpercentile(@arguments);
Create a new dbcolpercentile object, taking command-line arguments.
set_defaults
$filter->set_defaults();
Internal: set up defaults.
parse_options
$filter->parse_options(@ARGV);
Internal: parse command-line arguments.
setup
$filter->setup();
Internal: setup, parse headers.
_count_rows
$n = $self->_count_rows()
Interpose a filter on $self->{_in} that counts the rows.
run
$filter->run();
Internal: run over each row.
AUTHOR and COPYRIGHT
Copyright (C) 1991-2008 by John Heidemann <johnh@isi.edu>
This program is distributed under the terms of the GNU General Public License, version 2. See the file COPYING with the distribution for details.