NAME
graph_histogram.pl
A script to graph a histogram (bar or line) of one or more datasets.
SYNOPSIS
graph_histogram.pl --bins <integer> --size <number> <filename>
graph_histogram.pl --bins <integer> --max <number> <filename>
graph_histogram.pl --size <number> --max <number> <filename>
Options:
--in <filename>
--index <column_index>
--bins <integer>
--size <number>
--min <number>
--max <number>
--ymax <integer>
--yticks <integer>
--skip <integer>
--offset <integer>
--format <integer>
--lines
--out <base_filename>
--dir <output_directory>
--version
--help
OPTIONS
The command line flags and descriptions:
- --in <filename>
-
Specify an input file containing either a list of database features or genomic coordinates for which to collect data. The file should be a tab-delimited text file, one row per feature, with columns representing feature identifiers, attributes, coordinates, and/or data values. The first row should be column headers. Text files generated by other BioToolBox scripts are acceptable. Files may be gzip compressed.
- --index <column_index>
-
Specify the column number(s) corresponding to the dataset(s) in the file to graph. Numbers are 0-based indices. Separate multiple datasets with commas. A range of indices may also be specified using a dash between the beginning and end of the inclusive range. Two datasets may also be graphed together; join these indices with an ampersand. For example, "2,4-6,5&6" will individually graph datasets 2, 4, 5, and 6, and also produce a combined graph of datasets 5 and 6.
If no dataset indices are specified, then they may be chosen interactively from a list.
- --bins <integer>
-
Specify the number of bins or partitions into which the data will be grouped. This argument is optional if --max and --size are provided.
- --size <number>
-
Specify the size of each bin or partition. A decimal number may be provided. This argument is optional if --bins and --max are provided.
- --min <number>
-
Optionally indicate the minimum value of the bins. When generating the list of bins, this is used as the starting value. Default is 0. A negative number may be provided using the format --min=-1.
- --max <number>
-
Specify the maximum bin value. This argument is optional if --bins and --size are provided.
- --ymax <integer>
-
Specify the maximum Y axis value. The default is automatically determined.
- --yticks <integer>
-
Specify explicitly the number of major ticks for the Y axis. The default is 4.
- --skip <integer>
-
Specify the interval at which X axis major ticks are labeled, which avoids overlapping labels. The default is 4 (every 4th tick is labeled).
- --offset <integer>
-
Specify the number of X axis ticks to skip at the beginning before starting to label them. This may help in adjusting the look of the graph. The default is 0.
- --format <integer>
-
Specify the number of decimal places to which the X axis labels should be formatted. The default is the number of decimal places in the bin size parameter.
- --lines
-
Optionally specify a line graph to be generated instead of the default vertical bar graph.
- --out <base_filename>
-
Optionally specify the output filename prefix. The default value is "distribution_".
- --dir <output_directory>
-
Optionally specify the name of the target directory to place the graphs. The default value is the basename of the input file appended with "_graphs".
- --version
-
Print the version number.
- --help
-
Print this help documentation.
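The --index syntax described above (commas, dash ranges, and ampersand pairs) can be illustrated with a short sketch. This is not the script's own Perl code; the function name parse_index_spec and the tuple representation are hypothetical, chosen only to show how a string such as "2,4-6,5&6" could be expanded into individual and combined graph requests.

```python
def parse_index_spec(spec):
    """Expand an index string into a list of graph requests.

    Hypothetical helper for illustration only. Each request is a tuple of
    0-based column indices: one index for a single-dataset graph, two
    indices for a combined graph.
    """
    requests = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        if "&" in item:
            # Two datasets joined with an ampersand are graphed together.
            pair = tuple(int(part) for part in item.split("&"))
            if len(pair) != 2:
                raise ValueError(f"expected exactly two indices in {item!r}")
            requests.append(pair)
        elif "-" in item:
            # A dash marks an inclusive range of individual datasets.
            start, end = (int(part) for part in item.split("-"))
            if end < start:
                raise ValueError(f"range {item!r} runs backwards")
            requests.extend((i,) for i in range(start, end + 1))
        else:
            requests.append((int(item),))
    return requests


print(parse_index_spec("2,4-6,5&6"))
# prints [(2,), (4,), (5,), (6,), (5, 6)]
```

The result matches the example in the --index description: four individual graphs plus one combined graph of datasets 5 and 6.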
DESCRIPTION
This program will generate PNG graphic files representing the histogram of the values in one or two datasets. Any two of the bin size, the number of bins, and the maximum bin value must be provided, as shown in the SYNOPSIS. The resulting files are written to a subdirectory named after the input file. The files are named after the dataset name (column header) with a prefix.
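The OPTIONS entries for --bins, --size, and --max each state that the argument is optional when the other two are given, which implies that any two of the three determine the third. The sketch below shows one way that relationship could work. It assumes that max = min + bins * size and that bins start at --min (default 0). The function name resolve_bins and the rounding choice are hypothetical, and the script's actual arithmetic may differ.

```python
def resolve_bins(bins=None, size=None, maximum=None, minimum=0.0):
    """Fill in whichever of bins, size, or maximum was not supplied.

    Hypothetical helper for illustration only; assumes
    maximum = minimum + bins * size.
    """
    given = sum(value is not None for value in (bins, size, maximum))
    if given < 2:
        raise ValueError("provide at least two of bins, size, and maximum")
    if bins is None:
        # Assumption: round to the nearest whole number of bins.
        bins = round((maximum - minimum) / size)
    elif size is None:
        size = (maximum - minimum) / bins
    elif maximum is None:
        maximum = minimum + bins * size
    return bins, size, maximum


def bin_edges(bins, size, minimum=0.0):
    """Return the bins + 1 boundary values, starting at minimum."""
    return [minimum + i * size for i in range(bins + 1)]


bins, size, maximum = resolve_bins(bins=10, size=0.5)
print(bins, size, maximum)
print(bin_edges(bins, size))
```

Under these assumptions, the three SYNOPSIS forms correspond to calling resolve_bins with a different pair of arguments each time.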
AUTHOR
Timothy J. Parnell, PhD
Howard Hughes Medical Institute
Dept of Oncological Sciences
Huntsman Cancer Institute
University of Utah
Salt Lake City, UT, 84112
This package is free software; you can redistribute it and/or modify it under the terms of the GPL (either version 1, or at your option, any later version) or the Artistic License 2.0.