NAME
get_bam_seq_stats.pl
A script to report the alignment sequence nucleotide frequencies.
SYNOPSIS
get_bam_seq_stats.pl <file.bam>
Options:
--in <file.bam>
--out <filename>
--version
--help
OPTIONS
The command line flags and descriptions:
- --in <file.bam>
-
Specify the file name of a binary Bam file of alignments as described for Samtools. It will be automatically indexed if necessary.
- --out <filename>
-
Optionally specify the base name of the output file. The default is to use input base name. The output file names are appended with '.seq_stats.txt'.
- --version
-
Print the version number.
- --help
-
Display the POD documentation
DESCRIPTION
This program will generate some statistics about the alignment sequences associated with a Bam file. This is using the the query sequence reported in the Bam file, not the genomic sequence or alignment. Only aligned sequences are analyzed.
The number and fraction of total for each length of the query sequences are reported. Additionally, the nucleotide composition for each position in the query sequences are also reported in a table, which should be suitable for generating a sequence logo, if desired.
The input file must be a BAM file as described by the Samtools project (http://samtools.sourceforge.net).
AUTHOR
Timothy J. Parnell, PhD
Dept of Oncological Sciences
Huntsman Cancer Institute
University of Utah
Salt Lake City, UT, 84112
This package is free software; you can redistribute it and/or modify it under the terms of the GPL (either version 1, or at your option, any later version) or the Artistic License 2.0.