NAME

Tutorial_pipeline00.pl - Inferring quality parameters from a BAM file.

SYNOPSIS

Tutorial_pipeline00.pl

DESCRIPTION

Tutorial on how to use Bio::ViennaNGS::BamStat and Bio::ViennaNGS::BamStatSummary to retrieve quality and quantity statistics from an input BAM file.

Procedure

Include libraries

use Data::Dumper;
use Bio::ViennaNGS::BamStat;
use Bio::ViennaNGS::BamStatSummary;
  • Data::Dumper

    For easy inspection of complex data structures.

  • Bio::ViennaNGS::BamStat

    Extracts quality and quantity parameters from an input BAM file.

  • Bio::ViennaNGS::BamStatSummary

    Summarizes, compares and plots data compiled by Bio::ViennaNGS::BamStat.
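Since the Bio::ViennaNGS modules have to be installed separately, it can help to verify that all required libraries can be loaded before running the pipeline. The following is a minimal sketch using core Perl only; the helper missing_modules is hypothetical and not part of Bio::ViennaNGS.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::ViennaNGS): returns the names of
# all given modules that cannot be loaded on this system.
sub missing_modules {
    my @missing;
    for my $module (@_) {
        # Translate Some::Module into Some/Module.pm as expected by require.
        (my $file = "$module.pm") =~ s{::}{/}g;
        push @missing, $module unless eval { require $file; 1 };
    }
    return @missing;
}

my @required = qw(
    Data::Dumper
    Bio::ViennaNGS::BamStat
    Bio::ViennaNGS::BamStatSummary
);

print "Missing module: $_\n" for missing_modules(@required);
```

If nothing is printed, all three libraries are available.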

Define control variables

@bams     = qw# C1R1.bam #;
$odir     = '.';
$rlibpath = '/usr/bin/R';

$edit_control     = 1;
$segemehl_control = 1;
  • @bams

    Array of all BAM files to be processed, including their paths. For this tutorial, please retrieve the file C1R1.bam from xxxxxx and store it in the current working directory. Other inputs will not be accepted.

  • $odir

    Path to the directory where the output files will be created. Existing files with the same names in this directory will be overwritten.

  • $rlibpath

    Path to the installation of R.

  • $edit_control

    Control flag. Set to 1 if statistics on the edit distance of each read should be reported; otherwise set to 0.

  • $segemehl_control

    Control flag. Set to 1 if the input BAM file was produced by the short read mapper segemehl, so that segemehl-specific BAM dialect issues are taken care of; otherwise set to 0.
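Before any object is created, the control variables can be checked for obvious problems such as a missing BAM file or a wrong path to R. The following is a minimal sketch using Perl's file test operators; the helper check_inputs is hypothetical and not part of Bio::ViennaNGS.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Bio::ViennaNGS): returns a list of
# human-readable problems found in the control variables.
sub check_inputs {
    my ($bams, $odir, $rlibpath) = @_;
    my @problems;
    for my $bam (@$bams) {
        # Each BAM file must be an existing, readable regular file.
        push @problems, "BAM file not readable: $bam"
            unless -f $bam && -r _;
    }
    # The output directory must exist and be writable.
    push @problems, "Output directory not writable: $odir"
        unless -d $odir && -w _;
    # The R path must point to an executable regular file.
    push @problems, "R not executable at: $rlibpath"
        unless -f $rlibpath && -x _;
    return @problems;
}

my @bams     = qw# C1R1.bam #;
my $odir     = '.';
my $rlibpath = '/usr/bin/R';

print "$_\n" for check_inputs(\@bams, $odir, $rlibpath);
```

An empty result means that all files and directories were found with the required permissions. It does not verify that the file is a valid BAM file.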

Create a new BamStatSummary object

$bamsummary = Bio::ViennaNGS::BamStatSummary->new(files        => \@bams,
                                                  outpath      => $odir,
                                                  rlib         => $rlibpath,
                                                  is_segemehl  => $segemehl_control,
                                                  control_edit => $edit_control,
                                                 );
  • Options

    Initializes a new BamStatSummary object representing the data from all segemehl BAM files in @bams. The output directory is set to $odir and, in addition to the standard read quantification, the edit distance of each read will be reported.

Read in the BAM files in @bams

Processes each BAM file in @bams and compiles the relevant data into $bamsummary.

$bamsummary->populate_data();

In doing so, the method new of Bio::ViennaNGS::BamStat is called for each BAM file in @bams, like this:

$bo = Bio::ViennaNGS::BamStat->new(bam => $bamfile);

Use print Dumper($bamsummary); to check the object.

Quantify data from $bamsummary

Compiles quantitative information for all reads stored in $bamsummary.

$bamsummary->populate_countStat();

Use print Dumper($bamsummary); to check the object.
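A populated object can produce a very long dump. As a minimal sketch, the output of Data::Dumper can be limited with $Data::Dumper::Maxdepth and made reproducible with $Data::Dumper::Sortkeys. The helper dump_shallow and the nested structure below are hypothetical; the structure is only a stand-in and does not reflect the real internal layout of a BamStatSummary object.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for a populated object; NOT the real internal
# layout of Bio::ViennaNGS::BamStatSummary.
my $object = {
    files => ['C1R1.bam'],
    data  => { 'C1R1.bam' => { uniq => { mapped => 1 } } },
};

# Hypothetical helper: dump a reference only down to the given depth.
# A depth of 0 means no limit.
sub dump_shallow {
    my ($ref, $depth) = @_;
    local $Data::Dumper::Maxdepth = $depth;  # deeper levels are abbreviated
    local $Data::Dumper::Sortkeys = 1;       # stable key order between runs
    return Dumper($ref);
}

print dump_shallow($object, 2);
```

The same call can be applied to $bamsummary, for example print dump_shallow($bamsummary, 2);, to get an overview of the top-level entries first.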

Produce output for read quantification.

Creates a file with the read quantification in $odir. The file format is CSV (*.csv).

$bamsummary->dump_countStat("csv");

Plot read quantification.

Creates a barplot of the read quantification in $odir. The file format is PDF (*.pdf).

$bamsummary->make_BarPlot();
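To check which CSV and PDF files are present in $odir after these steps, the directory can be listed by file extension. The following is a minimal sketch using core Perl only; the helper list_outputs is hypothetical, and because the names of the generated files are not given here, it matches on the extension alone.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not part of Bio::ViennaNGS): returns the sorted names
# of all regular files in $dir that end in one of the given extensions.
sub list_outputs {
    my ($dir, @exts) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    my $pattern = join '|', map { quotemeta($_) } @exts;
    my @files = sort
                grep { /\.(?:$pattern)$/i && -f File::Spec->catfile($dir, $_) }
                readdir $dh;
    closedir $dh;
    return @files;
}

my $odir = '.';
print "$_\n" for list_outputs($odir, qw(csv pdf));
```

Note that this also lists CSV and PDF files that were already in the directory before the pipeline was run.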

Plot edit distance distribution.

Create a boxplot of the distribution of edit distances for all reads and all samples in @bams.

$bamsummary->make_BoxPlot("data_edit") if($bamsummary->control_edit  && $bamsummary->has_control_edit);

AUTHOR

Fabian Amman <fabian@tbi.univie.ac.at>