NAME

Bio::Community::Tools::Rarefier - Normalize communities by count

SYNOPSIS

use Bio::Community::Tools::Rarefier;

# Normalize communities in a metacommunity by repeatedly taking 1,000 random members
my $rarefier = Bio::Community::Tools::Rarefier->new(
   -metacommunity => $meta,
   -sample_size   => 1000,
   -threshold     => 0.001, # stop bootstrap iterations when threshold is reached
);

# Rarefied results, with decimal counts
my $average_community = $rarefier->get_avg_meta->next_community;

# Round counts to integer numbers
my $representative_community = $rarefier->get_repr_meta->next_community;


# Alternatively, specify a number of repetitions
$rarefier = Bio::Community::Tools::Rarefier->new(
   -metacommunity   => $meta,
   -sample_size     => 1000,
   -num_repetitions => 100, # stop after this number of bootstrap iterations
);

# ... or assume an infinite number of repetitions
$rarefier = Bio::Community::Tools::Rarefier->new(
   -metacommunity   => $meta,
   -sample_size     => 1000,
   -num_repetitions => 'inf',
);

DESCRIPTION

This module takes a metacommunity and normalizes (rarefies) the communities it contains by their number of counts.

Comparisons of the composition and diversity of biological communities can be biased by sampling artefacts. When comparing two identical communities, one for which 10,000 counts were made and one with only 1,000 counts, the smaller community will appear less diverse. A solution is to repeatedly bootstrap the larger community by taking 1,000 random members from it.

This module uses Bio::Community::Sampler to take random members from communities and normalize them by their number of counts. After all random repetitions have been performed, average communities or representative communities are returned. These communities all have the same number of counts.
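
The repeated sampling described above can be sketched in plain Perl. This is only an illustration of the idea, not the module's implementation: it assumes that members are drawn with replacement in proportion to their counts, and the helper names (random_sample, rarefy_average) and member names are hypothetical.

```perl
use strict;
use warnings;

# Draw $size members at random, with replacement, proportionally to counts.
# %$counts maps a (hypothetical) member name to its count.
sub random_sample {
    my ($counts, $size) = @_;
    my @names = sort keys %$counts;
    my $total = 0;
    $total += $counts->{$_} for @names;
    my %sample = map { $_ => 0 } @names;
    for (1 .. $size) {
        my $pick = rand($total);            # uniform in [0, $total)
        my $name;
        for my $candidate (@names) {
            $name = $candidate;
            $pick -= $counts->{$candidate};
            last if $pick < 0;
        }
        $sample{$name}++;
    }
    return \%sample;
}

# Average the counts of $reps independent random samples
sub rarefy_average {
    my ($counts, $size, $reps) = @_;
    my %sum = map { $_ => 0 } keys %$counts;
    for (1 .. $reps) {
        my $sample = random_sample($counts, $size);
        $sum{$_} += $sample->{$_} for keys %$sample;
    }
    return { map { $_ => $sum{$_} / $reps } keys %sum };
}

srand(42);   # fixed seed so that runs are repeatable
my %large = (sp_a => 7000, sp_b => 2500, sp_c => 500);   # 10,000 counts
my $avg = rarefy_average(\%large, 1000, 50);
printf "%-5s %.2f\n", $_, $avg->{$_} for sort keys %$avg;
```

Every sample holds exactly 1,000 members, so the averaged counts are decimal numbers that still add up to 1,000.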

AUTHOR

Florent Angly <florent.angly@gmail.com>

SUPPORT AND BUGS

User feedback is an integral part of the evolution of this and other Bioperl modules. Please direct usage questions or support issues to the mailing list, bioperl-l@bioperl.org, rather than to the module maintainer directly. Many experienced and responsive experts will be able to look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible.

If you have found a bug, please report it on the BioPerl bug tracking system to help us keep track of the bugs and their resolution: https://redmine.open-bio.org/projects/bioperl/

COPYRIGHT

Copyright 2011-2014 by Florent Angly <florent.angly@gmail.com>

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.10.1 or, at your option, any later version of Perl 5 you may have available.

APPENDIX

The rest of the documentation details each of the object methods. Internal methods are usually preceded with an underscore (_).

new

Function: Create a new Bio::Community::Tools::Rarefier object
Usage   : my $rarefier = Bio::Community::Tools::Rarefier->new( );
Args    : -metacommunity  : see metacommunity()
          -num_repetitions: see num_repetitions()
          -sample_size    : see sample_size()
          -threshold      : see threshold()
          -seed           : see set_seed()
Returns : a new Bio::Community::Tools::Rarefier object

metacommunity

Function: Get or set the metacommunity to normalize.
Usage   : my $meta = $rarefier->metacommunity;
Args    : A Bio::Community::Meta object
Returns : A Bio::Community::Meta object

sample_size

Function: Get or set the sample size, i.e. the number of members to pick randomly
          at each iteration. It cannot exceed the total count of the smallest
          community or an error will be generated. If the sample size is
          omitted, it defaults to the get_members_count() of the smallest
          community.
Usage   : my $sample_size = $rarefier->sample_size;
Args    : integer for the sample size
Returns : integer for the sample size

threshold

Function: Get or set the threshold. While iterating, when the beta diversity or
          distance between the average community and the average community at
          the previous iteration decreases below this threshold, the
          bootstrapping is stopped. By default, the threshold is 1e-5. The
          num_repetitions() method provides an alternative way to specify when
          to stop the computation. After communities have been normalized using
          the num_repetitions() method instead of the threshold() method, the
          beta diversity between the average communities of the last two
          repetitions can be accessed using the threshold() method.
Usage   : my $threshold = $rarefier->threshold;
Args    : positive number for the threshold
Returns : positive number for the threshold
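
The stopping rule can be illustrated with a short, self-contained sketch. It is written under assumptions and is not the module's code: members are drawn with replacement, the distance between successive average communities is taken to be Euclidean, and the helper names (random_sample, distance, rarefy_until_stable) are hypothetical.

```perl
use strict;
use warnings;

# Draw $size members with replacement, proportionally to their counts
sub random_sample {
    my ($counts, $size) = @_;
    my @names = sort keys %$counts;
    my $total = 0;
    $total += $counts->{$_} for @names;
    my %sample = map { $_ => 0 } @names;
    for (1 .. $size) {
        my $pick = rand($total);
        my $name;
        for my $candidate (@names) {
            $name = $candidate;
            $pick -= $counts->{$candidate};
            last if $pick < 0;
        }
        $sample{$name}++;
    }
    return \%sample;
}

# Euclidean distance between two communities (an assumed distance measure)
sub distance {
    my ($x, $y) = @_;
    my $sum_sq = 0;
    $sum_sq += ($x->{$_} - $y->{$_}) ** 2 for keys %$x;
    return sqrt $sum_sq;
}

# Repeat the sampling until the running average moves by less than $threshold
sub rarefy_until_stable {
    my ($counts, $size, $threshold) = @_;
    my %sum = map { $_ => 0 } keys %$counts;
    my ($reps, $prev_avg, $dist) = (0, undef, undef);
    while (1) {
        my $sample = random_sample($counts, $size);
        $reps++;
        $sum{$_} += $sample->{$_} for keys %sum;
        my $avg = { map { $_ => $sum{$_} / $reps } keys %sum };
        $dist = distance($avg, $prev_avg) if defined $prev_avg;
        $prev_avg = $avg;
        last if defined $dist && $dist < $threshold;
    }
    return ($prev_avg, $reps, $dist);
}

srand(42);
my %community = (sp_a => 700, sp_b => 250, sp_c => 50);
my ($avg, $reps, $dist) = rarefy_until_stable(\%community, 100, 0.5);
printf "stopped after %d repetitions (distance %.3f)\n", $reps, $dist;
```

The sketch returns the number of repetitions and the last distance alongside the average, which mirrors how num_repetitions() and threshold() report those values once normalization is done.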

num_repetitions

Function: Get or set the number of bootstrap repetitions to perform. When given,
          instead of relying on the threshold() to determine when to stop
          repeating the bootstrap process, perform an arbitrary number of
          repetitions. After communities have been normalized by count using
          the threshold() method, the number of repetitions actually done can
          be accessed using this method. As a special case, specify 'inf' to
          simulate an infinite number of repetitions.
Usage   : my $repetitions = $rarefier->num_repetitions;
Args    : positive integer, or 'inf', for the number of repetitions
Returns : positive integer for the (minimum) number of repetitions
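
One way to picture the 'inf' case: as the number of repetitions grows, the average count of each member tends towards its relative abundance multiplied by the sample size, so no random sampling is needed. The sketch below computes that limit directly. Whether the module uses exactly this shortcut is an assumption, and expected_counts is a hypothetical helper.

```perl
use strict;
use warnings;

# Expected counts of a random sample of $size members: each member's
# relative abundance scaled to the sample size
sub expected_counts {
    my ($counts, $size) = @_;
    my $total = 0;
    $total += $_ for values %$counts;
    return { map { $_ => $size * $counts->{$_} / $total } keys %$counts };
}

my %large = (sp_a => 7000, sp_b => 2500, sp_c => 500);   # 10,000 counts
my $avg = expected_counts(\%large, 1000);
# Prints 700, 250 and 50 for sp_a, sp_b and sp_c respectively
printf "%-5s %g\n", $_, $avg->{$_} for sort keys %$avg;
```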

get_seed, set_seed

Function: Get or set the seed used to pick the random members.
Usage   : $rarefier->set_seed(1234513451);
Args    : Positive integer
Returns : Positive integer

verbose

Function: Get or set verbose mode. In verbose mode, the current number of
          iterations (and beta diversity if a threshold is used) is displayed.
Usage   : $rarefier->verbose(1);
Args    : 0 (default) or 1
Returns : 0 or 1

drop

Function: Get or set drop mode. In drop mode, this module silently drops
          communities that do not have enough members instead of reporting an
          error.
Usage   : $rarefier->drop(1);
Args    : 0 (default) or 1
Returns : 0 or 1
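
The difference between the default behaviour and drop mode can be sketched as follows. This is a plain-Perl illustration rather than the module's code: communities are modelled as hashes of counts, and select_communities and the community names are hypothetical.

```perl
use strict;
use warnings;

# Keep the communities that have at least $size counts. Without $drop, a
# community that is too small is a fatal error.
sub select_communities {
    my ($communities, $size, $drop) = @_;
    my %kept;
    for my $name (sort keys %$communities) {
        my $total = 0;
        $total += $_ for values %{ $communities->{$name} };
        if ($total >= $size) {
            $kept{$name} = $communities->{$name};
        } elsif (not $drop) {
            die "Community '$name' has only $total counts, fewer than $size\n";
        }
    }
    return \%kept;
}

my %meta = (
    gut  => { sp_a => 900, sp_b => 300 },   # 1,200 counts
    soil => { sp_a => 40,  sp_c => 10  },   #    50 counts
);

# Drop mode: the small community is silently left out
my $kept = select_communities(\%meta, 1000, 1);
print join(', ', sort keys %$kept), "\n";   # prints "gut"

# Default mode: the small community triggers an error
my $ok = eval { select_communities(\%meta, 1000, 0); 1 };
print "error: $@" unless $ok;
```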

get_avg_meta

Function: Calculate an average metacommunity, in which member counts are the
          decimal averages over all repetitions.
Usage   : my $meta = $rarefier->get_avg_meta;
Args    : none
Returns : Bio::Community::Meta object

get_repr_meta

Function: Calculate a representative metacommunity, in which the average
          member counts are rounded to integers.
Usage   : my $meta = $rarefier->get_repr_meta;
Args    : none
Returns : Bio::Community::Meta object
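
Rounding each average count independently can make the total drift away from the sample size. The sketch below shows one common remedy, the largest-remainder method, which rounds every count down and then hands the leftover counts to the members with the largest fractional parts. That the module rounds this way is an assumption, and round_counts is a hypothetical helper.

```perl
use strict;
use warnings;

# Round decimal counts to integers while keeping their total unchanged
# (largest-remainder method; assumes the total is a whole number)
sub round_counts {
    my ($avg) = @_;
    my (%rounded, %remainder);
    my ($total, $assigned) = (0, 0);
    for my $name (keys %$avg) {
        $rounded{$name}   = int($avg->{$name});             # round down
        $remainder{$name} = $avg->{$name} - $rounded{$name};
        $total    += $avg->{$name};
        $assigned += $rounded{$name};
    }
    my $left = int($total + 0.5) - $assigned;
    # Hand out the leftover counts to the largest fractional parts first
    my @by_remainder = sort { $remainder{$b} <=> $remainder{$a} or $a cmp $b }
                       keys %remainder;
    $rounded{$_}++ for @by_remainder[0 .. $left - 1];
    return \%rounded;
}

my %avg = (sp_a => 699.62, sp_b => 250.26, sp_c => 50.12);   # sums to 1,000
my $repr = round_counts(\%avg);
# Prints 700, 250 and 50 for sp_a, sp_b and sp_c respectively
printf "%-5s %d\n", $_, $repr->{$_} for sort keys %$repr;
```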