NAME Mashtree
SYNOPSIS
Helps run a mashtree analysis to make rapid trees for genomes. Please see github.com/lskatz/Mashtree for more information.
- mashtree executables
-
This document covers the Mashtree library, but the highlight of the mashtree package is the executable `mashtree`. See github.com/lskatz/Mashtree for more information.
Fast method:
mashtree --numcpus 12 *.fastq.gz [*.fasta] > mashtree.dnd
More accurate method:
mashtree --mindepth 0 --numcpus 12 *.fastq.gz [*.fasta] > mashtree.dnd
Bootstrapping and jackknifing:
mashtree_bootstrap.pl --reps 100 --numcpus 12 *.fastq.gz -- --min-depth 0 > mashtree.bootstrap.dnd
mashtree_jackknife.pl --reps 100 --numcpus 12 *.fastq.gz -- --min-depth 0 > mashtree.jackknife.dnd
VARIABLES
- $VERSION
- $MASHTREE_VERSION (same value as $VERSION)
- @fastqExt = qw(.fastq.gz .fastq .fq .fq.gz)
- @fastaExt = qw(.fasta .fna .faa .mfa .fas .fsa .fa)
- @bamExt = qw(.sorted.bam .bam)
- @vcfExt = qw(.vcf.gz .vcf)
- @mshExt = qw(.msh)
- @richseqExt = qw(.gb .gbank .genbank .gbk .gbs .gbf .embl .ebl .emb .dat .swiss .sp)
-
Used to mark whether a file is being read, so that Mashtree limits disk I/O
METHODS
- $SIG{'__DIE__'}
-
Overrides how `die` works, so that the error message references the caller.
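A handler of this kind can be sketched as follows. This is a minimal illustration of the general Perl technique, not Mashtree's actual handler; the subroutine `read_sketch` and the message format are hypothetical.

```perl
use strict;
use warnings;

# Install a global handler that runs whenever die() is called.
$SIG{'__DIE__'} = sub {
    my ($err) = @_;
    # caller(1) describes the subroutine that called die();
    # element 3 is its fully qualified name (undefined at top level).
    my $sub = (caller(1))[3] // 'main';
    # Re-raise with the caller's name prepended. Perl disables the
    # __DIE__ hook while the handler itself runs, so this does not recurse.
    die "$sub: $err";
};

# Hypothetical subroutine that fails.
sub read_sketch {
    die "file not found\n";
}

# The error caught here carries the name of the subroutine that died.
eval { read_sketch() };
print $@;
```

Because the handler is global, every `die` in the program is affected, including those inside `eval` blocks.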
- logmsg
-
Prints a message to STDERR with the thread number and the program name, with a trailing newline.
- openFastq
-
Opens a fastq file in a thread-safe way.
- _truncateFilename
-
Removes the fastq extension and the directory name from a filename.
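The behavior described here can be approximated with `fileparse` from File::Basename. The sketch below is an assumption about how such a helper might look, not the library's own code; `truncate_filename` is a hypothetical name, and the extension list is redeclared locally so the snippet runs on its own.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Local copy of the fastq extension list.
my @fastqExt = qw(.fastq.gz .fastq .fq .fq.gz);

# Hypothetical stand-in for _truncateFilename.
sub truncate_filename {
    my ($path) = @_;
    # fileparse() returns (basename, directory, suffix). It treats each
    # suffix as a regular expression, so escape the dots with quotemeta.
    my ($name) = fileparse($path, map { quotemeta } @fastqExt);
    return $name;
}

print truncate_filename("/data/run1/sampleA.fastq.gz"), "\n";
```

In this sketch, a file whose extension is not in the list keeps its extension; only the directory is removed.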
- distancesToPhylip
-
1. Read the mash distances
2. Create a phylip file
Arguments: hash of distances, output directory, settings hash
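To illustrate the second step, the sketch below renders a hash of pairwise distances as a square PHYLIP matrix. It is a simplified example under stated assumptions: the hash-of-hashes layout, the number formatting, and the helper name `distances_to_phylip_string` are all hypothetical, and it returns a string instead of writing into an output directory.

```perl
use strict;
use warnings;

# Hypothetical helper: render a symmetric hash of hashes as a square
# PHYLIP distance matrix (taxon count, then one row per taxon).
sub distances_to_phylip_string {
    my ($dist) = @_;
    my @names = sort keys %$dist;
    my $out = "    " . scalar(@names) . "\n";
    for my $row (@names) {
        # Assumption for this sketch: a missing pair is written as 0.
        my @cells = map { sprintf("%0.10f", $dist->{$row}{$_} // 0) } @names;
        $out .= join(" ", $row, @cells) . "\n";
    }
    return $out;
}

my %dist = (
    A => { A => 0,    B => 0.01, C => 0.02 },
    B => { A => 0.01, B => 0,    C => 0.03 },
    C => { A => 0.02, B => 0.03, C => 0    },
);
print distances_to_phylip_string(\%dist);
```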
- sortNames
-
Sorts names.
Arguments:
1. $name - array of names
2. $settings - options
* $$settings{'sort-order'} is either "abc", "random", or "input-order"
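The three sort orders can be sketched as below. This is an illustrative stand-in, not the library's implementation; the name `sort_names`, the default of "abc" when no order is given, and the error on unknown values are assumptions.

```perl
use strict;
use warnings;
use List::Util qw(shuffle);

# Hypothetical stand-in for sortNames.
sub sort_names {
    my ($names, $settings) = @_;
    # Assumption: fall back to alphabetical order if unset.
    my $order = $settings->{'sort-order'} // 'abc';

    my @sorted;
    if    ($order eq 'abc')         { @sorted = sort @$names }
    elsif ($order eq 'random')      { @sorted = shuffle(@$names) }
    elsif ($order eq 'input-order') { @sorted = @$names }
    else  { die "Unknown sort-order: $order\n" }
    return @sorted;
}

my @names = qw(sampleC sampleA sampleB);
print join(",", sort_names(\@names, { 'sort-order' => 'abc' })), "\n";
```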
- createTreeFromPhylip($phylip, $outdir, $settings)
-
Creates a tree file with QuickTree, using BioPerl as a backup.
- treeDist($treeObj1, $treeObj2)
-
Lee's implementation of a tree distance. The objective is to return zero if two trees are the same.
- mashDist($file1, $file2, $k, $settings)
-
Finds the distance between two mash sketch files or, alternatively, two hash lists.
- mashHashes($sketch)
-
Returns an array of hashes, the kmer length, and the estimated genome length.
- raw_mash_distance_unequal_sizes($hashes1, $hashes2)
-
Compares hash sets of unequal size. Treats the first set of hashes as the reference (denominator) set.
- raw_mash_distance($hashes1, $hashes2)
-
Returns the number of kmers in common and the total number compared. Inspired by https://github.com/onecodex/finch-rs/blob/master/src/distance.rs#L34
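The counting step can be sketched as a two-pointer merge over two ascending-sorted hash lists. The code below is a simplified illustration, not the library's code: both helper names are hypothetical, and the loop simply stops when either list is exhausted. The conversion to a distance uses the formula from the Mash publication, D = -1/k * ln(2j / (1 + j)), where j is the fraction of shared hashes; whether Mashtree applies it in exactly this form is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: count shared hashes between two ascending-sorted
# lists of integers. Returns (number in common, number compared).
sub raw_distance_sketch {
    my ($h1, $h2) = @_;
    my ($i, $j, $common, $total) = (0, 0, 0, 0);
    while ($i < @$h1 && $j < @$h2) {
        if    ($h1->[$i] < $h2->[$j]) { $i++ }
        elsif ($h1->[$i] > $h2->[$j]) { $j++ }
        else                          { $common++; $i++; $j++ }
        $total++;    # every step of the merge counts as one comparison
    }
    return ($common, $total);
}

# Hypothetical helper: turn the counts into a Mash-style distance
# for kmer length $k.
sub mash_distance_from_counts {
    my ($common, $total, $k) = @_;
    return 1 if $common == 0;          # nothing shared: maximum distance
    return 0 if $common == $total;     # identical sketches
    my $jaccard = $common / $total;
    return -1 / $k * log(2 * $jaccard / (1 + $jaccard));
}

my ($common, $total) = raw_distance_sketch([1, 3, 5, 7], [3, 4, 5, 8]);
printf "%d/%d shared, distance %.5f\n",
    $common, $total, mash_distance_from_counts($common, $total, 21);
```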
- transfer_bootstrap_expectation
-
Title : transfer_bootstrap_expectation
Usage : my $tree_with_bs = transfer_bootstrap_expectation(\@bs_trees,$guide_tree);
Function: Calculates the Transfer Bootstrap Expectation (TBE) for internal nodes based on the methods outlined in Lemoine et al., Nature, 2018. Currently experimental.
Returns : L<Bio::Tree::TreeI>
Args : Arrayref of L<Bio::Tree::TreeI>s
       Guide tree, L<Bio::Tree::TreeI>s