NAME

megatree-ncbi-loader - Loads the NCBI taxonomy dump into a database

SYNOPSIS

megatree-ncbi-loader -nodes <file> -names <file> -d <file> [-vhm]

OPTIONS

-no <file> or -nodes <file>

Location of the nodes.dmp file from the NCBI taxonomy dump, i.e. the file of that name contained in the taxdmp.zip archive, which as of 2017-02-03 was available at: ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdmp.zip

-na <file> or -names <file>

Location of the names.dmp file from the NCBI taxonomy dump, i.e. the file of that name contained in the taxdmp.zip archive, which as of 2017-02-03 was available at: ftp://ftp.ncbi.nih.gov/pub/taxonomy/taxdmp.zip

-d <file> or -dbfile <file>

Location of the database file, compatible with sqlite3, that will be produced. This file must not already exist. If it does, an error message is emitted and the program quits.

-v or -verbose

Optional.

With this option, more feedback messages are written during processing. This option can be used multiple times, which increases the verbosity further.

-h or -help

Optional.

Prints help message / documentation.

-m or -man

Optional.

Prints the manual page. Additional information is available in the documentation, e.g. via perldoc megatree-ncbi-loader
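
The files passed with -nodes and -names use the NCBI taxonomy dump layout, in which fields are separated by a tab, a pipe and a tab, and each record ends with a tab and a pipe. The following is a minimal Python sketch of reading such records, not the loader's own (Perl) code. The function name parse_dmp_line is hypothetical, and the sample records are shortened: real nodes.dmp lines carry more columns after the rank.

```python
def parse_dmp_line(line):
    """Split one record of an NCBI taxonomy .dmp file into its fields."""
    line = line.rstrip("\n")
    # Every record is terminated by a tab followed by a pipe.
    if line.endswith("\t|"):
        line = line[:-2]
    # Fields are separated by tab, pipe, tab.
    return [field.strip() for field in line.split("\t|\t")]


# Shortened sample records (illustrative; real lines have more columns).
node_line = "9606\t|\t9605\t|\tspecies\t|\n"
name_line = "9606\t|\tHomo sapiens\t|\t\t|\tscientific name\t|\n"

# nodes.dmp: taxon ID, parent taxon ID, rank, ...
tax_id, parent_id, rank = parse_dmp_line(node_line)[:3]

# names.dmp: taxon ID, name, unique name (often empty), name class
name_fields = parse_dmp_line(name_line)

print(tax_id, parent_id, rank)
print(name_fields[1], name_fields[3])
```

Only records whose name class is "scientific name" give the canonical label for a taxon; whether the loader keeps other name classes is not stated here.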

DESCRIPTION

This program produces a database file from the NCBI taxonomy dump. Such a database provides much quicker random access to the taxonomy tree than processing the flat files does. The example trees referred to by the release of Bio::Phylo::Forest::DBTree have been produced in this way. They can be accessed through an API that is compatible with Bio::Phylo but much more scalable. An example of such API usage is presented by the megatree-pruner script.
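
To illustrate why an indexed database gives quicker random access than scanning flat files, here is a minimal Python sketch using the standard sqlite3 module. It is not the loader's implementation: the table layout (node with id, parent, name), the function names create_db and lineage, and the three-row toy tree (intermediate ranks omitted) are assumptions made for illustration, and the schema used by Bio::Phylo::Forest::DBTree may differ. The sketch also mirrors the documented behaviour of refusing to write to a database file that already exists.

```python
import os
import sqlite3
import tempfile


def create_db(path):
    """Create a new SQLite file, refusing to overwrite an existing one."""
    if os.path.exists(path):
        raise FileExistsError(f"database file already exists: {path}")
    conn = sqlite3.connect(path)
    # Hypothetical schema, for illustration only.
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, name TEXT)"
    )
    conn.execute("CREATE INDEX node_parent_idx ON node (parent)")
    return conn


def lineage(conn, tax_id):
    """Walk from a node up to the root, one primary-key lookup per step."""
    names = []
    while tax_id is not None:
        row = conn.execute(
            "SELECT parent, name FROM node WHERE id = ?", (tax_id,)
        ).fetchone()
        if row is None:
            break
        parent, name = row
        names.append(name)
        # In the NCBI dump the root (taxon 1) lists itself as its parent.
        tax_id = None if parent == tax_id else parent
    return names


# Toy data: intermediate ranks between Homo and the root are omitted.
rows = [(1, 1, "root"), (9605, 1, "Homo"), (9606, 9605, "Homo sapiens")]

with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "ncbi.db")
    conn = create_db(db_path)
    conn.executemany("INSERT INTO node VALUES (?, ?, ?)", rows)
    conn.commit()

    result = lineage(conn, 9606)
    print(result)

    # A second attempt on the same path is rejected.
    try:
        create_db(db_path)
        refused = False
    except FileExistsError:
        refused = True
    print("refused to overwrite:", refused)

    conn.close()
```

Each step of the walk is a lookup on the primary key, so the cost of finding a lineage depends on the depth of the tree rather than on the size of the dump files.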