close_genomes

Example:

close_genomes [arguments] < input > output

This command has two quite distinct uses:

1. It can be used to find genomes close to existing genomes (stored in either
   the KBase CS or the PubSEED).
2. Alternatively, it can be used to compute close genomes for a new genome
   encoded in a JSON file.

The second use is performed if and only if the option

-g Encoded_JSON_directory

is used. In that case, the updated genome directory will be written to STDOUT.

If the input is one or more genomes from the CS, then the standard input should be a tab-separated table (i.e., each line is a tab-separated set of fields). Normally, the last field in each line contains the identifier. If another column contains the identifier, use

-c N

where N is the number of the column (counting from 1) that contains the identifier.
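The column rule above can be illustrated with a minimal Python sketch. It is not the script's own code; the function name `extract_identifier` and the sample rows are hypothetical and only show how "last field by default, 1-based column with -c N" might be read.

```python
def extract_identifier(line, column=None):
    """Return the identifier field from one tab-separated input line.

    column is 1-based, mirroring -c N; None means "use the last field".
    This is an illustrative helper, not part of close_genomes itself.
    """
    fields = line.rstrip("\n").split("\t")
    if column is None:
        return fields[-1]
    if not 1 <= column <= len(fields):
        raise ValueError(
            f"column {column} out of range for {len(fields)} fields"
        )
    return fields[column - 1]


# Hypothetical input rows: a free-text label followed by an identifier.
rows = ["first genome\tkb|g.0\n", "second genome\tkb|g.1\n"]
last_field_ids = [extract_identifier(r) for r in rows]
first_field_values = [extract_identifier(r, column=1) for r in rows]
```

Here `last_field_ids` corresponds to running without -c, and `first_field_values` to running with -c 1.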

This is a pipe command. The input is taken from the standard input, and the output is to the standard output.

Documentation for underlying call

This script is a wrapper for the CDMI-API call close_genomes. It is documented as follows:

$return = $obj->close_genomes($genomes, $how, $n)
Parameter and return types
$genomes is a genomes
$how is a how
$n is an int
$return is a reference to a hash where the key is a genome and the value is a genomes
genomes is a reference to a list where each element is a genome
genome is a string
how is an int
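
As a reading aid, the type descriptions above can be restated as Python type aliases. This is only a sketch of the documented shapes; the alias names and the `has_documented_shape` helper are hypothetical, and the meaning of the `how` values is not stated here, so it is left as a plain int.

```python
from typing import Dict, List

Genome = str                           # "genome is a string"
Genomes = List[Genome]                 # a list where each element is a genome
How = int                              # "how is an int" (values not documented here)
CloseGenomes = Dict[Genome, Genomes]   # return value: genome -> genomes


def has_documented_shape(result):
    """Return True if result maps genome strings to lists of genome strings."""
    if not isinstance(result, dict):
        return False
    for genome, close in result.items():
        if not isinstance(genome, str) or not isinstance(close, list):
            return False
        if not all(isinstance(g, str) for g in close):
            return False
    return True
```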

Command-Line Options

-c Column

This is used only if the column containing the identifier is not the last column.

-i InputFile [ use InputFile, rather than stdin ]
-n N [ N is the number of close genomes desired ]

Output Format

If close genomes are being computed for genomes in the CS, then the input is a tab-delimited file, and the output will have two extra columns: [projected degree of identity, close-genome]. If the -g option is used, then an updated genome structure will be encoded and written to STDOUT.
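
A downstream consumer of the tab-delimited output might separate the two extra columns as sketched below. This assumes the extra columns are the final two fields of each line, in the order given above; that placement is an assumption, and the function name `split_output_line` is hypothetical. The identity value is kept as a string because its exact numeric format is not specified here.

```python
def split_output_line(line):
    """Split one output line into (original_fields, identity, close_genome).

    Assumes the two extra columns are the last two fields of the line,
    which is an assumption made for this sketch.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError(
            "expected at least one input field plus two extra columns"
        )
    *original, identity, close_genome = fields
    return original, identity, close_genome


# Hypothetical output line; the identifiers and the value are made up.
original, identity, close_genome = split_output_line(
    "some label\tkb|g.0\t0.97\tkb|g.1\n"
)
```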