proteins_to_protein_families

A protein family is a set of isofunctional homologs. proteins_to_protein_families is used to get the set of protein_families containing a specified protein. For performance reasons, you can submit a batch of proteins (i.e., a list of proteins), and for each input protein you get back a set (possibly empty) of protein_families. Specific collections of families (e.g., FIGfams) usually require that a protein be in at most one family. However, we will be integrating protein families from a number of sources, so a protein can be in multiple families.

Example:

proteins_to_protein_families [arguments] < input > output

The standard input should be a tab-separated table (i.e., each line is a tab-separated set of fields). Normally, the last field in each line contains the protein identifier. If another column contains the identifier, use

-c N

where N is the column (counting from 1) that contains the protein identifier.
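The column convention can be illustrated with a short Python sketch. This only models the rule stated above (last field by default, 1-based column when -c N is given); the function name pick_identifier and the sample row are hypothetical and are not part of the script.

```python
def pick_identifier(line, column=None):
    """Return the protein identifier from one tab-separated line.

    column is 1-based, mirroring -c N; None means "use the last field".
    """
    fields = line.rstrip("\n").split("\t")
    if column is None:
        return fields[-1]
    return fields[column - 1]


# Hypothetical row: the identifier is in column 1, a description in column 2.
row = "kb|g.0.peg.1\thypothetical protein\n"
print(pick_identifier(row))            # hypothetical protein
print(pick_identifier(row, column=1))  # kb|g.0.peg.1
```

In this row the default would pick the description, so -c 1 would be needed to point at the identifier.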

This is a pipe command. The input is read from the standard input, and the output is written to the standard output. For each protein, the family it belongs to is added at the end of the input line.

Documentation for underlying call

This script is a wrapper for the CDMI-API call proteins_to_protein_families. It is documented as follows:

$return = $obj->proteins_to_protein_families($proteins)

Parameter and return types

$proteins is a proteins
$return is a reference to a hash where the key is a protein and the value is a protein_families
proteins is a reference to a list where each element is a protein
protein is a string
protein_families is a reference to a list where each element is a protein_family
protein_family is a string
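The shape of these types may be easier to see in a small Python model. This is a sketch of the data structures only, not the CDMI-API client: the stub function, the protein identifiers, and the family names are all hypothetical, and returning an empty list for an unknown protein is an assumption based on the "possibly empty" wording above.

```python
from typing import Dict, List

Protein = str          # "protein is a string"
ProteinFamily = str    # "protein_family is a string"

# Hypothetical lookup table standing in for the real database.
_KNOWN: Dict[Protein, List[ProteinFamily]] = {
    "kb|g.0.peg.1": ["FAM_A"],
    "kb|g.0.peg.2": ["FAM_A", "FAM_B"],  # a protein may be in several families
}


def proteins_to_protein_families_stub(
    proteins: List[Protein],
) -> Dict[Protein, List[ProteinFamily]]:
    """Map each input protein to its (possibly empty) list of families."""
    return {p: list(_KNOWN.get(p, [])) for p in proteins}


result = proteins_to_protein_families_stub(
    ["kb|g.0.peg.1", "kb|g.0.peg.2", "kb|g.0.peg.3"]
)
for protein, families in result.items():
    print(protein, families)
```

The batch goes in as one list and comes back as one hash keyed by protein, which is what makes a single call cheaper than one call per protein.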

Command-Line Options

-c Column: This is used only if the column containing the protein identifier is not the last column.
-i InputFile [ use InputFile, rather than stdin ]

Output Format

The standard output is a tab-delimited file. It consists of the input file with extra columns added.

Input lines that cannot be extended are written to stderr.
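The split between stdout and stderr can be modeled as follows. This is a rough Python sketch, not the script's actual code: extend_lines and the sample data are hypothetical, the identifier is taken from the last field, and emitting one output line per family for a protein in several families is an assumption that the text above does not confirm.

```python
import io


def extend_lines(lines, families_by_protein, out, err):
    """Append a family column to each line; divert unextendable lines to err."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        families = families_by_protein.get(fields[-1], [])
        if not families:
            # No family known: the original line goes to the error stream.
            err.write("\t".join(fields) + "\n")
            continue
        for family in families:
            # Assumption: one extended line per family.
            out.write("\t".join(fields + [family]) + "\n")


# Hypothetical input and lookup table.
rows = ["g1\tkb|g.0.peg.1\n", "g1\tkb|g.0.peg.9\n"]
lookup = {"kb|g.0.peg.1": ["FAM_A"]}

out, err = io.StringIO(), io.StringIO()
extend_lines(rows, lookup, out, err)
print("stdout:", out.getvalue(), end="")
print("stderr:", err.getvalue(), end="")
```

Under these assumptions the first row gains a third column and the second row is passed to the error stream unchanged.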
