NAME
OBO::CCO::UniProtParser - A UniProt to OBO translator.
DESCRIPTION
Includes methods for adding information from UniProt files to ontologies.
UniProt files can be obtained from:
ftp://ftp.expasy.org/databases/uniprot/knowledgebase/
The method 'work' incorporates relevant data from a UniProt file into the input ontology, writes the ontology to an OBO file, and writes the map files.
This method assumes:
- the input ontology already contains the terms 'gene', 'protein', and 'cell cycle modified protein'.
- the input ontology already contains relevant protein terms.
- the input ontology already contains the NCBI taxonomy.
- the input ontology already contains the relationship types 'is_a', 'encoded_by', 'codes_for', 'originates_from', 'transformation_of', and 'source_of'.
- the input UniProt file contains entries for a single species, and only for protein terms present in the input ontology.
- the full map file ($long_file_name, the UNION of the species-specific map files ($short_file_name)) contains all the proteins to be processed by the UniProtParser.
AUTHOR
Vladimir Mironov vlmir@psb.ugent.be
COPYRIGHT AND LICENSE
Copyright (C) 2006 by Vladimir Mironov
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.7 or, at your option, any later version of Perl 5 you may have available.
work
Usage - $UniProtParser->work($ref_file_names, 'Arabidopsis thaliana organism')
Returns - updated OBO::Core::Ontology object
Args - 1. reference to a list of filenames:
- input OBO file,
- output OBO file,
- UniProt file,
- CCO_id/protein_name map file (one taxon only),
- CCO_id/protein_name map file (all taxa),
- CCO_id/gene_name map file (one taxon only),
- CCO_id/gene_name map file (all taxa)
2. taxon_name
Function - parses a UniProt file, adds relevant information to the input ontology, and writes the OBO and map files
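Example - a minimal sketch of how the argument list for 'work' might be assembled. All file names below are hypothetical placeholders, and the 'new' constructor is an assumption not confirmed by this documentation; only the order of the seven file names and the taxon name argument come from the Args description above. The call is attempted only when the module can be loaded and every file except the output OBO file exists (treating the map files as required inputs is also an assumption).

```perl
use strict;
use warnings;

# Roles of the seven files, in the order given under Args.
my @roles = (
    'input OBO file',
    'output OBO file',
    'UniProt file',
    'CCO_id/protein_name map (one taxon)',
    'CCO_id/protein_name map (all taxa)',
    'CCO_id/gene_name map (one taxon)',
    'CCO_id/gene_name map (all taxa)',
);

# Hypothetical file names; replace them with real paths.
my @file_names = (
    'cco_in.obo',
    'cco_out.obo',
    'uniprot_arath.dat',
    'protein_map_arath.txt',
    'protein_map_all.txt',
    'gene_map_arath.txt',
    'gene_map_all.txt',
);
my $ref_file_names = \@file_names;
my $taxon_name     = 'Arabidopsis thaliana organism';

# Guard against a list of the wrong length before calling 'work'.
die "expected one file name per role\n"
    unless @$ref_file_names == @roles;

# Show which file is used for which role.
printf "%-38s %s\n", $roles[$_], $file_names[$_] for 0 .. $#roles;

# Attempt the real call only if the module is installed and all files
# other than the output OBO file (index 1) are present.
my $have_module = eval { require OBO::CCO::UniProtParser; 1 };
my @missing     = grep { !-e $file_names[$_] } (0, 2 .. $#file_names);

if ($have_module && !@missing) {
    # new() is an assumed constructor (hypothetical here).
    my $parser   = OBO::CCO::UniProtParser->new();
    my $ontology = $parser->work($ref_file_names, $taxon_name);
}
```

Keeping the roles next to the file names makes it harder to swap the species-specific and all-taxa map files by accident, since 'work' receives them by position only.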