SYNOPSIS

buildGFF3FromEnsembl.pl [-h|--help|--fullhelp] [--output <output_file>] [--est] <genome>
The mandatory argument is the name of a genome available in Ensembl.
For example:
         'Homo sapiens' for Human,
         'Pan troglodytes' for Chimpanzee,
         'Mus musculus' for Mouse,
         'Macaca mulatta' for Macaque,
         'Pongo pygmaeus' for Orangutan,
          etc. (see http://www.ensembl.org/info/about/species.html)
--output: name of the file to write the GFF3 output to (STDOUT by default)
--est: build the GFF3 from the Ensembl API using the OtherFeatures DB (Core DB by default)
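
The options above could be read with Getopt::Long (listed under REQUIRES).
The following is only a minimal sketch, not the script's actual code: the
helper name parse_args, the keys of the returned hash, and the mapping of
--est to the Ensembl database group names 'otherfeatures' and 'core' are
assumptions made for illustration.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical helper: parse the documented options from an argument list.
sub parse_args {
    my @args = @_;
    my %opt = (help => 0, fullhelp => 0, est => 0, output => undef);

    GetOptionsFromArray(
        \@args,
        'h|help'   => \$opt{help},      # -h, --help
        'fullhelp' => \$opt{fullhelp},  # --fullhelp
        'output=s' => \$opt{output},    # --output <output_file>
        'est'      => \$opt{est},       # --est
    ) or die "Invalid options\n";

    # Whatever remains after option parsing is the mandatory genome name.
    $opt{genome} = shift @args;

    # Assumed mapping: --est selects OtherFeatures, otherwise Core.
    $opt{db} = $opt{est} ? 'otherfeatures' : 'core';

    return \%opt;
}

my $opt = parse_args('--output', 'human.gff3', '--est', 'Homo sapiens');
printf "%s -> %s (%s)\n", $opt->{genome}, $opt->{output}, $opt->{db};
```

When --output is omitted, $opt->{output} stays undefined, which a caller
could treat as "write to STDOUT".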

OPTIONS

   -h, --help, --fullhelp
   --output=<output_file>
   --est
   
   Writes a GFF3 file to <output_file> with the following columns:
     column 1: <seqname> 
               The name of the sequence. Commonly, this is the chromosome ID or
               contig ID. Note that the coordinates used must be unique within
               each sequence name in all GFF3 files for an annotation set.

     column 2: <source>
               A unique label indicating where the annotations came from;
               here, Ensembl.
     column 3: <feature>
               exon, cds, five, three, gene or mRNA
     column 4: <start exon>
               Start coordinate of the feature relative to the beginning of the
               sequence named in <seqname>.
     column 5: <end exon>
               End coordinate of the feature relative to the beginning of the
               sequence named in <seqname>.
     column 6: <score>
               Not used; always set to ".".
     column 7: <strand>
               Strand of the feature relative to the genome, i.e. + or -.
     column 8: <frame>
               Not used; always set to ".".
     column 9: a list of <key=value> pairs separated by semicolons ";".
               Every record ends with the same three mandatory attributes
               (other attributes are optional):
                 -ID=value                      A globally unique identifier for the feature.
                 -Parent=value1,...,valueN      A list of identifier(s) for the parent(s) of the feature.
                 -Name=value                    The HGNC name of the gene 
              
               This script defines the following attributes:
              
                 -transcripts_nb=value          The number of transcripts contained in the gene
                 -exons_nb=value                The number of exons contained in the transcript/gene
                 -exon_rank=value               The rank of the exon contained in the gene
                  -type "prefix:value"           The nature of the mRNA, where "prefix" is a
                                                 top-level class (protein_coding, small_ncRNA,
                                                 lincRNA, other_lncRNA, other_noncodingRNA)
                                                 and "value" is the biotype defined by Ensembl.
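
To illustrate the nine-column layout described above, here is a minimal
sketch of how a consumer might read one record back. It assumes
tab-separated columns and attributes written as key=value pairs; the quoted
form shown for "type" is not handled. The helper name parse_gff3_line and
the sample record are invented for illustration and do not come from the
script or from real Ensembl output.

```perl
use strict;
use warnings;

# Names for columns 1 to 8, following the description above.
my @COLUMNS = qw(seqname source feature start end score strand frame);

# Hypothetical helper: split one tab-separated record into a hash reference.
sub parse_gff3_line {
    my ($line) = @_;
    chomp $line;

    my @fields = split /\t/, $line, 9;
    die "Expected 9 columns, got " . scalar(@fields) . "\n"
        unless @fields == 9;

    my %rec;
    @rec{@COLUMNS} = @fields[0 .. 7];

    # Column 9: key=value pairs separated by semicolons.
    my %attr;
    for my $pair (split /;/, $fields[8]) {
        next unless length $pair;
        my ($key, $value) = split /=/, $pair, 2;
        $attr{$key} = $value;
    }
    $rec{attributes} = \%attr;

    return \%rec;
}

# Invented sample record, for illustration only.
my $line = join "\t", 'chr1', 'Ensembl', 'exon', 100, 200, '.', '+', '.',
    'ID=exon_1;Parent=transcript_1;Name=GENE1;exon_rank=1';

my $rec = parse_gff3_line($line);
print "$rec->{feature} $rec->{start}-$rec->{end} ($rec->{strand}) ",
      "parent=$rec->{attributes}{Parent}\n";
```

A Parent value may hold several comma-separated identifiers; splitting it
on "," would yield the individual parents.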

REQUIRES

Perl5.
Bio::EnsEMBL
Getopt::Long
Pod::Usage

AUTHOR

Nicolas PHILIPPE <nicolas.philippe@inserm.fr>