NAME
Bio::AGP::LowLevel - functions for dealing with AGP files
SYNOPSIS
$lines_arrayref = agp_parse('my_agp_file.agp');
agp_write( $lines_arrayref => 'my_agp_file.agp' );
DESCRIPTION
Low-level functions for parsing, writing, and inspecting AGP files.
FUNCTIONS
All functions below are EXPORT_OK.
str_in
Usage: print "it's valid" if str_in($thingy,qw/foo bar baz/);
Desc : return 1 if the first argument is string-equal (eq) to at least
one of the subsequent arguments
Ret : 1 or 0
Args : string to search for, list of strings to search in
Side Effects: none
I kept writing this over and over in validation code and got sick of it.
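The behavior described above can be sketched in a few lines. This is a minimal illustration, not the module's actual source; the name sketch_str_in is hypothetical and chosen so it cannot be confused with the exported function.

```perl
use strict;
use warnings;

# Hypothetical stand-in for str_in(); not the module's actual source.
sub sketch_str_in {
    my ( $needle, @haystack ) = @_;
    for my $candidate (@haystack) {
        # "eq" compares as strings, so '1.0' does not match '1'
        return 1 if defined $candidate && $candidate eq $needle;
    }
    return 0;
}

print "it's valid\n" if sketch_str_in( 'bar', qw/foo bar baz/ );
```

Because the comparison is string-based, numerically equal values such as '1.0' and '1' are treated as different.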
is_filehandle
Usage: print "it's a filehandle" if is_filehandle($my_thing);
Desc : check whether the given thing is usable as a filehandle.
This lives in a module because a filehandle might be a GLOB,
an IO::Handle (or subclass), or an Apache::Upload object
Ret : true if it is a filehandle, false otherwise
Args : a single thing
Side Effects: none
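One way such a check could look is sketched here, assuming only the core Scalar::Util module. The name sketch_is_filehandle is hypothetical, the real function may test differently, and the Apache::Upload case is left out because that class is not available outside mod_perl.

```perl
use strict;
use warnings;
use IO::Handle;
use Scalar::Util qw(blessed reftype);

# Hypothetical stand-in for is_filehandle(); the real function may differ.
sub sketch_is_filehandle {
    my ($thing) = @_;
    return 0 unless defined $thing;

    # Blessed objects: accept anything derived from IO::Handle
    return 1 if blessed($thing) && $thing->isa('IO::Handle');

    # Unblessed GLOB references, e.g. \*STDIN or a lexical from open()
    return 1 if ref($thing) && reftype($thing) eq 'GLOB';

    return 0;
}

print "it's a filehandle\n" if sketch_is_filehandle( \*STDOUT );
```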
agp_parse
Usage: my $lines = agp_parse( 'myagp.agp', validate_syntax => 1, validate_identifiers => 1 );
Desc : parse an AGP file
Args : filename or filehandle, followed by a hash-style list of options:
validate_syntax => if true, report an error if there are any
syntax errors,
validate_identifiers => if true, report an error if there are any
identifiers that CXGN::Tools::Identifiers does not recognize
(implies validate_syntax),
error_array => an arrayref; if given, error descriptions are pushed
onto this array instead of being printed to STDERR with warn()
Ret : undef if error, otherwise return an
arrayref containing line records, each of which is like:
{ comment => 'text' } if a comment,
or if a data line:
{ objname => the name of the object being assembled
(same for every record),
ostart => start coordinate of this component on the object,
oend => end coordinate of this component on the object,
partnum => the part number appearing in the 4th column,
linenum => the line number in the file,
type => letter type present in the file (/[ADFGNOPUW]/),
typedesc => description of the type, one of:
- (A) active_finishing
- (D) draft
- (F) finished
- (G) wgs_finishing
- (N) known_gap
- (O) other
- (P) predraft
- (U) unknown_gap
- (W) wgs_contig
ident => identifier of the component, if any,
length => length of the component,
is_gap => 1 if the line is some kind of gap, 0 if it
is covered by a component,
gap_type => one of:
fragment: a gap between two sequence contigs (also called a
"sequence gap"),
clone: a gap between two clones that do not overlap,
contig: a gap between clone contigs (also called a "layout gap"),
centromere: a gap inserted for the centromere,
short_arm: a gap inserted at the start of an acrocentric chromosome,
heterochromatin: a gap inserted for an especially large region of
heterochromatic sequence (may also include the centromere),
telomere: a gap inserted for the telomere,
repeat: an unresolvable repeat,
cstart => start coordinate relative to the component,
cend => end coordinate relative to the component,
linkage => 'yes' or 'no', only set for type of 'N',
orient => '+', '-', 0, or 'na'
orientation of the component
relative to the object,
}
Side Effects: unless error_array is given, will print error
descriptions to STDERR with warn()
Example:
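To make the record layout concrete, the sketch below turns one line of AGP text into a hashref with the keys listed above. It is not agp_parse itself: sketch_parse_agp_line is a hypothetical helper, it does no validation, and it assumes tab-separated columns in the standard AGP order, a comment stored without its leading '#', and a component length computed from cstart and cend. The sample identifier AC123.1 is made up.

```perl
use strict;
use warnings;

# Type letters and descriptions as listed in the record description.
my %TYPEDESC = (
    A => 'active_finishing', D => 'draft',       F => 'finished',
    G => 'wgs_finishing',    N => 'known_gap',   O => 'other',
    P => 'predraft',         U => 'unknown_gap', W => 'wgs_contig',
);

# Hypothetical helper: turn one line of AGP text into a record hashref.
# Assumes tab-separated columns in the standard AGP order; no validation.
sub sketch_parse_agp_line {
    my ( $line, $linenum ) = @_;
    chomp $line;

    # Assumption: the comment text is kept without the leading '#'
    return { comment => $1 } if $line =~ /^#\s*(.*)$/;

    my @f = split /\t/, $line;
    my %r = (
        objname => $f[0], ostart  => $f[1], oend => $f[2],
        partnum => $f[3], linenum => $linenum,
        type    => $f[4], typedesc => $TYPEDESC{ $f[4] },
    );
    if ( $f[4] eq 'N' || $f[4] eq 'U' ) {
        # Gap line: columns 6-8 are length, gap type, and linkage
        @r{qw(is_gap length gap_type linkage)} = ( 1, @f[ 5 .. 7 ] );
    }
    else {
        # Component line: columns 6-9 are identifier, start, end, orientation
        @r{qw(is_gap ident cstart cend orient)} = ( 0, @f[ 5 .. 8 ] );
        # Assumption: length is derived from the component coordinates
        $r{length} = $r{cend} - $r{cstart} + 1;
    }
    return \%r;
}

my $rec = sketch_parse_agp_line( "chr1\t1\t1000\t1\tF\tAC123.1\t1\t1000\t+\n", 1 );
print "$rec->{ident} covers $rec->{ostart}..$rec->{oend} of $rec->{objname}\n";
```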
agp_write
Usage: agp_write($lines,$file);
Desc : writes a properly formatted AGP file
Args : arrayref of line records to write, in the same format as those
returned by agp_parse above; filename or filehandle to write to
Ret : nothing meaningful
Side Effects: dies on failure. If given a filehandle, does not close it.
Example:
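The filename-or-filehandle behavior and the "does not close it" rule can be illustrated with a small sketch. This is not agp_write: sketch_write_lines is a hypothetical name, and its third argument (a coderef that formats one record as one line) exists only to keep the sketch self-contained. It also assumes that any reference passed as the target is a filehandle.

```perl
use strict;
use warnings;

# Hypothetical stand-in for agp_write(). The third argument, a coderef that
# turns one record into one line of text, is an addition for this sketch only.
sub sketch_write_lines {
    my ( $lines, $target, $format ) = @_;

    my $opened_here = !ref $target;    # assumption: any reference is a filehandle
    my $fh;
    if ($opened_here) {
        open $fh, '>', $target or die "cannot open '$target' for writing: $!";
    }
    else {
        $fh = $target;
    }

    for my $rec (@$lines) {
        print {$fh} $format->($rec) or die "write failed: $!";
    }

    # Close only what was opened here; a caller's filehandle stays open.
    if ($opened_here) {
        close $fh or die "cannot close '$target': $!";
    }
    return;
}

# Usage: write two comment records to an in-memory filehandle.
my $buffer = '';
open my $out, '>', \$buffer or die "cannot open in-memory handle: $!";
sketch_write_lines( [ { comment => 'first' }, { comment => 'second' } ],
    $out, sub { "# $_[0]{comment}\n" } );
close $out;
print $buffer;
```

Leaving a caller-supplied handle open lets the caller append further output or decide when to close it.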
agp_format_part( $record )
Format a single AGP part line (string terminated with a newline) from the given record hashref.
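As an illustration of what formatting a record involves, the following sketch joins the record fields into a tab-separated line. It is a hypothetical helper (sketch_format_part), not the module's implementation, and it assumes that gap lines carry an empty ninth column so that every line has nine fields. The sample values are made up.

```perl
use strict;
use warnings;

# Hypothetical stand-in for agp_format_part(); the real output may differ.
sub sketch_format_part {
    my ($r) = @_;

    # Columns 6-9 depend on whether the record is a gap or a component.
    # Assumption: gap lines get an empty ninth column.
    my @tail = $r->{is_gap}
        ? ( @{$r}{qw(length gap_type linkage)}, '' )
        : @{$r}{qw(ident cstart cend orient)};

    return join( "\t", @{$r}{qw(objname ostart oend partnum type)}, @tail ) . "\n";
}

print sketch_format_part(
    {
        objname => 'chr1', ostart => 1, oend => 1000, partnum => 1,
        type    => 'F',    is_gap => 0, ident => 'AC123.1',
        cstart  => 1,      cend   => 1000, orient => '+',
    }
);
```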
agp_contigs
Usage: my @contigs = agp_contigs( agp_parse($agp_filename) );
Desc : extract and number contigs from a parsed AGP file
Args : arrayref of AGP lines, like those returned by agp_parse() above
Ret : list of contigs, in the same order as they occur in the
file, formatted as:
[ agp_line_hashref, agp_line_hashref, ... ],
[ agp_line_hashref, agp_line_hashref, ... ],
...
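The grouping into contigs can be sketched as follows. The documentation does not say exactly where one contig ends and the next begins, so this hypothetical sketch_contigs assumes that a contig is a run of consecutive component lines on the same object, ended by any gap line; the real agp_contigs may use a different rule.

```perl
use strict;
use warnings;

# Hypothetical stand-in for agp_contigs().
# Assumption: any gap line, or a change of object name, ends a contig.
sub sketch_contigs {
    my ($lines) = @_;
    my @contigs;
    my $current;     # arrayref for the contig being built, if any
    my $last_obj;    # object name of the previous data line

    for my $rec (@$lines) {
        next if exists $rec->{comment};    # comments belong to no contig

        if ( defined $last_obj && $rec->{objname} ne $last_obj ) {
            $current = undef;              # new object, new contig
        }
        $last_obj = $rec->{objname};

        if ( $rec->{is_gap} ) {
            $current = undef;              # a gap ends the current contig
            next;
        }
        unless ($current) {
            $current = [];
            push @contigs, $current;
        }
        push @$current, $rec;
    }
    return @contigs;
}

# Usage with hand-built records: two components, a gap, one more component.
my @contigs = sketch_contigs(
    [
        { objname => 'chr1', is_gap => 0, ident => 'a' },
        { objname => 'chr1', is_gap => 0, ident => 'b' },
        { objname => 'chr1', is_gap => 1, gap_type => 'clone' },
        { objname => 'chr1', is_gap => 0, ident => 'c' },
    ]
);
print scalar(@contigs), " contigs\n";
```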
AUTHOR(S)
Robert Buels
Sheena Scroggins