NAME

Proch::N50 - Calculate N50 from a FASTA or FASTQ file without dependencies

VERSION

version 0.023

SYNOPSIS

use Proch::N50 qw(getStats getN50);
use Data::Dumper;   # needed for the Dump() call below
my $filepath = '/path/to/assembly.fasta';

# Get N50 only: getN50(file) will return an integer
print "N50 only:\t", getN50($filepath), "\n";

# Full stats
my $seq_stats = getStats($filepath);
print Data::Dumper->Dump( [ $seq_stats ], [ qw(*FASTA_stats) ] );
# Will print:
# %FASTA_stats = (
#               'N50' => 65,
#               'dirname' => 'data',
#               'size' => 130,
#               'seqs' => 6,
#               'filename' => 'small_test.fa',
#               'status' => 1
#             );

# Also get a JSON string (pass 'JSON' as second argument)
my $seq_stats_with_JSON = getStats($filepath, 'JSON');
print $seq_stats_with_JSON->{json}, "\n";
# Will print:
# {
#    "seqs" : 6,
#    "status" : 1,
#    "filename" : "small_test.fa",
#    "N50" : "65",
#    "dirname" : "data",
#    "size" : 130
# }

DESCRIPTION

Proch::N50 is a small module that calculates the N50, the total size, and the total number of sequences of a FASTA or FASTQ file. It has no required dependencies.

METHODS

getN50(filepath)

Returns the N50 of the given FASTA/FASTQ file, or 0 on error.
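
As an illustration of the value being returned, here is a minimal sketch of the usual N50 definition: the length at which the running sum of sequence lengths, sorted longest first, reaches half of the total. The helper name n50_from_lengths and the example lengths are hypothetical; this is not taken from the module's source and its internals may differ.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Proch::N50): N50 from a list of lengths.
sub n50_from_lengths {
    my @lengths = sort { $b <=> $a } @_;    # longest first

    my $total = 0;
    $total += $_ for @lengths;
    return 0 unless $total > 0;             # assumption: 0 signals "no data"

    my $running = 0;
    for my $len (@lengths) {
        $running += $len;
        # First length at which the running sum covers half of the total
        return $len if $running * 2 >= $total;
    }
    return 0;
}

# Made-up lengths: total is 54, and 10 + 9 + 8 = 27 reaches half of it
print n50_from_lengths(2, 3, 4, 5, 6, 7, 8, 9, 10), "\n";   # prints 8
```

Comparing $running * 2 against $total keeps the arithmetic in integers, which avoids rounding questions when the total is odd.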

getStats(filepath, alsoJSON)

Calculates the N50 and basic statistics for <filepath>. If a second parameter is given, the result also includes a JSON string. This function returns a hash reference with the following keys:

size (int)

total number of bp in the file

N50 (int)

the actual N50

seqs (int)

total number of sequences in the file

filename (string)

basename of the input file

dirname (string)

name of the directory containing the input file

json (string)

pretty-printed JSON string of the statistics (only if a second parameter is given and the JSON module is installed)
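
To make the meaning of the seqs, size, filename and dirname fields concrete, the following sketch derives them for a tiny FASTA record set held in memory. The helper count_fasta and the sample data are hypothetical, and only plain FASTA is handled; the module's own parser is not shown here and may work differently.

```perl
use strict;
use warnings;
use File::Basename qw(basename dirname);

# Hypothetical helper (not the module's code): count records and bases
# in FASTA text read from an open filehandle.
sub count_fasta {
    my ($fh) = @_;
    my ($seqs, $size) = (0, 0);
    while (my $line = <$fh>) {
        $line =~ s/\s+\z//;                 # strip newline and trailing spaces
        next if $line eq '';
        if ($line =~ /^>/) {
            $seqs++;                        # a header line starts a new record
        } else {
            $size += length $line;          # sequence lines add to the total bp
        }
    }
    return ($seqs, $size);
}

# Made-up input: two records, 6 + 3 = 9 bases
my $fasta = ">a\nACGT\nAC\n>b\nGGG\n";
open my $fh, '<', \$fasta or die "Cannot open in-memory file: $!";
my ($seqs, $size) = count_fasta($fh);
close $fh;

# filename and dirname split a path the way File::Basename does
my $path     = 'data/small_test.fa';
my $filename = basename($path);             # 'small_test.fa'
my $dirname  = dirname($path);              # 'data'

print "seqs=$seqs size=$size filename=$filename dirname=$dirname\n";
```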

Dependencies

JSON (optional)
Term::ANSIColor (optional; for a demo script)
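
An optional dependency such as JSON is commonly handled in Perl by attempting to load it at run time and degrading gracefully when it is missing. The sketch below shows that pattern under the assumption that a missing module should simply yield no JSON output; the helper stats_as_json is hypothetical and is not claimed to be how Proch::N50 does it.

```perl
use strict;
use warnings;

# Try to load JSON at run time; eval traps the failure when it is absent.
my $has_json = eval { require JSON; 1 } ? 1 : 0;

# Hypothetical helper: pretty-printed JSON for a stats hash reference,
# or undef when the JSON module is not installed.
sub stats_as_json {
    my ($stats) = @_;
    return unless $has_json;
    # canonical() sorts the keys so the output is stable between runs
    return JSON->new->pretty->canonical->encode($stats);
}

my $json = stats_as_json({ N50 => 65, seqs => 6, size => 130 });
print defined $json ? $json : "JSON module not available\n";
```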

AUTHOR

Andrea Telatin <andrea@telatin.com>, Quadram Institute Bioscience

COPYRIGHT AND LICENSE

This is free software under the MIT licence. No warranty, explicit or implicit, is provided.

AUTHOR

Andrea Telatin <andrea.telatin@quadram.ac.uk>

COPYRIGHT AND LICENSE

This software is Copyright (c) 2019 by Andrea Telatin.

This is free software, licensed under:

The MIT (X11) License