SYNOPSIS

n50.pl [options] [FILE1 FILE2 FILE3...]

DESCRIPTION

This program parses a list of FASTA/FASTQ files and calculates, for each one, the number of sequences, the total sequence length, and the N50, N75, N90 and auN* metrics. Results can be printed in several formats: by default, only the N50 is printed for a single file, while all metrics are printed in TSV format for multiple files.

*: See https://lh3.github.io/2020/04/08/a-new-metric-on-assembly-contiguity
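
For readers unfamiliar with these metrics, the following is a minimal Python sketch of how they are commonly defined. It is not taken from n50.pl itself. The function names (nx, aun, summarize) and the dictionary keys are hypothetical, and the program's handling of ties or empty input may differ.

```python
def nx(lengths, fraction):
    """Return the length L such that sequences of length >= L
    together cover at least `fraction` of all bases (0.5 gives N50)."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= total * fraction:
            return length
    return 0  # empty input


def aun(lengths):
    """Area under the Nx curve: sum of squared lengths over total length."""
    total = sum(lengths)
    return sum(l * l for l in lengths) / total if total else 0.0


def summarize(lengths):
    """Collect the metrics listed in the description (key names are illustrative)."""
    return {
        "seqs": len(lengths),
        "size": sum(lengths),
        "N50": nx(lengths, 0.50),
        "N75": nx(lengths, 0.75),
        "N90": nx(lengths, 0.90),
        "auN": aun(lengths),
    }


if __name__ == "__main__":
    # Toy contig lengths, for illustration only.
    print(summarize([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

With the toy lengths above the total is 54 bases. The three longest sequences (10, 9, 8) reach half of that, so the N50 is 8.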

PARAMETERS

Output formats

These are the values for --format.

EXAMPLE USAGES

Screen friendly table (-x is a shortcut for --format screen), sorted by N50 descending (default):

n50.pl -x files/*.fa

Screen friendly table, sorted by maximum contig length (--sortby max) ascending (--reverse):

n50.pl -x -o max -r files/*.fa

Tabular (TSV) output is the default for multiple files:

n50.pl -o max -r files/*.fa

A custom output format (-f custom), with the template string given by -t:

n50.pl data/*.fa -f custom -t '{path}{tab}N50={N50};Sum={size}{new}'
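
To illustrate what such a template produces, here is a small Python sketch of placeholder expansion. It is an assumption about the behaviour, not the program's actual code: it assumes that {tab} and {new} stand for a tab and a newline, and that the other placeholders are replaced by the per-file values. The function name render and the sample values are hypothetical.

```python
import re


def render(template, values):
    """Expand {name} placeholders in a single pass.
    {tab} and {new} are assumed to mean tab and newline."""
    fields = {"tab": "\t", "new": "\n"}
    fields.update({key: str(val) for key, val in values.items()})
    # Unknown placeholders are left untouched rather than removed.
    return re.sub(
        r"\{(\w+)\}",
        lambda m: fields.get(m.group(1), m.group(0)),
        template,
    )


if __name__ == "__main__":
    # Made-up values for one file, for illustration only.
    stats = {"path": "data/a.fa", "N50": 8, "size": 54}
    print(render("{path}{tab}N50={N50};Sum={size}{new}", stats), end="")
```

Under these assumptions each input file yields one line of the form path, a tab, then N50=...;Sum=...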

CITING

Telatin A, Fariselli P, Birolo G. SeqFu: A Suite of Utilities for the Robust and Reproducible Manipulation of Sequence Files. Bioengineering 2021, 8, 59. https://doi.org/10.3390/bioengineering8050059

CONTRIBUTING, BUGS

The repository of this project is available at https://github.com/telatin/proch-n50/.