NAME

fu-len - Filter and manipulate FASTA/FASTQ files based on sequence length

SYNOPSIS

fu-len [options] FILE1 [FILE2 ...]

DESCRIPTION

fu-len filters sequences from FASTA/FASTQ files by length, and can also reformat sequences and manipulate their names. It reads both FASTA and FASTQ files, including gzipped files, and reads from standard input when '-' is given as the filename.

OPTIONS

Input/Output Control

Sequence Naming

Sequence Annotation

Other Options

EXAMPLES

Filter sequences by length:

# Keep sequences between 100 and 1000 bp
fu-len -m 100 -x 1000 input.fa > filtered.fa
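
The idea behind this kind of filter can be sketched in a few lines of Python. This is only an illustration, not fu-len's implementation: the function names read_fasta and filter_by_length are hypothetical, and treating both bounds as inclusive is an assumption rather than documented behaviour.

```python
from typing import Iterable, Iterator, Optional, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)  # sequence may span several lines
    if name is not None:
        yield name, "".join(chunks)


def filter_by_length(
    records: Iterable[Tuple[str, str]],
    min_len: int = 0,
    max_len: Optional[int] = None,
) -> Iterator[Tuple[str, str]]:
    """Keep records with min_len <= length <= max_len.

    Inclusive bounds are an assumption made for this sketch.
    """
    for name, seq in records:
        if len(seq) < min_len:
            continue
        if max_len is not None and len(seq) > max_len:
            continue
        yield name, seq


demo = [">a", "ACGT", ">b", "ACGTACGTAC", "GT", ">c", "A" * 20]
kept = list(filter_by_length(read_fasta(demo), min_len=5, max_len=15))
```

In this sketch only record b (12 bases, split over two lines) survives a 5 to 15 window; a is too short and c is too long.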

Convert FASTQ to wrapped FASTA:

# Convert to FASTA and wrap to 60 characters per line
fu-len -f -w 60 input.fastq > output.fa
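
As a rough illustration of what such a conversion involves, the following Python sketch drops the quality lines and wraps each sequence at a fixed width. It is not taken from fu-len: the function name fastq_to_fasta is hypothetical, and the sketch assumes strict four-line FASTQ records (no multi-line sequences).

```python
from typing import Iterable, List


def fastq_to_fasta(lines: Iterable[str], width: int = 60) -> List[str]:
    """Convert four-line FASTQ records to FASTA lines wrapped at `width`.

    Assumes each record is exactly four lines: header, sequence,
    separator, quality. This is a simplification for the sketch.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows = [line.rstrip("\n") for line in lines]
    out: List[str] = []
    for i in range(0, len(rows) - 3, 4):
        header, seq = rows[i], rows[i + 1]
        if not header.startswith("@"):
            raise ValueError(f"record at line {i + 1} does not start with '@'")
        out.append(">" + header[1:])
        # Emit the sequence in chunks of at most `width` characters.
        out.extend(seq[j:j + width] for j in range(0, len(seq), width))
    return out


record = ["@r1 desc", "A" * 130, "+", "I" * 130]
fasta = fastq_to_fasta(record, width=60)
```

Here a 130-base read becomes one header line followed by lines of 60, 60 and 10 bases.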

Number sequences with custom prefix:

# Add sequential numbers and length information
fu-len -n num -p 'seq' -l input.fa > numbered.fa

Process multiple files:

# Keep sequences of at least 500 bp from both files and force FASTA output
fu-len -m 500 -f file1.fq file2.fa > combined.fa

NOTES

When processing multiple files, be aware that:

MODERN ALTERNATIVE

This suite of tools has been superseded by SeqFu, a compiled program providing faster and safer tools for sequence analysis. The suite is still maintained because Perl scripts can be more portable in some environments.

SeqFu is available at https://github.com/telatin/seqfu2 and can be installed with BioConda:

conda install -c bioconda seqfu

CITING

Telatin A, Fariselli P, Birolo G. SeqFu: A Suite of Utilities for the Robust and Reproducible Manipulation of Sequence Files. Bioengineering 2021, 8, 59. https://doi.org/10.3390/bioengineering8050059