
A straightforward and complete next-generation sequencing read simulator

Sandy is a bioinformatics tool that provides a simple engine to simulate next-generation sequencing (NGS) reads for genomic and transcriptomic pipelines. Simulated data serve as an experimental control for comparison against hypothetical models, a key step in optimizing NGS analyses. Sandy is straightforward, easy to use, fast and highly customizable, and it generates reads from nothing more than a FASTA file. It can simulate single-end and paired-end reads from both DNA and RNA sequencing as if produced by the most widely used second- and third-generation platforms. The tool also maintains a built-in database of predefined models extracted from real data: sequencer quality profiles (e.g. Illumina HiSeq, MiSeq, NextSeq), expression matrices generated from GTEx v8 data for 54 human tissues, and genomic variations such as SNVs and indels from 1KGP and gene fusions from COSMIC.

For full documentation, please visit https://galantelab.github.io/sandy/.

Features

Installation

There are two recommended ways to obtain Sandy: pulling the official Docker image or installing through CPAN.

Docker

Assuming that Docker is already installed on your machine, simply run the command:

$ docker pull galantelab/sandy

For more details, see the docker/README.md file.

CPAN

Prerequisites

Along with Perl, you must have the zlib, gcc, make and cpanm packages installed.
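As a quick sanity check before installing, the sketch below reports which of the command-line prerequisites are not on your PATH. It is not part of Sandy: `missing_tools` is a hypothetical helper written for this illustration, and it cannot detect zlib, which is a library rather than a command.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Sandy): print the name of every
# command given as an argument that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        # `command -v` is the POSIX way to test whether a command exists.
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the tools named in the prerequisites. zlib is a library, not a
# command, so it is not covered here.
missing=$(missing_tools perl gcc make cpanm)
if [ -n "$missing" ]; then
    echo "Missing prerequisites:" $missing
else
    echo "All command-line prerequisites found."
fi
```

Anything listed as missing should be installed with your system's package manager before running cpanm.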

Installing with cpanm

Install Sandy with the following command:

% cpanm App::Sandy

If you are concerned about installation speed, you can skip the tests with the --notest flag:

% cpanm --notest App::Sandy

For more details, see the INSTALL file.

Acknowledgments

| Institution | Site |
| :-- | :-: |
| Coordination for the Improvement of Higher Level Personnel | CAPES |
| The São Paulo Research Foundation | FAPESP |
| Teaching and Research Institute from Sírio-Libanês Hospital | Galantelab |

License

This is free software, licensed under:

The GNU General Public License, Version 3, June 2007