NAME
SimpleFlow - easy, simple workflow manager (and logger); for keeping track of and debugging large and complex shell command workflows
SYNOPSIS
This is similar to snakeMake or NextFlow, but running in Perl. The simplest use case is
my $t = task({
cmd => 'which ls'
});
All tasks return a hash, showing at a minimum 1) exit code, 2) the directory that the job was done in, 3) stderr, and 4) stdout.
the only required key/argument is `cmd`, but other arguments are possible:
die # die if not successful; 'true' or 'false'
input.files # check for input files before running; SCALAR or ARRAY
log.fh # print to filehandle
note # a note for the log
overwrite # overwrite previously existing files: "true" or "false"
output.files # product files that need to be checked; SCALAR or ARRAY
You may wish to output results to a logfile using a previously opened filehandle thus:
my ($fh, $fname) = tempfile( UNLINK => 0, DIR => '/tmp');
my $t = task({
cmd => 'which ln',
'log.fh' => $fh,
'note' => 'testing where ln comes from',
'output.files' => $fname,
overwrite => 1
});
close $fh;
Examples
Consider a very complex pipeline in which mistakes are *very* easily made, and there are numerous files to keep track of. SimpleFlow is designed to simplify these steps with a script, with automated checks at every step, in a very intuitive way:
my $g_tpr = "3md.$group.tpr";
task({
cmd => "echo $val | gmx convert-tpr -s 3md.tpr -o $g_tpr -n cpx.ndx",
'input.files' => ['3md.tpr', 'cpx.ndx'],
'log.fh' => $log,
'output.files'=> $g_tpr, # only do this once
overwrite => 'true'
});
my $subset_xtc = "3md.$group.$n.xtc";
task({
cmd => "echo $val | gmx trjconv -s $g_tpr -f 3md_out$n.xtc -o $subset_xtc -n cpx.ndx",
'input.files' => ["3md_out$n.xtc", $g_tpr],
'log.fh' => $log,
'output.files' => $subset_xtc,
overwrite => 'true'
});
my $gro = "3md.$group.$n.gro";
task({
'input.files' => [$g_tpr, $subset_xtc],
'log.fh' => $log,
cmd => "echo $val | gmx trjconv -s $g_tpr -f $subset_xtc -o $gro",
'output.files' => $gro,
overwrite => 'true'
});
mkdir "xvg/$group" unless -d "xvg/$group";
my $dir = "xvg/$group/" . sprintf '%u', $n;
mkdir $dir unless -d $dir;
task({
cmd => "gmx chi -s $g_tpr -f $subset_xtc -phi -psi -all",
'log.fh' => $log,
'input.files' => [$gro, $subset_xtc],
overwrite => 'true'
});
foreach my $f (list_regex_files('\.xvg$')) {
rename $f, "$dir/$f";
say2("Moved $f to $dir/$f", $log);
}
Every `task` returns a hash, which is printed to a log if specified:
{
cmd "gmx chi -s 3md.Receptor.tpr -f 3md.Receptor.09.xtc -phi -psi -all",
die 1,
dir "/home/con/ui/pipelinePepPriML/default/2puy",
done "now",
dry.run 0,
duration 0.0776150226593018,
exit 0,
input.file.size {
3md.Receptor.09.gro 12874711,
3md.Receptor.09.xtc 2208928
},
input.files [
[0] "3md.Receptor.09.gro" (dualvar: 3),
[1] "3md.Receptor.09.xtc" (dualvar: 3)
],
note "",
output.files [],
overwrite "true",
source.file "0.sanity.check.pl",
source.line 73,
stderr " :-) GROMACS - gmx chi, 2025.3 (-:
Executable: /home/con/prog/gromacs-2025.3/build/bin/gmx
Data prefix: /home/con/prog/gromacs-2025.3 (source tree)
Working dir: /home/con/ui/pipelinePepPriML/default/2puy
Command line:
gmx chi -s 3md.Receptor.tpr -f 3md.Receptor.09.xtc -phi -psi -all
Reading file 3md.Receptor.tpr, VERSION 2025.3 (single precision)
Reading file 3md.Receptor.tpr, VERSION 2025.3 (single precision)
Analyzing from residue 1 to residue 61
60 residues with dihedrals found
305 dihedrals found
Reading frame 500 time 500.000
j after resetting (nr. active dihedrals) = 179
Printing psiMET19.x(...skipping 1338 chars...)",
stdout "",
will.do "done"
}