NAME

HPC::Runner::Slurm - Job Submission to Slurm

VERSION

Version 0.01

SYNOPSIS

In-depth documentation is at https://wcmc-q.atlassian.net/wiki/display/HPCSLURM/HPC-Runner-Slurm .

package Main;
use Moose;
extends 'HPC::Runner::Slurm';

Main->new_with_options(infile => '/path/to/commands');

This module is a wrapper around sbatch and can be used to submit arbitrary bash commands to slurm.

It has two levels of management. The first is the main sbatch command, and the second is the actual job, which runs commands in parallel, controlled by HPC::Runner::Threads or HPC::Runner::MCE.

It supports job dependencies. Put in the command 'wait' to tell slurm that some job or jobs depend on the completion of other jobs. Put in the command 'newnode' to tell HPC::Runner::Slurm to submit the job to a new node.

The only required option is --infile.

Submit Script

cmd1
cmd2 && cmd3
cmd4 \
--option cmd4
#Tell HPC::Runner::Slurm to put in some job dependencies.
wait
cmd5
#Tell HPC::Runner::Slurm to pass things off to a new node, but this job doesn't depend on the previous
newnode
cmd6

User Options

User options can be passed to the script on the command line, as in script --opt1, or in a config file. Option handling uses MooseX::SimpleConfig.

configfile

Config file to pass on the command line as --configfile /path/to/file. It should be YAML or XML (XML is untested). This is optional; parameters can also be passed straight on the command line.

example.yml

---
infile: "/path/to/commands/testcommand.in"
outdir: "path/to/testdir"
module:
    - "R2"
    - "shared"

infile

File of commands, separated by newlines.

example.in

cmd1
cmd2 --input --input \
--someotherinput
wait
#Wait tells slurm to make sure previous commands have exited with exit status 0.
cmd3  ##very heavy job
newnode
#cmd3 is a very heavy job so let's start the next job on a new node

module

Modules to load with slurm. Use the same names you would pass to 'module load'.

Example: R2 becomes 'module load R2'

jobname

Specify a job name, and jobs will be jobname_1, jobname_2, jobname_x

afterok

The afterok switch in slurm. --afterok 123 will tell slurm to start this job after job 123 has completed successfully.
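
As a rough illustration of what an afterok dependency looks like on the slurm side, the sketch below builds an sbatch argument list using sbatch's --dependency=afterok:<jobid> flag. The helper name build_sbatch_args and the script name job_002.sh are hypothetical, and this is not necessarily how HPC::Runner::Slurm assembles its own command.

```perl
use strict;
use warnings;

# Hypothetical helper: build the argument list for sbatch.
# $script  - path to a batch script (made up for this example)
# @afterok - IDs of jobs that must complete successfully first
sub build_sbatch_args {
    my ( $script, @afterok ) = @_;
    my @args = ('sbatch');

    # sbatch accepts several job IDs separated by colons
    push @args, '--dependency=afterok:' . join( ':', @afterok ) if @afterok;

    push @args, $script;
    return @args;
}

print join( ' ', build_sbatch_args( 'job_002.sh', 123 ) ), "\n";
```

Returning a list rather than a single string means the arguments can be handed to system() without going through the shell.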

cpus_per_task

Corresponds to slurm's --cpus-per-task option. Defaults to 8, which is probably fine.

commands_per_node

Number of commands to run on each node. --commands_per_node defaults to 8, which is probably fine.

partition

Specify the partition. Defaults to the partition that has the most nodes. Specifying multiple partitions is not yet supported.

nodelist

Defaults to the nodes on the defq queue

submit_to_slurm

Boolean value indicating whether or not to submit to slurm. If you are looking to debug your files or this script, set this to zero. To skip submission, pass --nosubmit_to_slurm on the command line or call $self->submit_to_slurm(0); within your code.

template_file

Path to the template file used to write the slurm batch submission script.

One is generated here for you, but you can always supply your own with --template_file /path/to/template
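
To give a feel for what such a template has to produce, here is a minimal sketch that renders a batch script from a few of the options described in this document. The function render_batch_script and the placeholder command are assumptions made for illustration; the #SBATCH directives are standard sbatch ones, but the template that ships with the module may look quite different.

```perl
use strict;
use warnings;

# Hypothetical renderer; the keys mirror option names from this document.
sub render_batch_script {
    my (%opt) = @_;
    my @lines = (
        '#!/bin/bash',
        "#SBATCH --job-name=$opt{jobname}",
        "#SBATCH --cpus-per-task=$opt{cpus_per_task}",
        "#SBATCH --partition=$opt{partition}",
    );

    # Each entry in 'module' becomes a 'module load' line
    push @lines, "module load $_" for @{ $opt{module} || [] };

    # Placeholder for the command that actually runs the batch
    push @lines, $opt{command};

    return join( "\n", @lines ) . "\n";
}

print render_batch_script(
    jobname       => 'myjob_1',
    cpus_per_task => 8,
    partition     => 'defq',
    module        => [ 'R2', 'shared' ],
    command       => 'echo "commands for this batch go here"',
);
```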

serial

Option to run all jobs serially, one after the other, with no parallelism. The default is to use 4 procs.

user

User running the script. Passed to slurm for mail information.

use_threads

Bool value to indicate whether or not to use threads. The default is to use processes.

If using threads your perl must be compiled to use threads!

use_processes

Bool value to indicate whether or not to use processes. The default is to use processes.

Internal Variables

You should not need to mess with any of these.

template

template object for writing slurm batch submission script

cmd_counter

Keep track of the number of commands. When it exceeds commands_per_node, it restarts so that the next commands are submitted to a new node.

node_counter

Keep track of which node we are on

batch_counter

Keep track of how many batches we have submitted to slurm

node

Node we are running on

cmd

Current command specified by infile

batch

List of commands to submit to slurm

cmdfile

File of commands for mcerunner/parallelrunner. It is cleared at the end of each slurm submission.

slurmfile

File generated from slurm template

jobref

Array of arrays detailing slurm job IDs. Index -1 holds the most recent job submissions, and there will be an index -2 if there are any job dependencies.
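
The shape of such a structure can be sketched as follows. The variable names and job IDs are invented for the example, and the module's real attribute may hold additional detail.

```perl
use strict;
use warnings;

# Each inner array holds the slurm job IDs from one round of submissions.
# The IDs are invented for the example.
my @jobref = (
    [ 1001, 1002 ],    # submitted before a 'wait'
    [1003],            # submitted after it
);

my @latest   = @{ $jobref[-1] };                       # most recent submissions
my @previous = @jobref > 1 ? @{ $jobref[-2] } : ();    # jobs they may depend on

print "latest: @latest\n";
print "depends on: @previous\n";
```

Negative indices count from the end of the array, so the same two lookups keep working however many rounds have been recorded.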

wait

Boolean value indicating whether there are any job dependencies

SUBROUTINES/METHODS

run()

First sub called. Note that calling system module load * does not work within a screen session!

check_files()

Check to make sure the outdir exists. If it doesn't exist, the entire path will be created.

get_nodes

Get the nodes from sinfo if not supplied

If the nodelist is supplied, the partition must be supplied as well.

parse_file_slurm

Parse the file looking for the following conditions:

lines ending in `\`
wait
newnode

Batch commands in groups of $self->cpus_per_task, or smaller as wait and newnode indicate
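
The batching rules above can be sketched in a few lines of Perl. This is an illustrative reimplementation under stated assumptions, not the module's own code: the function name batch_commands, the returned hash layout, and the choice to skip comment lines are all hypothetical.

```perl
use strict;
use warnings;

# Hypothetical parser. Returns a list of batches; each batch is
# { cmds => [...], wait => 0|1 }, where wait is 1 if a 'wait' line
# came directly before the batch was opened.
sub batch_commands {
    my ( $per_batch, @lines ) = @_;
    my ( @batches, @current );
    my ( $partial, $wait ) = ( '', 0 );

    # Close the current batch, if it holds any commands
    my $flush = sub {
        return unless @current;
        push @batches, { cmds => [@current], wait => $wait };
        @current = ();
        $wait    = 0;
    };

    for my $line (@lines) {
        chomp $line;
        next if $line =~ /^\s*(#|$)/;    # skip comments and blank lines

        if ( $line =~ /^\s*wait\s*$/ )    { $flush->(); $wait = 1; next; }
        if ( $line =~ /^\s*newnode\s*$/ ) { $flush->(); next; }

        # A trailing backslash joins this line to the next one
        if ( $line =~ s/\\\s*$// ) { $partial .= $line; next; }

        push @current, $partial . $line;
        $partial = '';
        $flush->() if @current >= $per_batch;
    }
    $flush->();
    return @batches;
}

my @example = split /\n/, <<'END';
cmd1
cmd2 --input --input \
--someotherinput
wait
cmd3
newnode
cmd4
END

my $n = 0;
for my $batch ( batch_commands( 8, @example ) ) {
    printf "batch %d (wait=%d): %s\n", ++$n, $batch->{wait},
        join( ' ; ', @{ $batch->{cmds} } );
}
```

With this input the sketch yields three batches: cmd1 and the joined cmd2 line, then cmd3 flagged as following a wait, then cmd4 on its own.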

work

Get the node (this step may be removed, but we'll try it out)
Process the batch
Submit to slurm
Take care of the counters

process_batch()

Create the slurm submission script from the slurm template. Write out the template, submission job, and infile for the parallel runner.

submit_slurm()

Submit jobs to slurm queue using sbatch.

This subroutine was taken almost entirely from the following PerlMonks discussion. All that I did was add in some logging.

http://www.perlmonks.org/?node_id=151886

You can use the script at the top of that page to test the runner. Just download it, make it executable, and put it in the infile as

perl command.pl 1
perl command.pl 2
#so on and so forth
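
After a submission, the job ID reported by sbatch is what later dependencies refer to. The sketch below shows one way to extract it, assuming sbatch's usual "Submitted batch job <id>" message; the helper name parse_job_id is hypothetical and the module itself may capture and log the output differently.

```perl
use strict;
use warnings;

# Hypothetical helper: pull the job ID out of sbatch's standard output,
# which normally looks like "Submitted batch job 12345".
sub parse_job_id {
    my ($output) = @_;
    return $output =~ /Submitted batch job (\d+)/ ? $1 : undef;
}

my $id = parse_job_id("Submitted batch job 12345\n");
print defined $id ? "job id: $id\n" : "submission failed\n";
```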

AUTHOR

Jillian Rowe, <jillian.e.rowe at gmail.com>

BUGS

Please report any bugs or feature requests to bug-runner-init at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=HPC-Runner-Slurmm. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

perldoc HPC::Runner::Slurm

You can also look for information at:

ACKNOWLEDGEMENTS

This module was originally developed at and for Weill Cornell Medical College in Qatar. With approval from WCMC-Q, this information was generalized and put on github, for which the authors would like to express their gratitude.

LICENSE AND COPYRIGHT

Copyright 2014 Jillian Rowe.

This program is free software; you can redistribute it and/or modify it under the terms of the Artistic License (2.0). You may obtain a copy of the full license at:

http://www.perlfoundation.org/artistic_license_2_0

Any use, modification, and distribution of the Standard or Modified Versions is governed by this Artistic License. By using, modifying or distributing the Package, you accept this license. Do not use, modify, or distribute the Package, if you do not accept this license.

If your Modified Version has been derived from a Modified Version made by someone other than you, you are nevertheless required to ensure that your Modified Version complies with the requirements of this license.

This license does not grant you the right to use any trademark, service mark, tradename, or logo of the Copyright Holder.

This license includes the non-exclusive, worldwide, free-of-charge patent license to make, have made, use, offer to sell, sell, import and otherwise transfer the Package with respect to any patent claims licensable by the Copyright Holder that are necessarily infringed by the Package. If you institute patent litigation (including a cross-claim or counterclaim) against any party alleging that the Package constitutes direct or contributory patent infringement, then this Artistic License to you shall terminate on the date that such litigation is filed.

Disclaimer of Warranty: THE PACKAGE IS PROVIDED BY THE COPYRIGHT HOLDER AND CONTRIBUTORS "AS IS" AND WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES. THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, OR NON-INFRINGEMENT ARE DISCLAIMED TO THE EXTENT PERMITTED BY YOUR LOCAL LAW. UNLESS REQUIRED BY LAW, NO COPYRIGHT HOLDER OR CONTRIBUTOR WILL BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING IN ANY WAY OUT OF THE USE OF THE PACKAGE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.