NAME
HPCD::SGE
VERSION
Version 0.50
SYNOPSIS
use HPCI;
my $group = HPCI->group(cluster => 'SGE', ...);
DESCRIPTION
This is a driver module that can be used internally from HPCI, by specifying a cluster type of 'SGE'. It customizes the HPCI access interface to control jobs submitted to an SGE (Sun Grid Engine) cluster.
The differences provided by the SGE driver are:
- -
-
the following resource names have sge-specific aliases
Using these aliases causes your code to be SGE-specific. That makes converting it to use an alternate type of cluster more painful. However, if you do not expect to ever be moving your code to a different cluster type, this might simplify the interface for people who already know the SGE-specific resource names and wish to avoid the confusion that might come from using the generic names.
- -
-
a stage attribute extra_sge_args_string is available which specifies a string of argument(s) that will be included in the qsub command that starts execution of that stage
- -
-
The name attribute for a stage will be filtered to only allow letters, digits, dash, dot, and underscore. After this filtering, the name must still be unique within the group. (This is a requirement for names submitted to the SGE system.)
- -
-
The result information hash returned by
$stats = $group->execute
includes all of the info provided by the SGE qacct command, not just the exit status. - -
-
If a stage is terminated because it exceeds the requested memory resource limit, then it will normally be retried using a higher limit. The default is to first try with '2G' (if no h_mem or mem resource limit is explicitly specified). After mem failures, the next larger size in the default sequence (2G 4G 8G 16G 32G) is attempted. The detection for exceeding the memory requirement is concoluted, because SGE does not return a clear indiciation that it terminated the job. HPCI considers it to have been a memory overrun if the memory usage exceeds a threshold (default is 99% of requested memory allocation, but the attribute retry_mem_percent can be specified to provide an alternate range), or if one of a few common memory allocation failure requests is seen.
SEE ALSO
- HPCI
-
Describes the generic HPCI iterface - this manual is only providing the exceptions to that document (and its related documents).
- HPCI::Group
-
Describes the interface common to all HPCI Group objects, regardless of the particular type of cluster that is actually being used to run the stages.
- HPCI::Stage
-
Describes the interface common to stage object returned by all HPCI Stage objects, regardless of the particular type of cluster that is actually being used to run the stages.
AUTHOR
Christopher Lalansingh - Boutros Lab
John Macdonald - Boutros Lab
ACKNOWLEDGEMENTS
Paul Boutros, Phd, PI - Boutros Lab
The Ontario Institute for Cancer Research