NAME
PBS::Client - Job submission interface of the PBS (Portable Batch System)
SYNOPSIS
# Load this module
use PBS::Client;
# Create a client object linked to a server
my $client = PBS::Client->new;
# Describe the job
my $job = PBS::Client::Job->new(
%job_options, # e.g. queue => 'queue_1', mem => '800mb'
cmd => \@commands
);
# Optionally, re-organize the commands to a number of queues
$job->pack(numQ => $numQ);
# Submit job
$client->qsub($job);
DESCRIPTION
This module lets you submit jobs to the PBS server in Perl. It is especially useful when you submit a large number of jobs. Inter-dependencies among jobs can also be declared.
SIMPLE USAGE
To submit PBS jobs using PBS::Client, there are basically three steps:
- 1 Create a client object using new(), e.g.,
my $client = PBS::Client->new;
- 2 Create a job object using new() and specify the commands to be submitted using the option cmd, e.g.,
my $job = PBS::Client::Job->new(cmd => \@commands);
- 3 Use the qsub() method of the client object to submit the jobs, e.g.,
$client->qsub($job);
The client object and job object have other methods and options, but most of them are not needed on first use. The only mandatory option is cmd, which tells the client object what needs to be submitted. All other options are optional; if omitted, default values are used.
CLIENT OBJECT METHODS
new()
$pbs = PBS::Client->new(
server => $server # PBS server name (optional)
);
A client object is created by the new method. The name of the PBS server can optionally be supplied. If it is omitted, the default server is assumed.
qsub()
A job (as a job object) is submitted to PBS by the method qsub.
my $pbsid = $pbs->qsub($job_object);
A reference to an array of PBS job IDs is returned.
JOB OBJECT METHODS
new()
$job = PBS::Client::Job->new(
# Job declaration options
wd => $wd, # working directory, default: cwd
name => $name, # job name, default: pbsjob.sh
script => $script, # job script name, default: pbsjob.sh
account => $account, # account string
# Resources options
partition => $partition, # partition
queue => $queue, # queue
begint => $begint, # beginning time
host => $host, # host used to execute
nodes => $nodes, # execution nodes, default: 1
ppn => $ppn, # processes per node
pri => $pri, # priority
nice => $nice, # nice value
mem => $mem, # requested total memory
pmem => $pmem, # requested per-process memory
vmem => $vmem, # requested virtual memory
pvmem => $pvmem, # requested per-process virtual memory
cput => $cput, # requested total CPU time
pcput => $pcput, # requested per-process CPU time
wallt => $wallt, # requested wall time
# IO options
stagein => $stagein, # files staged in
stageout => $stageout, # files staged out
ofile => $ofile, # standard output file
efile => $efile, # standard error file
# Command options
cmd => [@commands], # command to be submitted
prev => {
ok => $job1, # successful job before $job
fail => $job2, # failed job before $job
start => $job3, # started job before $job
end => $job4, # ended job before $job
},
next => {
ok => $job5, # next job after $job succeeded
fail => $job6, # next job after $job failed
start => $job7, # next job after $job started
end => $job8, # next job after $job ended
},
# Job tracer options
tracer => $on, # job tracer, either on / off (default)
tfile => $tfile, # tracer report file
);
Two points may be noted:
- 1 Except cmd, all attributes are optional.
- 2 All attributes can also be modified by methods, e.g.,
$job = PBS::Client::Job->new(cmd => [@commands]);
is equivalent to
$job = PBS::Client::Job->new;
$job->cmd([@commands]);
Job Declaration Options
wd
Full path of the working directory, i.e. the directory where the command(s) are executed. The default value is the current working directory.
name
Job name. It can have at most 15 characters. It cannot contain spaces, and the first character must be alphabetic. If not specified, it follows the script name.
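The naming rules above can be checked before a job object is created. The following is a minimal sketch and not part of PBS::Client: the helper name is_valid_job_name is hypothetical, "alphabetic" is assumed to mean an ASCII letter, and the pattern encodes only the three rules stated here.

```perl
use strict;
use warnings;

# Hypothetical helper (not provided by PBS::Client): returns 1 if $name
# satisfies the documented job-name rules, 0 otherwise.
sub is_valid_job_name {
    my ($name) = @_;
    return 0 unless defined $name;
    # One ASCII letter followed by up to 14 non-whitespace characters,
    # i.e. at most 15 characters in total.
    return $name =~ /\A[A-Za-z]\S{0,14}\z/ ? 1 : 0;
}

for my $name ('testing', '2fast', 'my job', 'a_very_long_job_name') {
    printf "%-22s %s\n", $name, is_valid_job_name($name) ? 'ok' : 'rejected';
}
```

A name that passes this check could then be given as the name option of the job object.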
script
Filename prefix of the job script to be generated. The PBS job ID is appended to the filename as the suffix.
Example: script => test.sh would generate a job script like test.sh.12345 if the job ID is '12345'.
The default value is pbsjob.sh.
account
Account string. This is meaningful if you need to specify which account you are using to submit the job.
Resources Options
partition
Partition name. This is meaningful only for the clusters with partitions. If it is omitted, default value will be assumed.
queue
Queue to which the jobs are submitted. If omitted, the default queue is used.
begint (Experimental)
The date-time at which the job begins to queue. The format is either "[[[[CC]YY]MM]DD]hhmm[.SS]" or "[[[[CC]YY-]MM-]DD] hh:mm[:SS]".
This feature is in Experimental phase. It may not be supported in later versions.
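A fully specified value in the first format can be produced with strftime from the core POSIX module. This is a sketch only: the helper name begint_stamp is hypothetical, and it assumes that the time should be expressed in the local time zone of the submitting machine.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Hypothetical helper (not provided by PBS::Client): format an epoch time
# as CCYYMMDDhhmm.SS, the fully specified form of the first begint format.
sub begint_stamp {
    my ($epoch) = @_;
    # Assumption: local time is what the PBS server expects.
    return strftime('%Y%m%d%H%M.%S', localtime $epoch);
}

# A start time one hour from now.
print begint_stamp(time + 3600), "\n";
```

The resulting string could be passed as the begint option.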
host
You can specify the host on which the job will be run.
nodes
Nodes used. It can be an integer (declaring the number of nodes used), a string (declaring which nodes are used), an array reference (declaring which nodes are used), or a hash reference (declaring which nodes are used and how many processes run on each node).
Examples:
Integer
nodes => 3
means that three nodes are used.
String / array reference
# string representation
nodes => "node01 + node02"
# array representation
nodes => ["node01", "node02"]
means that nodes "node01" and "node02" are used.
Hash reference
nodes => {node01 => 2, node02 => 1}
means that "node01" is used with 2 processes, and "node02" with 1 process.
ppn
Maximum number of processes per node. The default value is 1.
pri
Priority of the job in queueing. The higher the priority, the shorter the queueing time. Priority must be an integer between -1024 and +1023 inclusive. The default value is 0.
nice
Nice value of the job during execution. It must be an integer between -20 (highest priority) and 19 (lowest). The default value is 10.
mem
Maximum physical memory used by all processes. Unit can be b (bytes), w (words), kb, kw, mb, mw, gb or gw. If it is omitted, the default value will be used. Please see also pmem, vmem and pvmem.
pmem
Maximum per-process physical memory. Unit can be b (bytes), w (words), kb, kw, mb, mw, gb or gw. If it is omitted, the default value will be used. Please see also mem, vmem and pvmem.
vmem
Maximum virtual memory used by all processes. Unit can be b (bytes), w (words), kb, kw, mb, mw, gb or gw. If it is omitted, the default value will be used. Please see also mem, pmem and pvmem.
pvmem
Maximum per-process virtual memory. Unit can be b (bytes), w (words), kb, kw, mb, mw, gb or gw. If it is omitted, the default value will be used. Please see also mem, pmem and vmem.
cput
Maximum amount of total CPU time used by all processes. Values are specified in the form [[hours:]minutes:]seconds[.milliseconds]. Please see also pcput.
pcput
Maximum amount of per-process CPU time. Values are specified in the form [[hours:]minutes:]seconds[.milliseconds]. Please see also cput.
wallt
Maximum amount of wall time used. Values are specified in the form [[hours:]minutes:]seconds[.milliseconds].
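When a limit is computed rather than typed by hand, it can be converted into the hours:minutes:seconds form used by cput, pcput and wallt. The following is a small sketch; the helper name pbs_time is hypothetical and not part of PBS::Client, and it handles whole seconds only.

```perl
use strict;
use warnings;

# Hypothetical helper (not provided by PBS::Client): convert a number of
# whole seconds into the hours:minutes:seconds form.
sub pbs_time {
    my ($seconds) = @_;
    my $h = int($seconds / 3600);
    my $m = int(($seconds % 3600) / 60);
    my $s = $seconds % 60;
    return sprintf '%02d:%02d:%02d', $h, $m, $s;
}

print pbs_time(5 * 3600), "\n";    # 05:00:00
print pbs_time(90), "\n";          # 00:01:30
```

For example, wallt => pbs_time(5 * 3600) would request five hours of wall time.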
IO Options
stagein
Specify which files need to be staged (copied) in before the job starts. It may be a string, array reference or hash reference. For example, to stage in from01.file and from02.file from the remote host "fromMachine" and rename them to01.file and to02.file respectively, the following three representations are equivalent:
String
stagein => "to01.file@fromMachine:from01.file,". "to02.file@fromMachine:from02.file"
Array
stagein => ['to01.file@fromMachine:from01.file', 'to02.file@fromMachine:from02.file']
Hash
stagein => {'fromMachine:from01.file' => 'to01.file', 'fromMachine:from02.file' => 'to02.file'}
stageout
Specify which files need to be staged (copied) out after the job finishes. As with stagein, it may be a string, array reference or hash reference.
Examples:
String
stageout => "from01.file@toMachine:to01.file,". "from02.file@toMachine:to02.file"
Array
stageout => ['from01.file@toMachine:to01.file', 'from02.file@toMachine:to02.file']
Hash
stageout => {'from01.file' => 'toMachine:to01.file', 'from02.file' => 'toMachine:to02.file'}
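The relation between the hash and string representations can be made explicit in plain Perl. This sketch is an illustration of the equivalences listed above and not the module's own conversion code; the helper names stagein_string and stageout_string are hypothetical, and the keys are sorted only to make the output order predictable.

```perl
use strict;
use warnings;

# Hypothetical helpers (not provided by PBS::Client).

# stagein hash: remote source => local name, joined as "local@remote".
sub stagein_string {
    my ($files) = @_;
    return join ',', map { $files->{$_} . '@' . $_ } sort keys %$files;
}

# stageout hash: local name => remote destination, joined as "local@remote".
sub stageout_string {
    my ($files) = @_;
    return join ',', map { $_ . '@' . $files->{$_} } sort keys %$files;
}

print stagein_string({
    'fromMachine:from01.file' => 'to01.file',
    'fromMachine:from02.file' => 'to02.file',
}), "\n";
print stageout_string({
    'from01.file' => 'toMachine:to01.file',
    'from02.file' => 'toMachine:to02.file',
}), "\n";
```

In both directions the local file name comes before the @ sign and the remote host and path come after it.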
ofile
Path of the file for standard output. The default filename is like jobName.o12345 if the job name is 'jobName' and its ID is '12345'. Please see also efile.
efile
Path of the file for standard error. The default filename is like jobName.e12345 if the job name is 'jobName' and its ID is '12345'. Please see also ofile.
Command Options
cmd
Command(s) to be submitted. It can be an array reference (2D or 1D) or a string. For a 2D array reference, each row is a separate job in PBS, while the elements of the same row are commands which are executed one by one in the same job. For a 1D array reference, each element is a command which is submitted separately to PBS. If it is a string, the string is taken as the single command to be executed.
Examples:
2D array reference
cmd => [["./a1.out"], ["./a2.out" , "./a3.out"]]
means that a1.out would be executed as one PBS job, while a2.out and a3.out would be executed one by one in another job.
1D array reference
cmd => ["./a1.out", "./a2.out"]
means that a1.out would be executed as one PBS job and a2.out would be another. Therefore, this is equivalent to
cmd => [["./a1.out"], ["./a2.out"]]
String
cmd => "./a.out"
means that the command a.out would be executed. Equivalently, it can be
cmd => [["./a.out"]] # as a 2D array
# or
cmd => ["./a.out"] # as a 1D array
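The three forms of cmd can all be expressed as a 2D array reference, which makes the job boundaries easy to see. The sketch below only restates the rules documented in this section in plain Perl; cmd_as_2d is a hypothetical helper, not a function of PBS::Client, and it says nothing about how the module stores commands internally.

```perl
use strict;
use warnings;

# Hypothetical helper (not provided by PBS::Client): express any documented
# form of cmd as a 2D array reference (one row per PBS job).
sub cmd_as_2d {
    my ($cmd) = @_;
    # A plain string is a single job with a single command.
    return [[$cmd]] unless ref($cmd);
    # A row that is already an array reference is kept as one job;
    # a plain element of a 1D array becomes a job of its own.
    return [ map { ref($_) ? [@$_] : [$_] } @$cmd ];
}

for my $cmd ('./a.out',
             ['./a1.out', './a2.out'],
             [['./a1.out'], ['./a2.out', './a3.out']]) {
    my $jobs = cmd_as_2d($cmd);
    print scalar(@$jobs), " job(s): ",
        join(' | ', map { join('; ', @$_) } @$jobs), "\n";
}
```

In the printed lines, a vertical bar separates PBS jobs and a semicolon separates commands that run one by one within the same job.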
prev
Hash reference which declares the job(s) executed beforehand. The hash can have four possible keys: start, end, ok and fail. start declares job(s) which must have started execution. end declares job(s) which must have ended. ok declares job(s) which must have finished successfully. fail declares job(s) which must have failed. Please see also next.
Example: $job1->prev({ok => $job2, fail => $job3}) means that $job1 is executed only after $job2 exits normally and $job3 exits with error.
next
Hash reference which declares the job(s) executed later. The hash can have four possible keys: start, end, ok and fail. start declares job(s) to be executed after this job has started execution. end declares job(s) to be executed after this job has ended. ok declares job(s) to be executed after this job has finished successfully. fail declares job(s) to be executed after this job has failed. Please see also prev.
Example: $job1->next({ok => $job2, fail => $job3}) means that $job2 would be executed after $job1 exits normally, and otherwise $job3 would be executed instead.
Job Tracer Options
tracer (Experimental)
Trace when and where the job was executed. It takes the value of either on or off (default). If it is turned on, an extra tracer report file is generated. It records when the job started, where it ran, when it finished and how long it took.
This feature is in Experimental phase. It may not be supported in later versions.
tfile (Experimental)
Path of the tracer report file. The default filename is like jobName.t12345 if the job name is 'jobName' and its ID is '12345'. Please see also ofile and efile.
This feature is in Experimental phase. It may not be supported in later versions.
pbsid
Return the PBS job ID(s) of the job(s). The ID(s) are available only after the job(s) have been submitted to PBS. The returned value is an integer if cmd is a string. If cmd is an array reference, a reference to an array of IDs is returned. For example,
$pbsid = $job->pbsid;
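Because the return value is either a single ID or an array reference, code that handles both cases may want to normalize it to a plain list first. A minimal sketch follows; id_list is a hypothetical helper and not part of PBS::Client, and the IDs in the usage lines are made-up sample values.

```perl
use strict;
use warnings;

# Hypothetical helper (not provided by PBS::Client): turn the value
# returned by pbsid into a plain list of IDs.
sub id_list {
    my ($pbsid) = @_;
    return () unless defined $pbsid;                      # not submitted yet
    return ref($pbsid) eq 'ARRAY' ? @$pbsid : ($pbsid);   # array ref or single ID
}

# Made-up sample values standing in for real return values.
print "PBS job $_\n" for id_list(12345);
print "PBS job $_\n" for id_list([12346, 12347]);
```

With a real job object the call would be id_list($job->pbsid) after the job has been submitted.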
pack()
pack is used to rearrange the commands among different queues (PBS jobs). Two options, numQ and cpq, can be set. numQ specifies the number of jobs among which the commands will be distributed. For example,
$job->pack(numQ => 8);
distributes the commands among 8 jobs. On the other hand, the cpq (abbreviation of commands per queue) option rearranges the commands such that each job has the specified number of commands. For example,
$job->pack(cpq => 8);
packs the commands such that each job has 8 commands, until no commands are left.
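The effect of cpq can be pictured as cutting the flat list of commands into groups of a fixed size, where the last group may be shorter. The following plain-Perl sketch illustrates that grouping only. It is not the code of PBS::Client, the helper name group_by_cpq is hypothetical, and it assumes that commands keep their original order, which the module may or may not do.

```perl
use strict;
use warnings;

# Illustration only, not PBS::Client code: cut a list of commands into
# groups of at most $cpq commands each (the last group may be shorter).
sub group_by_cpq {
    my ($cpq, @commands) = @_;
    die "cpq must be a positive integer\n" unless $cpq =~ /\A[1-9]\d*\z/;
    my @groups;
    # splice removes up to $cpq commands from the front on each pass.
    push @groups, [ splice(@commands, 0, $cpq) ] while @commands;
    return \@groups;
}

my @commands;
push @commands, "./a$_.out" for 1 .. 10;

my $groups = group_by_cpq(4, @commands);
print scalar(@$groups), " jobs with sizes: ",
    join(', ', map { scalar(@$_) } @$groups), "\n";
```

With ten commands and a group size of four, this yields two full groups and one final group holding the remaining two commands.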
copy()
Job objects can be copied by the copy method:
my $new_job = $old_job->copy;
The new job object ($new_job) is identical to, but independent of, the original job object ($old_job).
copy can also specify the number of copies to be generated. For example,
my @copies = $old_job->copy(3);
makes three identical copies.
Hence, the following two statements are the same:
my $new_job = $old_job->copy;
my ($new_job) = $old_job->copy(1);
SCENARIOS
- 1. Submit a Single Command
You want to run a.out of the current working directory in the 'delta' queue.
use PBS::Client;
my $pbs = PBS::Client->new;
my $job = PBS::Client::Job->new(
    queue => 'delta',
    cmd => './a.out',
);
$pbs->qsub($job);
- 2. Submit a List of Commands
You need to submit a list of commands to PBS. They are stored in the Perl array @jobs. You want to execute them one by one on a single CPU.
use PBS::Client;
my $pbs = PBS::Client->new;
my $job = PBS::Client::Job->new(
    cmd => [\@jobs],
);
$pbs->qsub($job);
- 3. Submit Multiple Lists
You have 3 groups of commands, stored in @jobs_a, @jobs_b, @jobs_c. You want to execute each group on a different CPU.
use PBS::Client;
my $pbs = PBS::Client->new;
my $job = PBS::Client::Job->new(
    cmd => [
        \@jobs_a,
        \@jobs_b,
        \@jobs_c,
    ],
);
$pbs->qsub($job);
- 4. Rearrange Commands (Specifying Number of Queues)
You have 3 groups of commands, stored in @jobs_a, @jobs_b, @jobs_c. You want to re-organize them into 4 groups.
use PBS::Client;
my $pbs = PBS::Client->new;
my $job = PBS::Client::Job->new(
    cmd => [
        \@jobs_a,
        \@jobs_b,
        \@jobs_c,
    ],
);
$job->pack(numQ => 4);
$pbs->qsub($job);
- 5. Rearrange Commands (Specifying Commands Per Queue)
You have 3 groups of commands, stored in @jobs_a, @jobs_b, @jobs_c. You want to re-organize them such that each group has 4 commands.
use PBS::Client;
my $pbs = PBS::Client->new;
my $job = PBS::Client::Job->new(
    cmd => [
        \@jobs_a,
        \@jobs_b,
        \@jobs_c,
    ],
);
$job->pack(cpq => 4);
$pbs->qsub($job);
- 6. Customize resource
You want to use customized resources rather than the default resource allocation.
use PBS::Client;
my $pbs = PBS::Client->new;
my $job = PBS::Client::Job->new(
    account => 'my_project', # account string
    partition => 'partition01', # partition name
    queue => 'queue01', # PBS queue name
    wd => '/tmp', # working directory
    name => 'testing', # job name
    script => 'test.sh', # name of script generated
    pri => '10', # higher priority
    mem => '800mb', # 800 MB memory
    cput => '10:00:00', # 10 hrs CPU time
    wallt => '05:00:00', # 5 hrs wall time
    cmd => './a.out --debug', # command line
);
$pbs->qsub($job);
- 7. Job dependency
You want to run a1.out. Then run a2.out if a1.out finished successfully; otherwise run a3.out and a4.out.
use PBS::Client;
my $pbs = PBS::Client->new;
my $job1 = PBS::Client::Job->new(cmd => "./a1.out");
my $job2 = PBS::Client::Job->new(cmd => "./a2.out");
my $job3 = PBS::Client::Job->new(cmd => ["./a3.out", "./a4.out"]);
$job1->next({ok => $job2, fail => $job3});
$pbs->qsub($job1);
SCRIPT "RUN"
If you want to execute a single command, you need not write a script. The simplest way is to use the script run in this package. For example,
run "./a.out --debug > a.dat"
would submit a job executing the command "a.out" with the option "--debug", and redirect the output to the file "a.dat".
The options of the job object, such as the resources requested, can be edited by
run -e
A more detailed manual can be viewed by
run -m
REQUIREMENTS
Class::MethodMaker
BUGS
Perhaps many. Please email bug reports and suggestions to kwmak@cpan.org
SEE ALSO
PBS official website http://www.openpbs.com, PBS
AUTHOR(S)
Ka-Wai Mak <kwmak@cpan.org>
COPYRIGHT
Copyright (c) 2006 Ka-Wai Mak. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.