NAME
PDL::Parallel::MPI
Routines to allow PDL objects to be moved around on parallel systems using the MPI library.
SYNOPSIS
use PDL;
use PDL::Parallel::MPI;
mpirun(2);
MPI_Init();
$rank = get_rank();
$a=$rank * ones(2);
print "my rank is $rank and \$a is $a\n";
$a->move( 1 => 0);
print "my rank is $rank and \$a is $a\n";
MPI_Finalize();
MPI STANDARD CALLS
Most of the functions from the MPI standard may be used from this module on regular Perl data. This functionality is inherited from the Parallel::MPI module; read the documentation for Parallel::MPI to see how to use it.
One may mix MPI calls on Perl built-in datatypes and MPI calls on piddles.
use PDL;
use PDL::Parallel::MPI;
mpirun(2);
MPI_Init();
$rank = get_rank();
$pi = 3.1;
if ($rank == 0) {
MPI_Send(\$pi,1,MPI_DOUBLE,1,0,MPI_COMM_WORLD);
} else {
$message = zeroes(1);
$message->receive(0);
print "pi is $message\n";
}
MPI_Finalize();
MPI GENERIC CALLS
MPI_Init
Call this before you pass around any piddles or make any MPI calls.
# usage:
MPI_Init();
MPI_Finalize
Call this before your MPI program exits, or you may get zombies.
# usage:
MPI_Finalize();
get_rank
Returns an integer specifying who this process is. Starts at 0. Optional communicator argument.
# usage
get_rank($comm); # comm is optional and defaults to MPI_COMM_WORLD
comm_size
Returns an integer specifying how many processes there are. Optional communicator argument.
# usage
comm_size($comm); # comm is optional and defaults to MPI_COMM_WORLD
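For example, a minimal sketch combining the two calls:
# example
$rank = get_rank();
$size = comm_size();
print "I am process $rank of $size\n";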
mpirun
Typically one would invoke an MPI program using mpirun, which comes with your MPI installation. This function is simply a dirty hack which makes the script invoke itself using that program.
mpirun($number_of_processes);
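Call it before MPI_Init, as in the SYNOPSIS; a minimal sketch:
# example
mpirun(4); # re-invokes this script under mpirun with 4 processes
MPI_Init();
# ... parallel work ...
MPI_Finalize();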
PDL SPECIFIC MPI CALLS
move
move is a piddle method. It copies a piddle from one process to another. The first argument is the rank of the source process, and the second argument is the rank of the destination process. The piddle should be allocated with the same size and datatype on both processes (this is not checked). The method does nothing if executed on a process which is neither the source nor the destination. => may be used in place of "," for readability.
# usage
$piddle->move($source_processor , $dest_processor);
# example
$a = $rank * ones(4);
$a->move( 0 => 1);
send / receive
You can use send and receive to move piddles around by yourself, although I really recommend using move instead.
$piddle->send($dest,$tag,$comm); # $tag and $comm are optional
$piddle->receive($source,$tag,$comm); # ditto
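For example, a minimal sketch sending a piddle from rank 0 to rank 1 with the default tag and communicator:
# example
$a = $rank * ones(4);
if ($rank == 0) {
$a->send(1);
} elsif ($rank == 1) {
$a->receive(0);
}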
broadcast
Piddle method which copies the value at the root process to all of the other processes in the communicator. The root defaults to 0 if not specified and the communicator to MPI_COMM_WORLD. Piddles should be pre-allocated to be the same size and datatype on all processes.
# usage
$piddle->broadcast($root,$comm); # $root and $comm optional
# example
$a=$rank * ones(4);
$a->broadcast(3);
send_nonblocking / receive_nonblocking
These piddle methods initiate communication and return before that communication is completed. They return a request object which can be checked for completion or waited on. Data at source and dest should be pre-allocated to have the same size and datatype.
# $tag and $comm are optional arguments.
$request = $piddle->send_nonblocking($dest_proc,$tag,$comm);
$request = $piddle->receive_nonblocking($source_proc,$tag,$comm);
...
$request->wait(); # blocks until the communication is completed.
or
$request->test(); # returns true if the communication is completed.
# $request is deallocated after a wait or test returns true.
# this example is similar to how mpi_rotate is implemented.
$r_send = $source->send_nonblocking(($rank+1) % $population);
$r_receive = $dest->receive_nonblocking(($rank-1) % $population);
$r_receive->wait();
$r_send->wait();
get_status / print_status
get_status returns a hashref which contains the status of the last receive. The fields are count, source, tag, and error. print_status simply prints out the status nicely. Note that if there is an error in a receive, an exception will be thrown.
print_status();
print ${get_status()}{count};
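For example, a sketch reading individual status fields after a receive:
# example
$message = zeroes(1);
$message->receive(0);
print_status();
$status = get_status();
print "received $$status{count} element(s) from rank $$status{source}\n";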
mpi_rotate
mpi_rotate is a piddle method which should be executed at the same time on all processes. For each process, it moves the entire piddle to the next process. This movement is (inefficiently) done in place by default, or you can specify a destination.
$piddle->mpi_rotate(
dest => $dest_piddle, # optional
offset => $offset, # optional, defaults to +1
);
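For example, a minimal sketch using the default offset of +1:
# example: each rank passes its piddle to the next rank
$a = $rank * ones(4);
$a->mpi_rotate; # in place: rank 1 now holds rank 0's old values, and so on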
scatter
Takes a piddle and splits its data across all of the processes. This takes an n-dimensional piddle on the root and turns it into an (n-1)-dimensional piddle on each process. It may be called as a piddle method, which is equivalent to simply specifying the 'source' argument. On the root, one must specify the source. On all other processes, one may also pass a 'source' argument to allow scatter to grok the size of the destination piddle to allocate. Alternatively, on the non-root processes one may specify the dest piddle explicitly, or simply specify the dimensions of the destination piddle (see the sketch after the example below).
# usage (all arguments are optional, but see above).
# may be used as a piddle method, which simply sets the
# source argument.
$dest_piddle = scatter(
source => $src_piddle,
dest => $dest_piddle,
dims => $array_ref,
root => $root_proc, # root defaults to 0.
comm => $comm, # defaults to MPI_COMM_WORLD
);
# with 4 processes
$a = sequence(4,4);
$b = $a->scatter;
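On the non-root processes the destination size may instead be given explicitly; a sketch of the dims form described above:
# with 4 processes
if ($rank == 0) {
$b = scatter(source => sequence(4,4));
} else {
$b = scatter(dims => [4]);
}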
gather
gather is the opposite of scatter. Using it as a piddle method simply specifies the source. If called on an n-dimensional piddle on all procs, the root will contain an (n+1)-dimensional piddle on completion.
memory =>
+------------+ +-------------+
^ |a0 | -----> | a0 a1 a2 a3 |
procs | |a1 | gather | |
| |a2 | | |
|a3 | <----- | |
+------------+ scatter +-------------+
# usage
gather(
source => $src_piddle,
dest => $dest_piddle, # only used at root, extrapolated from source if not specified.
root => $root_proc, # defaults to 0
comm => $comm, # defaults to MPI_COMM_WORLD
);
# example. assume nprocs == 4.
$a = ones(4);
$b = $a->gather;
# $b->dims is now (4,4) on proc 0.
allgather
allgather does the same thing as gather except that the result is placed on all processors rather than just the root.
memory =>
+------------+ +-------------+
^ |a0 | | a0 a1 a2 a3 |
procs | |a1 | -----> | a0 a1 a2 a3 |
| |a2 | allgather | a0 a1 a2 a3 |
|a3 | | a0 a1 a2 a3 |
+------------+ +-------------+
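No usage block appears here; assuming allgather takes the same keyword arguments as gather, minus root (an assumption by analogy, not spelled out above):
# usage (also as piddle method; source is set)
$dest_piddle = allgather(
source => $src_piddle,
dest => $dest_piddle, # created for you if not specified
comm => $comm, # defaults to MPI_COMM_WORLD
);
# example. assume nprocs == 4.
$a = $rank * ones(4);
$b = $a->allgather; # $b has dims (4,4) on every process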
alltoall
Every process sends a distinct block of its piddle to each process and receives one block from each in return; across the communicator the effect is a transpose of the data, as the diagram shows.
memory =>
+-------------+ +-------------+
^ | a0 a1 a2 a3 | | a0 b0 c0 d0 |
procs | | b0 b1 b2 b3 | -----> | a1 b1 c1 d1 |
| | c0 c1 c2 c3 | alltoall | a2 b2 c2 d2 |
| d0 d1 d2 d3 | | a3 b3 c3 d3 |
+-------------+ +-------------+
# usage
# calling as piddle method simply sets the source argument.
$dest_piddle = alltoall(
source => $src_piddle,
dest => $dest_piddle, # created for you if not passed.
comm => $comm, # defaults to MPI_COMM_WORLD.
);
# example: assume comm_size is 4.
$a = $rank * sequence(4);
$b = $a->alltoall;
reduce
+-------------+ +----------------------------------+
| a0 a1 a2 a3 | reduce | a0+b0+c0+d0 , a1+b1+c1+d1, .... |
| b0 b1 b2 b3 | -----> | |
| c0 c1 c2 c3 | | |
| d0 d1 d2 d3 | | |
+-------------+ +----------------------------------+
Allowed operations are: + * max min & | ^ and or xor.
# usage (also as piddle method; source is set)
$dest_piddle = reduce(
source => $src_piddle,
dest => $dest_piddle, # significant only at root; created for you if not specified
root => $root, # defaults to 0
op => $op, # defaults to '+'
comm => $comm, # defaults to MPI_COMM_WORLD
);
# example
$a=$rank * (sequence(4)+1);
$b=$a->reduce;
allreduce
Just like reduce except that the result is put on all the processes.
# usage (also as piddle method; source is set)
$dest_piddle = allreduce(
source => $src_piddle,
dest => $dest_piddle, # created for you if not specified
root => $root, # defaults to 0
op => $op, # defaults to '+'
comm => $comm, # defaults to MPI_COMM_WORLD
);
# example
$a=$rank * (sequence(4)+1);
$b=$a->allreduce;
scan
+-------------+ +----------------------------------+
| a0 a1 a2 a3 | scan | a0 , a1 , a2 , a3 |
| b0 b1 b2 b3 | -----> | a0+b0 , a1+b1 , a2+b2, a3+b3 |
| c0 c1 c2 c3 | | |
| d0 d1 d2 d3 | | ... |
+-------------+ +----------------------------------+
Allowed operations are: + * max min & | ^ and or xor.
# usage (also as piddle method; source is set)
$dest_piddle = scan(
source => $src_piddle,
dest => $dest_piddle, # created for you if not specified
root => $root, # defaults to 0
op => $op, # defaults to '+'
comm => $comm, # defaults to MPI_COMM_WORLD
);
# example
$a=$rank * (sequence(4)+1);
$b=$a->scan;
reduce_and_scatter
Does a reduce followed by a scatter. A regular scatter distributes the data evenly over all processes, but with reduce_and_scatter you get to specify the distribution (if you want; defaults to uniform).
# usage (also as piddle method; source is set)
$dest_piddle = reduce_and_scatter(
source => $src_piddle,
dest => $dest_piddle, # created for you if not specified.
recv_count => $recv_count_piddle, # 1D int piddle; element $i gives how many elements go to proc $i.
op => $op, # defaults to '+'
comm => $comm, # defaults to MPI_COMM_WORLD
);
# example, taken from t/20_reduce_and_scatter.t
mpirun(4); MPI_Init(); $rank = get_rank();
$a=$rank * (sequence(4)+1);
$b=$a->reduce_and_scatter;
print "rank = $rank, b=$b\n" ;
WARNINGS
This module is still under development and significant changes are expected. As things are *expected* to break somehow or another, there is no warranty, and the author does not take responsibility for the damage/destruction of your data/computer/sanity.
PLANS
indexing
Currently there is no support for any sort of indexing/dataflow/child piddles. Ideally one would like to say:
$piddle->diagonal(0,1)->move(0 => 1);
But currently one must say:
$tmp = $piddle->diagonal(0,1)->copy;
$tmp->move( 0 => 1);
$piddle->diagonal(0,1) .= $tmp;
I believe the former behavior is possible to implement. I plan to do so once I reach a sufficient degree of enlightenment.
distributed data
One might wish to use one's own personal massively parallel supercomputer interactively with the PDL shell (perldl). This would require master/slave interactions and a distributed data model. Such a project won't be started until after I finish PDL::Parallel::OpenMP.
AUTHOR
Darin McGill darin@ocf.berkeley.edu
If you find this module useful, please let me know. I very much appreciate bug reports, suggestions and other feedback.
ACKNOWLEDGEMENTS
This module is an extension of Parallel::MPI, written by Josh Wilmes and Chris Stevens. Significant portions of code have been copied from Parallel::MPI verbatim, with permission. Josh and Chris did most of the work to make Perl's built-in datatypes (scalars, arrays) work with MPI. I rely heavily on their work for MPI initialization and error handling.
The diagrams in this document were inspired by, and are similar to, diagrams found in MPI: The Complete Reference.
Sections of code from the main PDL distribution (such as header files, and code from PDL::CallExt.xs) were used extensively in development. Many thanks to the PDL developers for their help, and of course, for creating the PDL system in the first place.
SEE ALSO
The PDL::Parallel homepage: http://www.ocf.berkeley.edu/~darin/projects/superperl
The PDL home page: http://pdl.perl.org
The Perl module Parallel::MPI.
The Message Passing Interface: http://www.mpi-forum.org/
PDL::Parallel::OpenMP (under development).
COPYING
This module is free software. It may be modified and/or redistributed under the same terms as perl itself.