NAME
Net::Amazon::HadoopEC2::Cluster - Representation of Hadoop-EC2 cluster
SYNOPSIS
my $hadoop = Net::Amazon::HadoopEC2->new(
{
aws_account_id => 'my account',
aws_access_key_id => 'my key',
aws_secret_access_key => 'my secret',
}
);
my $cluster = $hadoop->launch_cluster(
{
naem => 'hadoop-ec2-cluster',
image_id => 'ami-b0fe1ad9' # hadoop-ec2 official image
slaves => 2,
key_name => 'gsg-keypair',
key_file => "$ENV{HOME}/.ssh/id_rsa-gsg-keypair',
}
);
$cluster->push_file(
{
files => ['map.pl', 'reduce.pl'],
destination => '/root/',
}
);
my $option = join(' ', qw(
-mapper map.pl
-reducer reduce.pl
-file map.pl
-file reduce.pl
)
);
my $result = $cluster->execute(
{
command => "$hadoop jar $streaming $option",
}
);
DESCRIPTION
A class Representing Hadoop-EC2 cluster
METHODS
new
Constructor. Normally Net::Amazon::HadoopEC2 calls this so you won't need to think about this.
launch_cluster ($hashref)
Launches hadoop-ec2 cluster. Returns Net::Amazon::HadoopEC2::Cluster instance itself when succeeded.
- image_id (required)
-
The image id (ami) of the cluster.
- key_name (optional)
-
The key name to use when launching cluster. the default is 'gsg-keypair'.
- key_file (required)
-
Location of the private key file associated with key_name.
- slaves (optional)
-
The number of slaves. The default is 2.
find_cluster
Finds hadoop-ec2 cluster. Returns Net::Hadoop::EC2::Cluster instance itself if found.
launch_slave ($hashref)
Launches hadoop-ec2 slave instance for this cluster. Returns Net::Hadoop::EC2::Cluster instance itself if succeeded. Arguments are:
terminate_cluster
Terminates all EC2 instances of this cluster. Returns Net::Amazon::EC2::TerminateInstancesResponse instance.
terminate_slaves ($hashref)
Terminates hadoop-ec2 slave instances of this cluster. Returns Net::Amazon::EC2::TerminateInstancesResponse instance. Arguments are:
- slaves (optional)
-
The number of slave instances to terminate. the default is the number of exisiting instances.
execute ($hashref)
Runs command on the master instance via ssh. Returns Net::Amazon::HadoopEC2::SSH::Response instance. This method is implemented in Net::Amazon::HadoopEC2::SSH and it's only wrapper of Net::SSH::Perl. Arguments are:
- command (required)
-
The command line to pass.
- stdin (optional)
-
String to pass to STDIN of the command.
push_files ($hashref)
Pushes local files to hadoop-ec2 master instance via ssh. This method is also implemented in Net::Amazon::HadoopEC2::SSH. Returns true if succeeded. Arguments are:
- files (required)
-
files to push. Accepts string or arrayref of strings.
- destination (required)
-
Destination of the files.
get_files ($hashref)
Gets files on the hadoop-ec2 master instance. This method is implemented in Net::Amazon::HadoopEC2::SSH. Returns true if succeeded. Arguments are:
- files (required)
-
files to get. String and arrayref of strings is ok.
- destination (required)
-
local path to place the files.
ATTRIBUTES
name
Name of the cluster.
key_file
The key name to use when launching cluster. the default is 'gsg-keypair'.
retry
Boolean whether EC2 api request retry or not.
map_tasks
MAX_MAP_TASKS to pass to the instances when boot.
reduce_tasks
MAX_REDUCE_TASKS to pass to the instances when boot.
compress
COMPRESS to pass to the instances when boot.
user_data
additional user data to pass to the instances when boot.
master_instance
Net::Amazon::EC2::RunningInstances instance of master instance.
slave_instances
Arrayref of Net::Amazon::EC2::RunningInstances instance of master instance.
AUTHOR
Nobuo Danjou nobuo.danjou@gmail.com