NAME
Net::Amazon::MechanicalTurk::Command::LoadHITs - Bulk loading support for Amazon Mechanical Turk.
This module adds the loadHITs method to the Net::Amazon::MechanicalTurk class.
SYNOPSIS
# See the sample loadHITs from the source distribution.
sub questionTemplate {
    my %params = %{$_[0]};
    return <<END_XML;
<?xml version="1.0" encoding="UTF-8"?>
<QuestionForm xmlns="http://mechanicalturk.amazonaws.com/AWSMechanicalTurkDataSchemas/2005-10-01/QuestionForm.xsd">
<Question>
<QuestionIdentifier>1</QuestionIdentifier>
<QuestionContent>
<Text>$params{question}</Text>
</QuestionContent>
<AnswerSpecification>
<FreeTextAnswer/>
</AnswerSpecification>
</Question>
</QuestionForm>
END_XML
}
# Properties used to create the HITType and its associated HITs.
my $properties = {
    Title       => 'LoadHITs Perl sample',
    Description => 'This is a test of the bulk loading API.',
    Keywords    => 'LoadHITs, bulkload, perl',
    Reward      => {
        CurrencyCode => 'USD',
        Amount       => 0.01
    },
    RequesterAnnotation         => 'test',
    AssignmentDurationInSeconds => 60 * 60,
    AutoApprovalDelayInSeconds  => 60 * 60 * 10,
    MaxAssignments              => 3,
    LifetimeInSeconds           => 60 * 60
};

use Net::Amazon::MechanicalTurk;
my $mturk = Net::Amazon::MechanicalTurk->new;

# One HIT is created for each row of loadhits-input.csv.
$mturk->loadHITs(
    properties => $properties,
    input      => "loadhits-input.csv",
    question   => \&questionTemplate,
    progress   => \*STDOUT,
    success    => "loadhits-success.csv",
    fail       => "loadhits-failure.csv"
);
loadHITs
Bulk loads many HITs of the same HIT type into Mechanical Turk. The method takes a set of properties used to create a HITType and its associated HITs. To generate the questions for the HITs, rows of data are pulled from an input source and merged against a question template to produce the question XML. One HIT is generated for each row in the input source. Note: The source distribution of the Mechanical Turk Perl SDK contains samples using this method.
loadHITs takes a hash reference or a hash with the following parameters:
properties - (required) Either a hash reference or the name of a file
containing the properties to use for generating a HITType
and the associated HITs. When the properties are read from
a file, the method
Net::Amazon::MechanicalTurk::Properties->readNestedData is
used.
input - (required) The input source for row data.
This parameter may be of the following types:
- Net::Amazon::MechanicalTurk::RowData
- An array of hashes.
(This is internally converted into an object of type:
Net::Amazon::MechanicalTurk::RowData::ArrayHashRowData)
- A reference to a subroutine. When the loadHITs method
asks for row data, the subroutine will be called and
passed a subroutine reference, which should be called
for every row generated by the input. The generated row
should be a hash reference.
(This is internally converted into an object of type
Net::Amazon::MechanicalTurk::RowData::SubroutineRowData)
- The name of a file. The file should be either a CSV or
tab-delimited file. If the file name ends with '.csv',
it is read as CSV; otherwise it is assumed to be
tab delimited. The first row in the file should contain
the column names. Each subsequent row becomes a hash
reference based on the column names.
(This is internally converted into an object of type
Net::Amazon::MechanicalTurk::RowData::DelimitedRowData)
question - (required) The question template used to generate questions.
This parameter may be of the following types:
- An object of type Net::Amazon::MechanicalTurk::Template.
- A subroutine. The subroutine will be given a hash
reference representing the current input row.
(This is internally converted into an object of type
Net::Amazon::MechanicalTurk::Template::SubroutineTemplate)
- A filename ending in .rt or .question. This is a text
file which contains variables, which will be substituted
from the input row. Variables in the text file have
the syntax ${var_name}.
- A filename ending in .pl. This is a Perl script, which
has two variables set, named %params and $out. %params holds
the parameters representing the input row, and $out is
the IO::Handle the question should be written to. Before
this script is invoked, the $out handle is selected as
the default handle, so calls to print and printf without
a handle will go to $out.
Note: Use of this type of question requires the
IO::String module.
preview - (optional) If preview is specified, a HITType is created
but no HITs are. Instead, the preview parameter is
given the parameters that would be used to create each HIT.
This parameter may be of the following types:
- A subroutine. The subroutine is called with the
CreateHIT parameters.
- An IO::Handle. Each question from the CreateHIT
parameters will be printed to the handle.
- The name of a file. Each question from the CreateHIT
parameters will be printed to the file.
progress - (optional) Used to display progress messages. This
parameter may be of the following types:
- A subroutine. The subroutine is called with 1 parameter,
a message to be displayed.
- An IO::Handle. The progress message is written to the
handle.
success - (optional) Used to handle a successfully created hit. This
parameter may be of the following types:
- A filename. HITIds and HITTypeIds will be written to
this file. The file will be in a delimited format,
with the first row containing column headers. If the
filename ends in ".csv" the file format will be CSV,
otherwise it will be tab delimited.
- A subroutine. The subroutine is called when a hit is
created and passed a hash with the following parameters:
- mturk - A handle to the mturk client.
- fields - An array reference of the field names
for the input row.
- row - The input row the hit was created
from.
- parameters - The parameters given to CreateHIT.
- HITId - The HITId created.
- HITTypeId - The HITTypeId of the hit created.
fail - (optional) Used to handle a hit which failed creation. If
this value is not specified and a hit fails creation, an
error will be raised. This value may be of the following
types:
- A filename. The input row will be written back to the
file in a delimited format. If the file name ends with
".csv", then the file will be in CSV format, otherwise
it will be in a tab delimited format.
- A subroutine. The subroutine will be called back with
a hash containing the following values:
- mturk - A handle to the mturk client.
- fields - An array reference of the field names
for the input row.
- row - The input row the hit was created
from.
- parameters - The parameters given to CreateHIT.
- HITTypeId - The HITTypeId that was used in the
CreateHIT call.
- error - The error message associated with
the failure.
maxHits - (optional) If this value is greater than 0, then at most
maxHits HITs will be created.
entityEscapeInput - (optional) If this value is true, the
input row will have certain values encoded as XML
entities before being passed to the template.
The unescaped values will be accessible as <key>_raw.
The characters escaped are >, <, &, ' and ".
This parameter is on by default.
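
The subroutine forms of the input, success and fail parameters described above can be sketched as follows. This is a minimal illustration, not code from the SDK: the names rowSource, callbackArgs, onSuccess and onFailure are hypothetical, and the callbacks are assumed to receive their values as a flat key/value list (the helper also accepts a single hash reference in case that assumption is wrong). The loadHITs call itself is shown only in a comment, so the sketch runs without the SDK installed; the final lines simulate the calls that loadHITs would make.

```perl
use strict;
use warnings;

# Row source in subroutine form. loadHITs passes in a subroutine reference
# ($emit is a hypothetical name) that must be called once per generated row;
# each row is a hash reference.
sub rowSource {
    my ($emit) = @_;
    for my $n (1 .. 3) {
        $emit->({ question => "What is $n + $n?" });
    }
}

# Accept callback arguments either as a flat key/value list or as a single
# hash reference. The exact calling convention is an assumption here.
sub callbackArgs {
    my @args = @_;
    return %{ $args[0] } if @args == 1 && ref $args[0] eq 'HASH';
    return @args;
}

my (@created, @failed);

# Called for each HIT that was created; records HITId and HITTypeId.
sub onSuccess {
    my %args = callbackArgs(@_);
    push @created, "$args{HITId}\t$args{HITTypeId}";
}

# Called for each HIT that failed creation; records the row and the error.
sub onFailure {
    my %args = callbackArgs(@_);
    push @failed, "$args{row}{question}: $args{error}";
}

# With the SDK installed, these would be wired up roughly as:
#   $mturk->loadHITs(
#       properties => $properties,
#       input      => \&rowSource,
#       question   => \&questionTemplate,
#       success    => \&onSuccess,
#       fail       => \&onFailure,
#       maxHits    => 10,
#   );

# Stand-alone dry run with made-up identifiers, simulating loadHITs.
my @rows;
rowSource(sub { push @rows, $_[0] });
onSuccess(HITId => 'HIT-EXAMPLE', HITTypeId => 'TYPE-EXAMPLE', row => $rows[0]);
onFailure(row => $rows[1], error => 'simulated failure');

print scalar(@rows), " rows, ", scalar(@created), " created, ",
    scalar(@failed), " failed\n";
```

Supplying a fail handler matters because, as noted above, a HIT that fails creation raises an error when no fail value is given; with a handler in place the failure is passed to it instead.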