NAME
Amazon::MWS::Uploader -- high level agent to upload products to AMWS
DESCRIPTION
This module provides a high-level interface to the upload process. It has to keep track of the state so that it can resume an upload, which may get stuck during processing on Amazon's side, so database credentials (or the database handle itself) have to be provided.
The table structure needed is defined and commented in sql/amazon.sql
SYNOPSIS
my $agent = Amazon::MWS::Uploader->new(
db_dsn => 'DBI:mysql:database=XXX',
db_username => 'xxx',
db_password => 'xxx',
db_options => \%options,
# or dbh => $dbh,
schema_dir => '/path/to/xml_schema',
feed_dir => '/path/to/directory/for/xml',
merchant_id => 'xxx',
access_key_id => 'xxx',
secret_key => 'xxx',
marketplace_id => 'xxx',
endpoint => 'xxx',
products => \@products,
);
# say once a day, retrieve the full batch and send it up
$agent->upload;
# every 10 minutes or so, continue the work started with ->upload, if any
$agent->resume;
UPGRADE NOTES
When migrating from 0.05 to 0.06, please execute these SQL statements:
ALTER TABLE amazon_mws_products ADD COLUMN listed BOOLEAN;
UPDATE amazon_mws_products SET listed = 1 WHERE status = 'ok';
When upgrading to 0.16, please execute this SQL statement:
ALTER TABLE amazon_mws_products ADD COLUMN warnings TEXT;
ACCESSORS
The following keys must be passed to the constructor and can be accessed read-only:
- dbh
-
The DBI handle. If not provided, it will be built using the following self-describing accessors:
- db_dsn
- db_username
- db_password
- db_options
-
E.g.
{ mysql_enable_utf8 => 1, }
AutoCommit and RaiseError are set by us.
- skus_warnings_modes
-
Determines how to treat warnings. This is a hash reference with the code of the warning as key and one of the following modes as value:
- warn
-
Prints the warning from Amazon with the warn function (default mode).
- print
-
Prints the warning from Amazon with the print function.
- skip
-
Ignores the warning from Amazon.
- order_days_range
-
When calling get_orders, check the orders for the last X days. It accepts an integer which should be in the range 1-30. Defaults to 7.
Keep in mind that if you change the default and you have a lot of orders, you will get throttled because for each order we retrieve the orderline as well.
DEVEL NOTE: a possible smart fix would be to store this object in the order (or into a closure) and make the orderline a lazy attribute which will call ListOrderItems.
- shop_id
-
You can pass an arbitrary identifier to the constructor which will be used to keep the database records separated if you have multiple amazon accounts. If not provided, the merchant id will be used, which will work, but it's harder (for the humans) to spot and debug.
- debug
-
Print out additional information.
- logfile
-
Passed to Amazon::MWS::Client constructor.
- quiet
-
Boolean. Do not warn on timeouts and aborts (just print) if set to true.
- purge_missing_products
-
If true, the first time products_to_upload is called, products not passed to the products constructor will be purged from the amazon_mws_products table. Defaults to false.
This setting is DEPRECATED because it can have some unwanted side effects. You are recommended to delete the obsolete products yourself.
- reset_all_errors
-
If set to a true value, don't skip previously failed items and effectively reset all of them.
Also, when the accessor is set for send_shipping_confirmation, try to upload again previously failed orders.
- reset_errors
-
A string containing a comma separated list of error codes, optionally prefixed with a "!" (to reverse its meaning).
Example:
"!6024,6023"
Meaning: reupload all the products whose error code is not 6024 or 6023.
"6024,6023"
Meaning: reupload the products whose error code was 6024 or 6023
- force
-
Same as above, but only for the selected items. An arrayref is expected here with the skus.
- limit_inventory
-
If set to an integer, limit the inventory to this value. Setting this to 0 will disable it.
- job_hours_timeout
-
If set to an integer, abort the job after X hours have elapsed since the job was started. Defaults to 3 hours. Set to 0 to disable (not recommended).
This doesn't affect jobs for order acknowledgements (order_ack), see below.
- order_ack_days_timeout
-
Order acknowledgements time out at a different rate, because they are somewhat sensitive. Defaults to 30 days.
- schema_dir
-
The directory where the xsd files for the feed building can be found.
- feeder
-
An Amazon::MWS::XML::Feed object. This is a lazy attribute: you shouldn't pass it to the constructor, as it is built lazily using products, merchant_id and schema_dir.
- feed_dir
-
A working directory where to stash the uploaded feeds for inspection if problems are detected.
- schema
-
The XML::Compile::Schema object, built lazily from schema_dir.
- xml_writer
-
The xml writer, built lazily.
- xml_reader
-
The xml reader, built lazily.
- generic_feeder
-
Returns an Amazon::MWS::XML::GenericFeed object to build a feed using the XML writer.
- merchant_id
-
The merchant ID provided by Amazon.
- access_key_id
-
Provided by Amazon.
- secret_key
-
Provided by Amazon.
- marketplace_id
-
http://docs.developer.amazonservices.com/en_US/dev_guide/DG_Endpoints.html
- endpoint
-
Ditto.
- products
-
An arrayref of Amazon::MWS::XML::Product objects, or anything that (properly) responds to as_product_hash, as_inventory_hash, as_price_hash. See Amazon::MWS::XML::Product for details.
This is set as read-write, so you can set the products after the object construction, but if you change it afterward, you will get unexpected results.
This routine also checks whether each product needs upload and deletes disappeared products. If you are doing the check yourself, use checked_products.
- checked_products
-
As products, but no check is performed. This takes precedence.
- sqla
-
Lazy attribute to hold the SQL::Abstract object.
- client
-
An Amazon::MWS::Client object, built lazily, so you don't have to pass it.
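The reset_errors string described above can be read as a simple predicate over error codes. The following is a minimal sketch of that reading, not the module's actual parser: reset_errors_matcher is a hypothetical helper, and it assumes a leading "!" negates the whole list, as the examples suggest.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Amazon::MWS::Uploader: turn a
# reset_errors string into a predicate over error codes.
sub reset_errors_matcher {
    my ($spec) = @_;
    $spec =~ s/^\s+|\s+$//g;                      # trim surrounding whitespace
    my $negate = ($spec =~ s/^!\s*//) ? 1 : 0;    # leading "!" reverses the meaning
    my %codes = map { $_ => 1 } grep { length } split /\s*,\s*/, $spec;
    return sub {
        my ($code) = @_;
        my $listed = $codes{$code} ? 1 : 0;
        return $negate ? !$listed : $listed;
    };
}

my $only    = reset_errors_matcher('6024,6023');
my $all_but = reset_errors_matcher('!6024,6023');

for my $code (qw/6024 8541/) {
    printf "%s: selected by '6024,6023'=%d, by '!6024,6023'=%d\n",
      $code, $only->($code) ? 1 : 0, $all_but->($code) ? 1 : 0;
}
```

With "6024,6023" only products failing with one of the two codes are selected for reupload; with "!6024,6023" every failed product except those is selected.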
MAIN METHODS
upload
If products is set, begin the routine to upload them. Because of the asynchronous way AMWS works, at some point it will bail out, saving the state in the database. You should reinstantiate the object and call resume on it every 10 minutes or so.
The workflow is described here: http://docs.developer.amazonservices.com/en_US/feeds/Feeds_Overview.html
This has to be done for each feed: Product, Inventory, Price, Image, Relationship (for variants).
This method first generates the feeds in the feed directory, and then calls resume, which is in charge of the actual uploading.
resume
Restore the state and resume where it was left.
This method accepts an optional list of parameters. Each parameter may be:
- a scalar
-
This is considered a job id.
- a hashref
-
This will be merged in the query to retrieve the pending jobs. A sample usage could be:
$upload->resume({ task => [qw/upload product_deletion/] });
to resume only those specific tasks.
get_pending_jobs
Return the list of hashrefs with the pending jobs out of the database. Accepts the same parameters as resume (which actually calls this method).
INTERNAL METHODS
prepare_feeds($type, { name => $feed_name, content => "<xml>..."}, { name => $feed_name2, content => "<xml>..."}, ....)
Prepare the feed of type $type with the feeds provided as additional arguments.
Returns the job id.
cancel_job($task, $job_id, $reason)
Abort the job, setting the aborted flag in the amazon_mws_jobs table.
process_feeds(\%job_row)
Given the hashref with the db row of the job, check at which point it is and resume.
upload_feed($type, $feed_id);
Routine to upload the feed. Return true if it's complete, false otherwise.
submission_result($feed_id)
Return an Amazon::MWS::XML::Response::FeedSubmissionResult object for the given feed ID.
get_orders($from_date)
This is a self-contained method and doesn't require a product list. The from_date must be a DateTime object. If not provided, it defaults to the last week.
Returns a list of Amazon::MWS::XML::Order objects.
Beware that it's possible you get some items with 0 quantity, i.e. single items cancelled. The application code needs to be prepared to deal with such phantom items. You can check each order by looping over $order->items and checking $item->quantity.
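A minimal sketch of the zero-quantity check described above. The stub classes exist only so the example runs on its own; in real code the objects come from get_orders. The sku accessor on the stub item and the live_items helper are assumptions made for illustration, and items is assumed to return a list.

```perl
use strict;
use warnings;

# Stand-in classes for illustration only; the real objects are the
# Amazon::MWS::XML::Order instances returned by get_orders.
package My::StubItem {
    sub new      { my ($class, %args) = @_; return bless {%args}, $class }
    sub sku      { $_[0]{sku} }         # hypothetical accessor
    sub quantity { $_[0]{quantity} }
}
package My::StubOrder {
    sub new   { my ($class, @items) = @_; return bless { items => [@items] }, $class }
    sub items { @{ $_[0]{items} } }
}

package main;

# Hypothetical helper: keep only the items that were not cancelled.
sub live_items {
    my ($order) = @_;
    return grep { $_->quantity > 0 } $order->items;
}

my $order = My::StubOrder->new(
    My::StubItem->new(sku => 'cancelled-item', quantity => 0),
    My::StubItem->new(sku => 'regular-item',   quantity => 3),
);

print "importing ", $_->sku, " x ", $_->quantity, "\n" for live_items($order);
```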
order_already_registered($order)
Check in the amazon_mws_orders table if we already registered this order.
Return the row for this table (as a hashref) if present, nothing otherwise.
acknowledge_successful_order(@orders)
Accept a list of Amazon::MWS::XML::Order objects, prepare an acknowledge feed with the Success status, and insert the orders in the database.
acknowledge_feed($status, @orders)
The first argument is usually Success. The other arguments are a list of Amazon::MWS::XML::Order objects.
delete_skus(@skus)
Accept a list of skus. Prepare a product_deletion feed and update the database.
delete_skus_feed(@skus)
Prepare a feed (via create_feed) to delete the given skus.
register_errors($job_id, $result)
The first argument is the job ID. The second is an Amazon::MWS::XML::Response::FeedSubmissionResult object.
This method will update the status of the products (either failed or redo) in amazon_mws_products.
register_order_ack_errors($job_id, $result);
Same arguments as above, but for order acknowledgements.
register_ship_order_errors($job_id, $result);
Same arguments as above, but for shipping notifications.
skus_in_job($job_id)
Check the amazon_mws_products table for the SKUs which were uploaded by the given job ID.
get_asin_for_eans(@eans)
Accept a list of EANs and return a hashref where the keys are the EANs passed as arguments, and the values are the ASINs for the current marketplace. Max EANs: 5.
http://docs.developer.amazonservices.com/en_US/products/Products_GetMatchingProductForId.html
get_asin_for_skus(@skus)
Same as above (with the same limit of 5 items), but for SKUs.
get_asin_for_sku($sku)
Same as above, but for a single sku. Return the ASIN or undef if not found.
get_asin_for_ean($ean)
Same as above, but for a single ean. Return the ASIN or undef if not found.
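Because get_asin_for_eans and get_asin_for_skus accept at most 5 identifiers per call, a longer list has to be split by the caller. The following is a sketch of one way to do that; chunked and asin_for_many are hypothetical helpers, and $lookup stands in for a closure such as sub { $agent->get_asin_for_eans(@_) }. The fake lookup is there only so the example runs without network access.

```perl
use strict;
use warnings;

# Split a list into chunks of at most $size elements.
sub chunked {
    my ($size, @list) = @_;
    my @chunks;
    push @chunks, [ splice @list, 0, $size ] while @list;
    return @chunks;
}

# Hypothetical wrapper: call $lookup in batches of 5 and merge the
# returned hashrefs into a single one.
sub asin_for_many {
    my ($lookup, @ids) = @_;
    my %asin;
    for my $batch (chunked(5, @ids)) {
        my $found = $lookup->(@$batch) || {};
        @asin{ keys %$found } = values %$found;
    }
    return \%asin;
}

# Fake lookup which records the size of each batch it receives.
my @batch_sizes;
my $fake_lookup = sub {
    push @batch_sizes, scalar @_;
    return { map { $_ => "ASIN-$_" } @_ };
};

my $asin = asin_for_many($fake_lookup, 1 .. 12);
printf "%d ids resolved in %d calls\n", scalar(keys %$asin), scalar(@batch_sizes);
```

Real calls may additionally be throttled by Amazon, so a production loop would probably want a pause between batches.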
get_product_category_data($ean)
Return the deep data structures returned by GetProductCategoriesForASIN.
get_product_categories($ean)
Return a list of category codes (the ones passed to RecommendedBrowseNode) which exist on Amazon.
get_product_category_names($ean)
Return a list of arrayrefs with the category paths. Beware that we strip the first two parents, which heuristically appear meaningless (Category/Category).
If this is not the case, please report this as a bug and we'll find a solution.
You can call get_product_category_data to inspect the raw response yourself.
get_lowest_price_for_asin($asin, $condition)
Return the lowest price for asin, excluding ourselves. The second argument, condition, is optional and defaults to "New".
If you need the full details, you have to call $self->client->GetLowestOfferListingsForASIN yourself and make sense of the output. This method is mostly a wrapper meant to simplify the routine.
Returns undef if we can't get any info or if no prices are found.
get_lowest_price_for_ean($ean, $condition)
Same as above, but uses the EAN instead.
shipping_confirmation_feed(@shipped_orders)
Return a feed string with the shipping confirmation. A list of Amazon::MWS::XML::ShippedOrder objects must be passed.
send_shipping_confirmation($shipped_orders)
Schedule the shipped orders (an Amazon::MWS::XML::ShippedOrder object) for the uploading.
order_already_shipped($shipped_order)
Check if the shipped order (an Amazon::MWS::XML::ShippedOrder) was already notified as shipped, looking into our table, and return the row with the order.
To see the status, check shipping_confirmation_ok (already done), shipping_confirmation_error (faulty), shipping_confirmation_job_id (pending).
orders_waiting_for_shipping
Return a list of hashrefs with two keys, amazon_order_id and shop_order_id, for each order which is waiting for confirmation.
This is implemented looking into amazon_mws_orders where there is no shipping confirmation job id.
The confirmed flag (which means we acknowledged the order) is ignored, so that stuck order_ack jobs do not block the shipping confirmation.
product_needs_upload($sku, $timestamp)
Look up the product $sku with timestamp $timestamp and return the sku if the product needs to be uploaded (a false value means it can be safely skipped). This method is stateless and doesn't alter anything.
orders_in_shipping_job($job_id)
Look up the amazon_mws_orders table and return a list of amazon_order_id for the given shipping confirmation job. INTERNAL.
put_product_on_error(sku => $sku, timestamp_string => $timestamp, error_code => $error_code, error_msg => $error)
Register a custom error for the product $sku with error $error and $timestamp as the timestamp string. The error is optional, and will be "shop error" if not provided. The error code will be 1 if not provided.
cancel_feed($feed_id)
Call the CancelFeedSubmissions API and abort the feed and the belonging job if found in the list. Return the response, which probably is not even meaningful. It is a big FeedSubmissionInfo with the past feed submissions.
update_amw_order_status($amazon_order_number)
Check the order status on Amazon and update the row in the amazon_mws_orders table.
get_products_with_error_code(@error_codes)
Return a list of hashrefs with the rows from amazon_mws_products for the current shop and the error codes passed as arguments. If no error codes are passed, fetch all the products in error.
get_products_with_warnings
Returns a list of hashrefs, with sku and warnings as keys, for each product in the shop which has the warnings set to something.
mark_failed_products_as_redo(@skus)
Alter the status of the failed skus passed as argument from 'failed' to 'redo' to trigger an update.
get_products_with_amazon_shop_mismatches(@errors)
Parse the amazon_mws_products table and return a hashref where the keys are the failed skus, and the values are hashrefs keyed by the mismatched field. Mismatched fields may be: part_number, title, manufacturer, brand, color, size. Each field maps to a hashref with these keys:
- shop
-
The value on the shop
- amazon
-
The value of the amazon product
- error_code
-
The error code
E.g.
my $mismatches = {12344 => {
part_number => {
shop => 'XY',
amazon => 'XYZ',
error_code => '8541',
},
title => {
shop => 'ABC',
amazon => 'DFG',
error_code => '8541',
},
},
.....
};
Optionally, if error codes are passed as arguments, only those errors are fetched.
get_products_with_mismatches(@errors)
Similar to get_products_with_amazon_shop_mismatches, but returns an arrayref where each element is a hashref with all the info collapsed.
The structures reported by get_products_with_amazon_shop_mismatches are flattened with an our_ and amazon_ prefix:
our_part_number => 'XY',
amazon_part_number => 'YZ',
our_title => 'xx',
amazon_title => 'yy',
# etc.
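The relationship between the two return values can be sketched as a plain data transformation. This is an illustration under assumptions, not the module's code: flatten_mismatches is a hypothetical function, the sku key in each flattened row is assumed, and the way error_code is carried over is not documented here, so it is left out.

```perl
use strict;
use warnings;

# Hypothetical function: collapse the nested per-sku structure into
# one flat hashref per sku, using the our_ and amazon_ prefixes.
sub flatten_mismatches {
    my ($mismatches) = @_;
    my @rows;
    for my $sku (sort keys %$mismatches) {
        my %row = (sku => $sku);    # the "sku" key is an assumption
        for my $field (sort keys %{ $mismatches->{$sku} }) {
            my $detail = $mismatches->{$sku}{$field};
            $row{"our_$field"}    = $detail->{shop};
            $row{"amazon_$field"} = $detail->{amazon};
        }
        push @rows, \%row;
    }
    return \@rows;
}

my $rows = flatten_mismatches({
    12344 => {
        part_number => { shop => 'XY',  amazon => 'XYZ', error_code => '8541' },
        title       => { shop => 'ABC', amazon => 'DFG', error_code => '8541' },
    },
});

for my $row (@$rows) {
    print join(', ', map { "$_ => $row->{$_}" } sort keys %$row), "\n";
}
```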
Order Report
To get this feature working, you need an amzn-envelope.xsd with OrderReport plugged in. Older versions are broken. Newer schema versions may be missing the Amazon.xsd file. So either you ask Amazon to give you a full set of xsd files, which includes OrderReport in amzn-envelope.xsd, or you apply this patch to amzn-envelope.xsd:
--- a/amzn-envelope.xsd 2014-10-27 10:26:19.000000000 +0100
+++ b/amzn-envelope.xsd 2015-03-26 10:56:16.000000000 +0100
@@ -23,2 +23,3 @@
<xsd:include schemaLocation="Price.xsd"/>
+ <xsd:include schemaLocation="OrderReport.xsd"/>
<xsd:include schemaLocation="ProcessingReport.xsd"/>
@@ -41,2 +42,3 @@
<xsd:enumeration value="OrderFulfillment"/>
+ <xsd:enumeration value="OrderReport"/>
<xsd:enumeration value="Override"/>
@@ -83,2 +85,3 @@
<xsd:element ref="OrderFulfillment"/>
+ <xsd:element ref="OrderReport"/>
<xsd:element ref="Override"/>
get_unprocessed_orders
Return a list of objects with the orders.
get_unprocessed_order_report_ids
Return a list of unprocessed (i.e., which weren't acknowledged by us) order report ids.
get_order_reports_by_id(@id_list)
The GetReport operation has a maximum request quota of 15 and a restore rate of one request every minute.
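A caller fetching many reports may want to pace its own requests to stay within these figures. Below is a minimal token-bucket sketch using the quota of 15 and the restore rate of one request per minute quoted above. make_bucket is a hypothetical helper and is not how the module itself handles throttling; the clock is passed in explicitly so the example does not need to sleep.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a closure which answers "may I send a
# request now?" and consumes one token when the answer is yes.
sub make_bucket {
    my (%args) = @_;
    my $quota   = $args{quota}   // 15;    # maximum request quota
    my $restore = $args{restore} // 60;    # seconds per restored request
    my ($tokens, $last) = ($quota, $args{now} // time);
    return sub {
        my ($now) = @_;
        $now //= time;
        my $restored = int(($now - $last) / $restore);
        if ($restored > 0) {
            $tokens = $tokens + $restored > $quota ? $quota : $tokens + $restored;
            $last += $restored * $restore;
        }
        return 0 unless $tokens > 0;
        $tokens--;
        return 1;
    };
}

my $may_request = make_bucket(now => 0);
my $granted = grep { $may_request->(0) } 1 .. 20;    # burst at t=0
print "granted $granted of 20 requests at t=0\n";
print "at t=60: ", ($may_request->(60) ? "request allowed" : "must wait"), "\n";
```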
acknowledge_reports(@ids)
Mark the reports as processed.
unacknowledge_reports(@ids)
Mark the reports as not processed.
job_timed_out($job_row) [INTERNAL]
Check if the job (a hashref of the amazon_mws_jobs row) has timed out, comparing with order_ack_days_timeout or job_hours_timeout (depending on the job).
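The comparison can be sketched as follows. This is an illustrative re-implementation, not the module's code: job_has_timed_out is a hypothetical function, and the row keys task and job_started_epoch are assumptions about the amazon_mws_jobs row. The defaults of 3 hours and 30 days are the ones documented for job_hours_timeout and order_ack_days_timeout.

```perl
use strict;
use warnings;

# Hypothetical function; the row keys "task" and "job_started_epoch"
# are assumed for the sake of the example.
sub job_has_timed_out {
    my ($job_row, %opt) = @_;
    my $now   = $opt{now} // time;
    my $limit = $job_row->{task} eq 'order_ack'
      ? ($opt{order_ack_days_timeout} // 30) * 24 * 60 * 60
      : ($opt{job_hours_timeout}      // 3)  * 60 * 60;
    return 0 unless $limit;    # a timeout of 0 disables the check
    return ($now - $job_row->{job_started_epoch}) > $limit ? 1 : 0;
}

my $started = 1_000_000;
for my $task (qw/upload order_ack/) {
    my $row = { task => $task, job_started_epoch => $started };
    printf "%-9s timed out after 4 hours? %s\n", $task,
      job_has_timed_out($row, now => $started + 4 * 3600) ? 'yes' : 'no';
}
```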
purge_old_jobs($limit)
Eventually the jobs and feed tables grow and never get purged. You can call this method to remove from the db all the feeds older than order_ack_days_timeout (30 by default).
To avoid too much load on the db, you can set the limit to purge the jobs. Defaults to 500. Set it to 0 to disable it.