NAME
Net::Async::Firecrawl - IO::Async Firecrawl v2 client with flow helpers
VERSION
version 0.001
SYNOPSIS
    use IO::Async::Loop;
    use Net::Async::Firecrawl;

    my $loop = IO::Async::Loop->new;
    my $fc   = Net::Async::Firecrawl->new(
        base_url      => 'http://localhost:3002', # or https://api.firecrawl.dev
        api_key       => 'fc-...',                # optional for self-hosted
        poll_interval => 3,
    );
    $loop->add($fc);

    # Single scrape
    my $doc = $fc->scrape( url => 'https://example.com', formats => ['markdown'] )->get;

    # Crawl a site, poll to completion, collect all paginated pages, split by is_failure.
    my $result = $fc->crawl_and_collect(
        url   => 'https://example.com',
        limit => 100,
    )->get;
    # $result->{data}     — ok pages only
    # $result->{failed}   — [{ url, statusCode, error, page }, ...]
    # $result->{raw_data} — all pages in original order
    # $result->{stats}    — { ok, failed, total }

    # Batch scrape, waits for all results.
    my $batch = $fc->batch_scrape_and_wait(
        urls    => [ 'https://a', 'https://b' ],
        formats => ['markdown'],
    )->get;

    # Structured extraction.
    my $extract = $fc->extract_and_wait(
        urls   => [ 'https://example.com/*' ],
        prompt => 'extract pricing and product names',
    )->get;

    # Concurrent per-URL scrape (partial-success).
    my $many = $fc->scrape_many(
        [qw( https://a https://b https://c )],
        formats => ['markdown'],
    )->get;
    # $many->{ok}     — [{ url, data }, ...]
    # $many->{failed} — [{ url, error }, ...] — each error is a WWW::Firecrawl::Error

    # Retry the failed URLs from a prior crawl/batch.
    my $retried = $fc->retry_failed_pages($result, formats => ['markdown'])->get;
DESCRIPTION
IO::Async-flavoured client for the Firecrawl v2 API. Wraps WWW::Firecrawl's request builders and response parsers, dispatches through Net::Async::HTTP, and returns Future objects.
Every endpoint exposed by WWW::Firecrawl is available here as a Future-returning method with identical argument signatures. On top of that, high-level flow helpers automate the start-job → poll → collect-pages pattern common to crawl, batch-scrape, extract, and agent operations — including partial-success splitting by the classification policy of the underlying WWW::Firecrawl.
CONSTRUCTOR PARAMETERS
base_url, api_key, api_version — forwarded to WWW::Firecrawl.

firecrawl — pass a pre-built WWW::Firecrawl instance (overrides the above three).

http — pass a pre-built Net::Async::HTTP (otherwise one is created and parented to this notifier).

poll_interval — seconds between status polls for flow helpers (default 3).

delay_sub — optional CodeRef that returns a Future for inter-attempt and polling delays. If omitted, $loop->delay_future is used. Mainly a test hook.
Retry attributes (max_attempts, retry_backoff, retry_statuses, on_retry) and classification attributes (is_failure, failure_codes, strict) live on the underlying WWW::Firecrawl. Pass them to this constructor or build the WWW::Firecrawl instance yourself and pass it as firecrawl.
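For example, a sketch of tuning retry and classification by pre-building the underlying client (the attribute names are those listed above; the values shown are illustrative):

    use WWW::Firecrawl;
    use Net::Async::Firecrawl;

    my $inner = WWW::Firecrawl->new(
        base_url     => 'https://api.firecrawl.dev',
        api_key      => 'fc-...',
        max_attempts => 5,    # illustrative: retry transient failures up to 5 times
        strict       => 1,    # illustrative: raise scrape-type errors for failed targets
    );

    my $fc = Net::Async::Firecrawl->new(
        firecrawl     => $inner,   # overrides base_url/api_key/api_version
        poll_interval => 5,
    );
    $loop->add($fc);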
ERROR HANDLING
Every failure path fails the Future via Future->fail($error, 'firecrawl', $attempt?), where $error is a WWW::Firecrawl::Error object (it stringifies to its message). Correspondingly, $f->failure returns ($error, 'firecrawl', $attempt?).
Five error types (same model as WWW::Firecrawl):
transport — Firecrawl unreachable. Retried automatically up to max_attempts.

api — Firecrawl returned non-2xx or {success: false}. Retried only for retry_statuses (default 429/502/503/504).

job — A flow reported status: failed or status: cancelled. Never retried — always propagates as a Future failure.

scrape — The single scrape's target URL was classified as failed (only raised when strict is on).

page — A target URL inside a flow (scrape_many, or a failed entry within crawl/batch) was classified as failed. Surfaced in failed[], not thrown.
Classic usage:
    $fc->scrape( url => $u )->then(sub {
        my ($data) = @_;               # parsed payload on success
        ...
    })->else(sub {
        my ($err) = @_;                # a WWW::Firecrawl::Error
        if    ($err->is_transport) { ... }   # Firecrawl unreachable
        elsif ($err->is_job)       { ... }   # job failed or cancelled
        else  { warn "firecrawl: $err"; Future->fail($err) }  # re-propagate
    });
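The same failure values are available in blocking style; a sketch using the contract documented above:

    my $f    = $fc->scrape( url => $u );
    my $data = eval { $f->get };              # ->get rethrows the failure
    if ( $f->is_failed ) {
        my ( $err, $category, $attempt ) = $f->failure;
        # $err is the WWW::Firecrawl::Error, $category is 'firecrawl',
        # $attempt (when present) is the attempt count.
        warn "firecrawl: $err\n";             # stringifies to the message
    }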
METHODS

scrape
crawl
crawl_status
crawl_cancel
crawl_errors
crawl_active
crawl_params_preview
map
search
batch_scrape
batch_scrape_status
batch_scrape_cancel
batch_scrape_errors
extract
extract_status
agent
agent_status
agent_cancel
browser_create
browser_list
browser_delete
browser_execute
scrape_execute
scrape_browser_stop
credit_usage
credit_usage_historical
token_usage
token_usage_historical
queue_status
activity
One Future-returning method per WWW::Firecrawl endpoint, same argument signature. Resolves to the parsed payload on success. See WWW::Firecrawl for per-endpoint details.
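A few illustrative calls. The scrape signature is shown in the SYNOPSIS; the arguments used for map, search, and crawl_status below are assumptions, so consult WWW::Firecrawl for the authoritative signatures:

    my $links   = $fc->map( url => 'https://example.com' )->get;
    my $results = $fc->search( query => 'firecrawl' )->get;
    my $status  = $fc->crawl_status($job_id)->get;   # $job_id from a prior crawl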
crawl_status_next($next_url)
batch_scrape_status_next($next_url)
Follow a pagination URL from a previous status response.
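A sketch of manual pagination, assuming the status payload carries the pagination URL under a next key (as in the Firecrawl v2 API); the flow helpers below do this walking for you:

    my $status = $fc->crawl_status($job_id)->get;
    my @pages  = @{ $status->{data} // [] };
    while ( my $next = $status->{next} ) {           # assumed 'next' key
        $status = $fc->crawl_status_next($next)->get;
        push @pages, @{ $status->{data} // [] };
    }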
crawl_and_collect(%crawl_args)
Fires crawl, polls crawl_status every poll_interval seconds until the job reports completed (failed/cancelled fail the Future with type=job), walks the next pagination chain, classifies each collected page via the underlying WWW::Firecrawl's is_failure, and resolves to:
    {
        status      => 'completed',
        id          => $job_id,
        creditsUsed => ...,
        data        => [ ok_page, ... ],                          # ok only
        failed      => [ { url, statusCode, error, page }, ... ],
        raw_data    => [ page, ... ],                             # all, original order
        stats       => { ok, failed, total },
    }
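For example, reporting the split after a crawl:

    my $result = $fc->crawl_and_collect( url => 'https://example.com', limit => 100 )->get;
    printf "ok=%d failed=%d total=%d\n", @{ $result->{stats} }{qw( ok failed total )};
    for my $bad ( @{ $result->{failed} } ) {
        warn "failed $bad->{url}: $bad->{error}\n";  # error stringifies to its message
    }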
batch_scrape_and_wait(%batch_args)
Same contract as crawl_and_collect but against the batch-scrape endpoints. Same return shape.
extract_and_wait(%extract_args)
Starts an extract job and resolves once extract_status reports completed. Fails (type=job) on failed/cancelled. Returns the final status hash.
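A sketch, using the prompt form from the SYNOPSIS. Where the extracted data lives inside the final status hash depends on the Firecrawl v2 response; the data key below is an assumption:

    my $final = $fc->extract_and_wait(
        urls   => [ 'https://example.com/pricing' ],
        prompt => 'extract plan names and monthly prices',
    )->get;
    my $extracted = $final->{data};    # assumed key in the final status hash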
agent_and_wait(%agent_args)
Like extract_and_wait, for agent jobs.
scrape_many(\@urls, %common_scrape_args)
Fires a scrape per URL concurrently. Resolves to:
    {
        ok     => [ { url, data }, ... ],
        failed => [ { url, error }, ... ],   # error is a WWW::Firecrawl::Error
        stats  => { ok, failed, total },
    }
The outer Future never fails for per-URL failures (transport, api, or target-level). It only fails for local errors (e.g. not added to a loop).
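A sketch of consuming the partial-success result (save_page stands in for your own code):

    my $many = $fc->scrape_many( \@urls, formats => ['markdown'] )->get;
    save_page( $_->{url}, $_->{data} ) for @{ $many->{ok} };
    for my $miss ( @{ $many->{failed} } ) {
        my $err = $miss->{error};            # a WWW::Firecrawl::Error
        warn "skip $miss->{url}: $err\n";
        warn "  transport-level; a retry may succeed\n" if $err->is_transport;
    }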
retry_failed_pages($result, %scrape_opts)
Takes a result from crawl_and_collect / batch_scrape_and_wait / scrape_many and re-scrapes the URLs in $result->{failed} via scrape_many. Returns a Future of the standard { ok, failed, stats } hashref.
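For example, one retry pass after a crawl, folding recovered pages back in (a sketch; the retried ok entries carry scrape payloads, so the merge policy is up to you):

    my $result  = $fc->crawl_and_collect( url => $site, limit => 100 )->get;
    my $retried = $fc->retry_failed_pages( $result, formats => ['markdown'] )->get;
    push @{ $result->{data} }, map { $_->{data} } @{ $retried->{ok} };
    printf "recovered %d of %d failed pages\n",
        $retried->{stats}{ok}, $result->{stats}{failed};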
do_request($http_request)
Low-level: dispatch an arbitrary HTTP::Request (typically one built via $self->firecrawl->foo_request) through the underlying Net::Async::HTTP with retry applied. Returns a Future of HTTP::Response.
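A sketch; scrape_request below is a hypothetical builder name, so substitute the real WWW::Firecrawl *_request method for the endpoint you need:

    my $req = $fc->firecrawl->scrape_request( url => 'https://example.com' );  # hypothetical builder
    $fc->do_request($req)->then(sub {
        my ($res) = @_;                # an HTTP::Response
        print $res->decoded_content;
        Future->done;
    })->get;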
firecrawl
The underlying WWW::Firecrawl instance.
http
The underlying Net::Async::HTTP instance (lazily built and parented to this notifier).
poll_interval
Read/write accessor for the default poll interval (seconds) used by flow helpers.
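For example, assuming the usual get/set accessor convention:

    $fc->poll_interval(1);             # poll every second from now on
    my $seconds = $fc->poll_interval;  # read it back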
SEE ALSO
WWW::Firecrawl, IO::Async, Net::Async::HTTP, Future, https://firecrawl.dev, https://docs.firecrawl.dev/api-reference/v2-introduction
SUPPORT
Issues
Please report bugs and feature requests on GitHub at https://github.com/Getty/p5-net-async-firecrawl/issues.
CONTRIBUTING
Contributions are welcome! Please fork the repository and submit a pull request.
AUTHOR
Torsten Raudssus <torsten@raudssus.de> https://raudss.us/
COPYRIGHT AND LICENSE
This software is copyright (c) 2026 by Torsten Raudssus.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.