NAME
LWP::UserAgent - A WWW UserAgent class
SYNOPSIS
require LWP::UserAgent;
$ua = new LWP::UserAgent;
$request = new HTTP::Request('GET', 'file://localhost/etc/motd');
$response = $ua->request($request); # or
$response = $ua->request($request, '/tmp/sss'); # or
$response = $ua->request($request, \&callback, 4096);
sub callback { my($data, $response, $protocol) = @_; .... }
DESCRIPTION
LWP::UserAgent is a class implementing a simple World-Wide Web user agent in Perl. It brings together the HTTP::Request, HTTP::Response and LWP::Protocol classes that form the rest of the libwww-perl library. For simple uses this class can be used directly to dispatch WWW requests; alternatively, it can be subclassed for application-specific behaviour.
In normal usage the application creates a UserAgent object and configures it with values for timeouts, proxies, name, etc. The next step is to create an instance of HTTP::Request for the request that needs to be performed. This request is then passed to the UserAgent request() method, which dispatches it using the relevant protocol and returns an HTTP::Response object.
The basic approach of the library is to use HTTP style communication for all protocol schemes, i.e. you will receive an HTTP::Response object even for gopher or ftp requests. In order to achieve even more similarities with HTTP style communications, gopher menus and file directories will be converted to HTML documents.
The request() method can process the content of the response in one of three ways: in core, into a file, or into repeated calls of a subroutine. The in core variant simply returns the content in a scalar attribute called content() of the response object, and is suitable for small HTML replies that might need further parsing. The filename variant requires a scalar containing a filename, and is suitable for large WWW objects which need to be written directly to disc without requiring large amounts of memory. In this case the response object contains the name of the file, but not the content. The subroutine variant requires a callback routine and an optional chunk size, and can be used to construct "pipe-lined" processing, where processing of received chunks can begin before the complete data has arrived. The callback is called with 3 arguments: the data, a reference to the response object and a reference to the protocol object.
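As a rough illustration of the subroutine variant, the sketch below shows a callback that counts bytes and lines as chunks arrive. The driver loop at the end only simulates what the library would do with a response body; the %state hash, the sample body and the 8-byte chunk size are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical running state shared between callback invocations.
my %state = (bytes => 0, lines => 0);

# Callback taking the three arguments described above: the data chunk,
# the response object and the protocol object. Only the data is used here.
sub callback {
    my ($data, $response, $protocol) = @_;
    $state{bytes} += length($data);
    $state{lines} += ($data =~ tr/\n//);   # count newlines in this chunk
    return;
}

# Stand-in for the library: feed the callback a body in small chunks.
# In real use the library makes these calls, e.g. after
#   $ua->request($request, \&callback, 4096);
my $body       = "first line\nsecond line\nthird line\n";
my $chunk_size = 8;
for (my $pos = 0; $pos < length($body); $pos += $chunk_size) {
    callback(substr($body, $pos, $chunk_size), undef, undef);
}

printf "%d bytes, %d lines\n", $state{bytes}, $state{lines};
```

Because the callback keeps its own running totals, it never needs the whole body in memory at once, which is the point of the "pipe-lined" style.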
The library also lets you supply a subroutine as the content of the request object. This subroutine should return the content (possibly in pieces) each time it is called, and an empty string when there is no more content.
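A minimal sketch of such a content subroutine follows. The helper name make_content_source and the sample pieces are hypothetical, and the loop at the end merely stands in for the library draining the subroutine; how the subroutine is attached to the request object is not shown on this page and is left out here.

```perl
use strict;
use warnings;

# Build a subroutine that hands out the content piece by piece and
# returns an empty string once everything has been delivered.
# Note: the pieces themselves must be non-empty strings.
sub make_content_source {
    my @pieces = @_;
    return sub {
        return @pieces ? shift @pieces : '';
    };
}

my $source = make_content_source("name=libwww", "&", "lang=perl");

# Stand-in for the library: keep calling until an empty string comes back.
my $sent = '';
while (length(my $piece = $source->())) {
    $sent .= $piece;
}
print "$sent\n";
```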
Two advanced facilities allow the user of this module to fine-tune timeouts and error handling:
By default the library uses alarm() to implement timeouts, dying if the timeout occurs. If this is not the preferred behaviour, or it interferes with other parts of the application, one can disable the use of alarms. When alarms are disabled, timeouts can still occur, for example when reading data, but other cases such as name lookups will not be timed out by the library itself.
The library catches errors (such as internal errors and timeouts) and presents them as HTTP error responses. Alternatively, one can switch off this behaviour and let the application handle dies.
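The general technique behind both facilities can be sketched in plain Perl: alarm() raises a signal whose handler dies, and an enclosing eval traps the die so it can be reported as an ordinary result. This is not the library's own code; run_with_timeout and its return convention are assumptions made for illustration.

```perl
use strict;
use warnings;

# Run $code with a time limit of $seconds. Returns (1, $value) on success
# and (0, $error_message) if the code died or the alarm fired.
sub run_with_timeout {
    my ($seconds, $code) = @_;
    my $value;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm($seconds);        # arm the timer
        $value = $code->();
        alarm(0);               # disarm on normal completion
        1;
    };
    alarm(0);                   # make sure no alarm is left pending
    return $ok ? (1, $value) : (0, $@ || "unknown error\n");
}

# A slow operation that exceeds its one-second limit.
my ($ok, $result) = run_with_timeout(1, sub { sleep 3; "done" });
print $ok ? "finished: $result\n" : "failed: $result";
```

An application that already uses alarm() or $SIG{ALRM} for its own purposes would clash with this pattern, which is one reason to disable alarms.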
SEE ALSO
See LWP for a complete overview of libwww-perl5. See the request and mirror programs for examples of usage.
METHODS
new()
Constructor for the UserAgent.
$ua = new LWP::UserAgent;
$ub = new LWP::UserAgent($ua); # clone existing UserAgent
isProtocolSupported($scheme)
You can use this method to query whether the library currently supports the specified scheme. The scheme might be a string (like 'http' or 'ftp') or it might be a URI::URL object reference.
simpleRequest($request, [$arg [, $size]])
This method dispatches a single WWW request on behalf of a user, and returns the response received. The $request should be a reference to an HTTP::Request object with values defined for at least the method() and url() attributes.
If $arg is a scalar, it is taken as a filename where the content of the response is stored.
If $arg is a reference to a subroutine, then this routine is called as chunks of the content are received. An optional $size argument is taken as a hint for an appropriate chunk size.
If $arg is omitted, then the content is stored in the response object.
request($request, [$arg [, $size]])
Process a request, including redirects and security. This method may actually send several different simple requests.
The arguments are the same as for simpleRequest().
redirectOK
This method is called by request() before it tries to do any redirects. It should return a true value if the redirect is allowed to be performed. Subclasses might want to override this.
getBasicCredentials($realm, $uri)
This is called by request() to retrieve credentials for a realm protected by Basic Authentication.
Should return username and password in a list. Return undef to abort the authentication resolution attempts.
This implementation simply checks a set of pre-stored member variables. Subclasses can override this method to e.g. ask the user for a username/password. An example of this can be found in the request program distributed with this library.
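A subclass override might look roughly like the sketch below. The class name TableAgent and the %credentials table are hypothetical, and the method is called directly at the end only to show what it returns; normally request() would call it. The sketch assumes the application has already loaded LWP::UserAgent, so it does not load it itself.

```perl
use strict;
use warnings;

package TableAgent;   # hypothetical subclass name

# Assumption: the application has loaded LWP::UserAgent already.
use parent -norequire, 'LWP::UserAgent';

# Hypothetical realm-to-credentials table; a real program might
# prompt the user instead.
my %credentials = (
    'Staff Area' => ['alice', 'secret'],
);

# Called with the realm and URI, as described above.
sub getBasicCredentials {
    my ($self, $realm, $uri) = @_;
    my $entry = $credentials{$realm};
    return @$entry if $entry;   # (username, password)
    return undef;               # abort further authentication attempts
}

package main;

# Direct call for demonstration purposes only.
my ($user, $pass) =
    TableAgent->getBasicCredentials('Staff Area', 'http://localhost/');
print "user: $user\n";
```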
mirror($url, $file)
Gets and stores a document identified by a URL, using If-Modified-Since and checking the content length. Returns a reference to the response object.
timeout()
agent()
useAlarm()
useEval()
Get/set member variables: respectively, the timeout value in seconds, the name of the agent, whether to use alarm() or not, and whether to handle internal errors internally by trapping them with eval.
proxy(...)
Set/retrieve proxy URL for a scheme:
$ua->proxy(['http', 'ftp'], 'http://www.oslonett.no:8001/');
$ua->proxy('gopher', 'http://web.oslonett.no:8001/');
The first form specifies that the URL is to be used as the proxy for every access scheme listed in the first argument, i.e. 'http' and 'ftp'.
The second form is a shorthand for specifying the proxy URL for a single access scheme.
envProxy()
$ua->envProxy();
Load proxy settings from *_proxy environment variables.
noProxy($domain)
$ua->noProxy('localhost', 'no', ...);
Do not proxy requests to the given domains. Calling noProxy without domains clears the list of domains.