NAME
checksite - Check the contents of a website
SYNOPSIS
$ checksite [options] -p <name> uri
OPTIONS
--prefix|-p <name>  The prefix (dir) of this check [mandatory]
--dir|-d <dir>      The target directory
--[no]save          Save validation results
--load              Load the validation results
--novalidate        Skip the W3 validation
--by_uri            Validate by sending the uri to W3
--by_upload         Validate by uploading the contents to W3
--nostrictrules     Do not impose /robots.txt on the validator
--lang|-l <lang>    Set language(s) for Accept-Language: header
-v                  increase verbosity (multiple)
--help|-h           This message
DESCRIPTION
This program spiders the specified uri and checks the availability of the links, images and stylesheets on each page. Pages and stylesheets are also validated with the validators available at http://validator.w3.org and http://jigsaw.w3.org.
When all pages are checked, two reports in HTML format are generated. The full.html report contains all the information for all pages; the summ.html report contains only the pages with errors, together with those errors.
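The difference between the two reports can be sketched in a few lines of Python. This is only an illustration of the selection rule described above, not the program's own code: the record layout, the `errors` field and the `split_reports` function are hypothetical names chosen for the example.

```python
# Hypothetical per-page records; the field names are illustrative only
# and are not taken from checksite itself.
pages = [
    {"uri": "http://www.example.com/", "errors": []},
    {"uri": "http://www.example.com/about",
     "errors": ["image without ALT attribute"]},
]


def split_reports(pages):
    """Return (full, summary).

    full    -- every page that was checked (the full.html idea)
    summary -- only the pages with at least one error (the summ.html idea)
    """
    full = list(pages)
    summary = [page for page in pages if page["errors"]]
    return full, summary


full, summary = split_reports(pages)
for page in summary:
    print(page["uri"], page["errors"])
```

A site without problems would therefore yield an empty summary, while the full report still lists every page.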
Metrics for a spidered page
Each page fetched by the spider will have these metrics:
status, status_tx
The HTTP return code and a textual explanation of that code.
title
The contents of the <title></title> tag.
ct
The MIME type returned by the HTTP-server for the document.
valid
The HTML-validation result.
links
A list of <a href=>, <area href=> and <frame src=> URIs found on the page, each with its HTTP return code. Each tag is also checked for link text or an ALT/TITLE attribute.
link_cnt, links_ok
The number of links found and the number of links that are ok.
images
A list of <img src=> and <input type=image> URIs found on the page, each with its HTTP return code and MIME type. Each tag is also checked for the existence of the ALT attribute.
image_cnt, images_ok
The number of images found and the number of images that are ok.
styles
A list of <link rel=stylesheet type=text/css> URIs found on the page, each with its HTTP return code, MIME type and CSS-validation result.
style_cnt, styles_ok
The number of stylesheets found and the number of stylesheets that are ok.
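As a rough illustration of how the links, images and styles lists could be gathered from a single page, here is a minimal Python sketch built on the standard `html.parser` module. It is not the program's implementation: the `PageCollector` class and the `collect` function are hypothetical, and the sketch only extracts URIs and the ALT check. Fetching each URI to obtain its HTTP return code and MIME type (which the `*_ok` counts depend on) is deliberately left out.

```python
from html.parser import HTMLParser


class PageCollector(HTMLParser):
    """Collect link, image and stylesheet URIs from one HTML page.

    Hypothetical helper for illustration; not part of checksite.
    """

    def __init__(self):
        super().__init__()
        self.links = []    # URIs from <a href>, <area href>, <frame src>
        self.images = []   # (uri, has_alt) from <img src>, <input type=image>
        self.styles = []   # URIs from <link rel=stylesheet>

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attr = dict(attrs)
        if tag in ("a", "area") and attr.get("href"):
            self.links.append(attr["href"])
        elif tag == "frame" and attr.get("src"):
            self.links.append(attr["src"])
        elif tag == "img" and attr.get("src"):
            self.images.append((attr["src"], "alt" in attr))
        elif tag == "input" and attr.get("src"):
            if (attr.get("type") or "").lower() == "image":
                self.images.append((attr["src"], "alt" in attr))
        elif tag == "link" and attr.get("href"):
            rel = (attr.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                self.styles.append(attr["href"])


def collect(html):
    """Parse one page and return the lists plus their *_cnt values."""
    parser = PageCollector()
    parser.feed(html)
    parser.close()
    return {
        "links": parser.links,
        "link_cnt": len(parser.links),
        "images": parser.images,
        "image_cnt": len(parser.images),
        "styles": parser.styles,
        "style_cnt": len(parser.styles),
    }
```

Calling `collect()` on the text of a page returns a dictionary whose keys mirror the metric names above; an image entry whose second element is false marks a tag without an ALT attribute.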
SEE ALSO
AUTHOR
Abe Timmerman, <abeltje@cpan.org>
BUGS
Please report any bugs or feature requests to bug-WWW-CheckSite@rt.cpan.org, or through the web interface at http://rt.cpan.org. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.
COPYRIGHT & LICENSE
Copyright MMV Abe Timmerman, All Rights Reserved.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.