NAME

LaBrea::Tarpit::Get

SYNOPSIS

use LaBrea::Tarpit::Get;

($rv,$host,$port,$path)=parse_http_URL($url)
($handle,$host,$port,$path)=open_http(*S,$url);
$rv=parse_http_response(\$buffer,\%response);
$rv=short_response($url,\%response,\%content,$timeout);
$line = make_line($url,$err,\%content);
$rv = not_hour($file);
$rv = not_day($file);
$rv=auto_update($url,$file,$cur_ver,$timeout);

DESCRIPTION - LaBrea::Tarpit::Get

Module connects to a web site running LaBrea::Tarpit::Report::html_report.plx and retrieves a short_report as described in LaBrea::Tarpit::Report.

Run examples/web_scan.pl from a cron job hourly or daily to update the statistics from all know sites running LaBrea::Tarpit. A report can then be generated showing the activity worldwide.

# MIN HOUR DAY MONTH DAYOFWEEK   COMMAND
30 * * * * ./web_scan.pl ./other_sites.txt ./tmp/site_stats

See: LaBrea::Tarpit::Report::other_sites

($handle,$host,$port,$path)= parse_http_URL($url);

Separate an http URL into its components

  input:	URL of the form
	http://www.foo.com[:8080]/file.html

  https:// service is not supported

  returns: (undef, error message)
		or
	   (file_handle,hostname,port,path)
	where port and path may be empty
($handle,$host,$port,$path)=open_http(*S,$url);

Open connection to http target

  input:	*S,$url	[default port = 80]
  returns:	(undef, error)	on error

		(file_handle,
		 hostname,
		 port
		 path )		on success
$rv=parse_http_response(\$buffer,\%response);

Parse an http server response into a hash of headers.

  i.e.	(representative, will vary)

  rc		 => 200
  msg		 => OK
  date		 => Wed, 24 Apr 2002 21:46:30 GMT
  server	 => Apache/1.3.22
  protocol	 => HTTP/1.1
  content-type	 => text/plain
  content-length => 92
  last-modified	 => Wed, 24 Apr 2002 21:46:34 GMT
  expires	 => Wed, 24 Apr 2002 21:47:04 GMT
  connection	 => close
  content	 => (complete text buffer)

  input:	\$text_in, \%response
  returns:	true on success, %response filled
		false on failure

  NOTE:		%response{rc}	(server response code)
		%response(msg}	(server messages)
		are ALWAYS filled with something.
		In the case of server failure, the 
		cause of the failure will be inserted
		into %response(msg} and undef returned.
$rv=short_response($url,\%response,\%content,$timeout);

Fetch the short report from $url and place the headers in %response, the content, parsed, in %content. Optional $timeout, default is 60 seocnds.

%response contains http headers

%content contains key => value pairs

  LaBrea	=> version
  Tarpit	=> version
  Report	=> version
  Util		=> version
  now		=> seconds since epoch (local)
  tz		=> time zone (i.e. -0700)
  threads	=> number of threads
  total_IPs	=> total IP's
  bw		=> bandwidth

  input:	URL,	# complete url
	   i.e. www.foo.com/html_report.plx
		\%response,
		\%content,

  returns:	false on success
		error message on failure
$line = make_line($url,$err,\%content);

Make a line of text summarizing the short report where $err is the return value from short_report

Format:

url threads total_IPs bw time tz version:nn:nn:nn
  or
url error message
$rv = not_hour($file);

Check if the file has been accessed this hour;

  input:	path/to/file
  returns:	true, not current hour
		false if accessed this hour
		or non-existent or not readable
$rv = not_day($file);

Check if the file has been accessed this day;

  input:	path/to/file
  returns:	true, not accessed this day
		false if accessed this day
		or non-existent or not readable
$rv=auto_update($url,$file,$cur_ver,$timeout);

Update the 'other_sites.txt' file from $url on a daily basis only.

  input:  url,	# complete url to 'other_sites.txt'
	# http://scans.bizsystems.net/other_sites.txt

          file,	# path to your 'other_sites.txt'

	  cur_ver	# optional current version
	# the current file will be opened and scanned
	# if this is not supplied

	  timeout	# wait for http response	
	# default 60 seconds
  returns:	false on success or no update needed
		error msg on failure

EXPORT_OK

parse_http_URL
open_http
parse_http_response
short_response
make_line
not_hour
not_day
auto_update

COPYRIGHT

Copyright 2002 - 2004, Michael Robinton & BizSystems This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

AUTHOR

Michael Robinton, michael@bizsystems.com

SEE ALSO

perl(1), LaBrea::Tarpit(3), LaBrea::Codes(3), LaBrea::Tarpit::Report(3), LaBrea::Tarpit::Util(3), LaBrea::Tarpit::DShield(3)