NAME
Mail::Graph - draw graphical stats for mails/spams
SYNOPSIS
    use Mail::Graph;

    $graph = Mail::Graph->new(
        items  => 'spam',
        output => 'spams/',
        input  => '~/Mail/spam/',
    );
    $graph->generate();
DESCRIPTION
This module parses mailbox files in either compressed or uncompressed form and then generates pretty statistics and graphs about them. Although at first developed to do spam statistics, it works just fine for normal mail.
File Format
The module reads files in mbox format. These can be compressed with gzip or be plain text. Since the module reads every file in a directory, it can also handle maildir-style folders, i.e. a directory where each mail resides in its own file.
The file format is quite simple and looks like this:
    From sample_foo@example.com Tue Oct 27 18:38:52 1998
    Received: from barfel by foo.example.com (8.9.1/8.6.12)
    From: forged_bar@example.com
    X-Envelope-To: <sample_foo@example.com>
    Date: Tue, 27 Oct 1998 09:52:14 +0100 (CET)
    Message-Id: <199810270852.12345567@example.com>
    To: <none@example.com>
    Subject: Sorry...
    X-Loop-Detect: 1
    X-Spamblock: caught by rule dummy@

    This is a sample spam
Basically, each mail is an email header plus an email body, and the mails are separated by the "From " lines.
The following fields are examined to determine the listed information:
    X-Envelope-To          the target address/domain
    From address@domain    the sender
    From date              the receiving date
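As an illustration of the field lookup described above, here is a minimal sketch in plain Perl. It is not taken from Mail::Graph itself: the helper name extract_fields and the regular expressions are assumptions made for this example, and the module's own parsing may well differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Mail::Graph: pull the three documented
# fields out of one raw mail (header and body as a single string).
sub extract_fields {
    my ($message) = @_;
    my %field;

    # "From address@domain date" separator line: sender and receiving date
    if ($message =~ /^From (\S+)\@(\S+)\s+(.+)$/m) {
        @field{qw(sender sender_domain received)} = ("$1\@$2", $2, $3);
    }
    # X-Envelope-To header: target address and domain (angle brackets optional)
    if ($message =~ /^X-Envelope-To:\s*<?([^\s<>\@]+)\@([^\s<>]+)>?/mi) {
        @field{qw(target target_domain)} = ("$1\@$2", $2);
    }
    return \%field;
}

# A shortened version of the sample mail shown in this section
my $sample = join "\n",
    'From sample_foo@example.com Tue Oct 27 18:38:52 1998',
    'From: forged_bar@example.com',
    'X-Envelope-To: <sample_foo@example.com>',
    'Subject: Sorry...',
    '',
    'This is a sample spam';

my $fields = extract_fields($sample);
print "$_: $fields->{$_}\n" for sort keys %$fields;
```

Note that the first pattern requires "From " followed by a space, so the "From:" header (which may be forged) is not mistaken for the separator line.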
METHODS
new()
Create a new Mail::Graph object.
The following options exist:
    input           Path to a directory containing (gzipped) mbox files.
                    Alternatively, the name of a (gzipped) mbox file
    index           Directory where to write (and read) the index files
    output          Directory where to write the output stats
    items           Try 'spams' or 'mails' (can be any string)
    generate        hash with names of stats to generate (1=on, 0=off):
                      month            per each month of the year
                      day              per each day of the month
                      hour             per each hour of the day
                      dow              per each day of the week
                      yearly           per year
                      daily            per each day (with average)
                      monthly          per each month
                      toplevel         per top-level domain
                      rule             per filter rule that matched
                      target           per target address
                      domain           per target domain
                      last_x_days      items for each of the last x days;
                                       set it to the number of days you want
                      score_histogram  show histogram of SpamAssassin scores;
                                       set it to the step-width (like 5)
                      score_daily      SA score for each of the last x days;
                                       set it to the number of days you want
                      score_scatter    SA scatter score diagram; set it to
                                       the limit of the score (a line will be
                                       drawn there)
    average         set to 0 to disable, otherwise it gives the number
                    of days/weeks/months to average over
    average_daily   number of days to average over in the daily graph;
                    if not set, uses average, 0 to disable
    height          base height of the generated images
    template        name of the template file (ending in .tpl) that is
                    used to generate the html output, e.g. 'index.tpl'
    no_title        set to 1 to disable graph titles, default 0
    filter_domains  array ref with list of domains to show as "unknown"
    filter_target   array ref with list of targets (regular expressions)
    graph_ext       extension of the generated graphs, default 'png'
    last_date       in yyyy-mm-dd format: specify the last used date, any
                    mail newer than that will be skipped. Defaults to today
    first_date      in yyyy-mm-dd format: specify the first used date, any
                    mail older than that will be skipped. Defaults to
                    undef, meaning any old mail will be considered.
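To show how several of these options might be combined, here is a hedged sketch. The option names come from the list above, but every value (paths, dates, numbers, the filtered domain) is an illustrative assumption, and the options are passed as a flat key/value list only because the SYNOPSIS does so. The constructor call is guarded so that the sketch also runs where Mail::Graph is not installed.

```perl
use strict;
use warnings;

# Option names are taken from the documented list; all values are made up
# for illustration and should be adapted to your own setup.
my %options = (
    input          => '~/Mail/spam/',
    output         => 'spams/',
    index          => 'index/',
    items          => 'spams',
    template       => 'index.tpl',
    height         => 200,
    average        => 7,
    first_date     => '2002-01-01',
    filter_domains => [ 'example.com' ],
    generate       => {
        month           => 1,    # on
        dow             => 1,    # on
        daily           => 1,    # on
        toplevel        => 0,    # off
        last_x_days     => 30,   # number of days
        score_histogram => 5,    # step-width
    },
);

# List the stats that are switched on (any true value counts as "on" here)
my @enabled = sort grep { $options{generate}{$_} } keys %{ $options{generate} };
print "enabled stats: @enabled\n";

# Only try to build the object when the module can actually be loaded
my $graph = eval { require Mail::Graph; Mail::Graph->new(%options) };
print defined $graph
    ? "Mail::Graph object created\n"
    : "Mail::Graph object not created: $@\n";
```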
generate()
Generate the stats, fill in the template and write it out. Takes no options.
error()
Return an error message or undef for no error.
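One possible way to use generate() and error() together is sketched below. The documentation does not say at which point an error is recorded, so checking error() right after generate() is an assumption. The wrapper generate_or_die and the stand-in class Fake::Graph are hypothetical names invented for this example; Fake::Graph only mimics the two documented methods so that the sketch runs without Mail::Graph installed.

```perl
use strict;
use warnings;

# Stand-in class for demonstration only; it is NOT part of Mail::Graph and
# merely offers generate() and error() with the documented return values.
package Fake::Graph {
    sub new      { my ($class, $error) = @_; return bless { error => $error }, $class }
    sub generate { return }
    sub error    { return $_[0]{error} }    # message, or undef for no error
}

# Hypothetical wrapper: run generate() and turn a defined error() into an
# exception, so that callers cannot overlook a failure.
sub generate_or_die {
    my ($graph) = @_;
    $graph->generate();
    my $error = $graph->error();
    die "Mail::Graph error: $error\n" if defined $error;
    return 1;
}

# No error recorded: the wrapper returns true
print "generated\n" if generate_or_die(Fake::Graph->new(undef));

# Error recorded: the wrapper dies and the message ends up in $@
eval { generate_or_die(Fake::Graph->new('no input files')) };
print "caught: $@";
```

With a real object, the same wrapper would be called as generate_or_die($graph), with $graph created by Mail::Graph->new().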
BUGS
There are a couple of known bugs; some of them are unfinished features or problems in GD::Graph:
- Divide by Zero
This is a bug in some versions of GD::Graph: generating a graph with only one bar makes it crash with this error. If you encounter this, please bug the author of GD::Graph and send me a copy.
- Argument "4, 0.7%" isn't numeric
You might get a lot of warnings like

    Argument "4, 0.7%" isn't numeric in numeric lt (<) at
    /usr/lib/perl5/site_perl/5.8.2/GD/Graph/Data.pm line 231.

This is a problem with GD::Graph: Mail::Graph wants to use labels like "4, 0.7%", but GD::Graph uses the same string for the label and the value of the point/bar, and thus Perl warns. This needs a small patch to GD::Graph that strips anything non-numeric out of the label before using it in numeric context. Please bug the author of GD::Graph and send me a copy.
- gzipped archives are not included in the stats
Some of the gzipped archives seem to trigger a bug in Compress::Zlib, at least up to version 1.32. For instance, on my system one of the sample archives in /sample/archives/ is not read properly by Compress::Zlib. I have already notified the author of Compress::Zlib.
LICENSE
This program is free software; you may redistribute it and/or modify it under the same terms as Perl itself.
AUTHOR
(c) Copyright by Tels http://bloodgate.com/ 2002.