NAME

Apache::Logmonster

SYNOPSIS

Processor for Apache logs

DESCRIPTION

A tool that collects log files from multiple Apache web servers, splits them by virtual host, sorts the entries into chronological order, and then pipes them into a log file analyzer of your choice (webalizer, http-analyze, AWstats, etc.).

FEATURES

Log Retrieval from one or multiple hosts
Outputs to webalizer, http-analyze, and AWstats.
Automatic configuration by reading Apache config files.
Outputs stats into each virtual domain's stats directory, if that directory exists. (HINT: this is an easy way to enable or disable stats for a virtual host.)
Efficient: uses Compress::Zlib to read directly from .gz files to minimize disk use. Skips processing logs for vhosts with no $statsdir. Doesn't sort if you only have logs from one host.
Flexible: you can run it monthly, daily, or hourly.
Reporting: saves an on-disk activity report and an email-friendly report.
Reliable: lots of error checking so if something goes wrong, it'll give you a useful error message.
Understands and correctly deals with server aliases
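
The "reads directly from .gz files" feature can be illustrated with a minimal sketch. It uses the documented gzopen, gzreadline, and gzclose calls from Compress::Zlib. The helper name count_gz_lines is hypothetical and is not taken from Logmonster's source.

```perl
use strict;
use warnings;
use Compress::Zlib;    # exports gzopen and $gzerrno

# Count the lines in a gzip-compressed log without unpacking it to disk.
# count_gz_lines is a hypothetical helper, not part of Logmonster.
sub count_gz_lines {
    my ($path) = @_;
    my $gz = gzopen( $path, "rb" )
        or die "cannot open $path: $gzerrno\n";

    my $count = 0;
    my $line;

    # gzreadline returns the number of bytes read, and 0 at end of file.
    while ( $gz->gzreadline($line) > 0 ) {
        $count++;
    }
    $gz->gzclose();
    return $count;
}
```

Because each line is decompressed in memory as it is read, no uncompressed copy of the log needs to be written to disk first.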

INSTALLATION

Step 1 - Download and install (it's FREE!)

http://www.tnpi.biz/store/product_info.php?cPath=2&products_id=40

Install like every other perl module:

perl Makefile.PL
make test
make install 

To install the config file use "make conf" or "make newconf". newconf will overwrite any existing config file, so use it only for new installs.

Step 2 - Edit logmonster.conf

vi /usr/local/etc/logmonster.conf

Step 3 - Edit httpd.conf

Adjust the CustomLog and ErrorLog definitions. We make two changes, adding %v (the vhost name) to the CustomLog and adding cronolog to automatically rotate the log files.

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" %v" combined
CustomLog "| /usr/local/sbin/cronolog /var/log/apache/%Y/%m/%d/access.log" combined
ErrorLog "| /usr/local/sbin/cronolog /var/log/apache/%Y/%m/%d/error.log"

Step 4 - Test manually, then add to cron.

crontab -u root -e
5 1 * * * /usr/local/sbin/logmonster -d

Step 5 - Read the FAQ

http://www.tnpi.biz/internet/www/logmonster/faq.shtml

Step 6 - Enjoy

Allow Logmonster to make your life easier by handling your log processing. Enjoy the daily summary emails, and then express your gratitude by making a small donation to support future development efforts.

DEPENDENCIES

Compress::Zlib
Date::Parse (TimeDate)

report_hits

report_hits reads a day's log file and reports the results to standard output. The log file contains key/value pairs like so:

matt.simerson:4054
www.tnpi.biz:15381
www.nictool.com:895

Logmonster reads this file when called in -r (report) mode, which is expected to be invoked via an SNMP agent.
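
As an illustration of the key/value format above, here is a minimal sketch of parsing it into a hash. The sub name parse_hits is hypothetical; this is not the actual report_hits implementation.

```perl
use strict;
use warnings;

# Parse "vhost:count" lines into a hash of vhost name => hit count.
# parse_hits is a hypothetical name used only for this example.
sub parse_hits {
    my (@lines) = @_;
    my %hits;
    for my $line (@lines) {
        chomp $line;

        # The greedy match splits on the last colon, so the count is
        # always the final field. Lines that do not fit are skipped.
        next unless $line =~ /^(.+):(\d+)\s*$/;
        $hits{$1} = $2;
    }
    return %hits;
}

my %hits = parse_hits(
    "matt.simerson:4054\n",
    "www.tnpi.biz:15381\n",
    "www.nictool.com:895\n",
);
printf "%s: %d\n", $_, $hits{$_} for sort keys %hits;
```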

report_open

In addition to emailing you a copy of the report, Logmonster leaves behind a copy in the log directory.

check_stats_dir

Each domain on your web server is expected to have a "stats" dir. I name mine "stats" and locate it in the domain's DocumentRoot, owned by root so that the user doesn't delete it. This sub first goes through the list of files in (by default) /var/log/apache/tmp/doms, which contains one file for each vhost. The file name matches the vhost name, and the contents are the log entries that correspond to that vhost.

If the file is zero bytes, it deletes it as there is nothing to do.

Otherwise, it gathers the vhost name from the file and checks the %domains hash to see if a directory path exists for that vhost. If no hash entry is found or the entry is not a directory, then we declare the hits unmatched and discard them.

For log files with entries, we check inside the docroot for a stats directory. If no stats directory exists, then we discard those entries as well.
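
The decision steps above can be sketched as a single function. The sub name classify_vhost_log, its return values, and the assumed shape of the domains hash (vhost name mapped to DocumentRoot path) are all illustrative assumptions, not Logmonster's actual code.

```perl
use strict;
use warnings;

# Decide what to do with one per-vhost log file.
# classify_vhost_log, its return strings, and the layout of %$domains
# (vhost name => DocumentRoot path) are assumptions for illustration.
sub classify_vhost_log {
    my ( $file, $domains, $stats_name ) = @_;
    $stats_name = "stats" unless defined $stats_name;

    # Zero-byte file: nothing to process, the caller can delete it.
    return "empty" if -z $file;

    # The file name (last path component) is the vhost name.
    my ($vhost) = $file =~ m{([^/]+)$};
    return "unmatched" unless defined $vhost;

    # No hash entry, or the entry is not a directory: hits are unmatched.
    my $docroot = $domains->{$vhost};
    return "unmatched" unless defined $docroot && -d $docroot;

    # A docroot without a stats directory means stats are disabled.
    return "no_stats" unless -d "$docroot/$stats_name";

    return "process";
}
```

A caller would delete the file for "empty", discard the entries for "unmatched" and "no_stats", and pass the file on for sorting only when the result is "process".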

feed_the_machine

feed_the_machine takes the sorted vhost logs and feeds them into the stats processor that you chose.

sort_vhost_logs

At this point, we'll have collected the Apache logs from each web server and split them up based on which vhost they were served for. However, most stats processors require the logs to be sorted in chronological order. So, we open up each vhost's logs for the day, read them into a hash, sort them based on their log entry date, and then write them back out.
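
A minimal sketch of sorting log lines by their timestamp follows. Note that it differs from the approach described above: it sorts an array of [epoch, line] pairs rather than using a hash, and it parses dates with the core Time::Local module instead of Date::Parse so that it runs without extra modules. The sub names log_epoch and sort_log_lines are hypothetical.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

my %MONTH = (
    Jan => 0, Feb => 1, Mar => 2, Apr => 3, May => 4,  Jun => 5,
    Jul => 6, Aug => 7, Sep => 8, Oct => 9, Nov => 10, Dec => 11,
);

# Convert the "[10/Oct/2004:13:55:36 -0700]" field of a log line into
# epoch seconds. Returns undef if no valid timestamp is found.
sub log_epoch {
    my ($line) = @_;
    my ( $d, $mon, $y, $h, $m, $s, $sign, $zh, $zm ) = $line =~
        m{\[(\d{2})/(\w{3})/(\d{4}):(\d{2}):(\d{2}):(\d{2}) ([+-])(\d{2})(\d{2})\]}
        or return;
    return unless exists $MONTH{$mon};

    # The logged time is local time, so subtract the zone offset for UTC.
    my $offset = ( $zh * 3600 + $zm * 60 ) * ( $sign eq '-' ? -1 : 1 );
    return timegm( $s, $m, $h, $d, $MONTH{$mon}, $y ) - $offset;
}

# Sort lines chronologically. Each timestamp is parsed only once.
# Lines without a parsable timestamp sort first (epoch 0).
sub sort_log_lines {
    my (@lines) = @_;
    return map  { $_->[1] }
           sort { $a->[0] <=> $b->[0] }
           map  { [ log_epoch($_) // 0, $_ ] } @lines;
}
```

Converting to UTC before comparing matters when the servers in a cluster log in different time zones.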

split_logs_to_vhosts

After collecting the log files from each server in the cluster, we need to split them up based upon the vhost they were intended for. This sub does that.
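
A minimal sketch of the splitting step, assuming the LogFormat from Step 3 so that the vhost name (%v) is the last whitespace-separated field of each line. The sub name split_by_vhost is hypothetical, and lowercasing the vhost name is a choice made for this example, not something the original documents.

```perl
use strict;
use warnings;

# Group log lines by the vhost name that %v appends as the last field.
# split_by_vhost is a hypothetical name used only for this example.
sub split_by_vhost {
    my (@lines) = @_;
    my %by_vhost;
    for my $line (@lines) {
        chomp $line;

        # Take the final non-blank token; skip lines that have none.
        my ($vhost) = $line =~ /(\S+)\s*$/ or next;

        # Host names are case-insensitive, so normalize the key.
        push @{ $by_vhost{ lc $vhost } }, $line;
    }
    return \%by_vhost;
}
```

Each key of the returned hash would then correspond to one per-vhost file, with the array holding that file's log entries.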

AUTHOR

Matt Simerson <matt@tnpi.biz>

BUGS

None known. Report any to the author.

TODO

Support for analog.

Support for individual webalizer.conf file for each domain

Delete log files older than X days/months

Do something with error logs (other than just compress)

If files to process are larger than 10MB, find a nicer way to sort them rather than reading them all into a hash. Now I create two hashes, one with data and one with dates. I sort the date hash, and using those sorted hash keys, output the data hash to a sorted file. This is necessary as wusage and http-analyze require logs to be fed in chronological order. Take a look at awstats logresolvemerge as a possibility.

SEE ALSO

http://www.tnpi.biz/internet/www/logmonster

COPYRIGHT

Copyright (c) 2003-2004, The Network People, Inc. All rights reserved.

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

Neither the name of the The Network People, Inc. nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.