NAME

smonitor - command-line tool for monitoring services

VERSION

version 0.2.3

SYNOPSIS

smonitor [-cfg <file>] [<output-options>] [<log-options>] [<other-options>]
   where <output-options> are: -outfile <file>
                               -onlyerr
                               -format human | tsv | html
                               -cssurl <url>
   where <log-options> are:    -logfile <file>
                               -loglevel debug | info | warn | error | fatal
                               -debug
                               -logformat <template>
   where <other-options> are:  -npp <integer>
                               -service[s] <service-name> [<service-name>...]
                               -nonotif

smonitor -showcfg
smonitor -lf

smonitor -h
smonitor -help
smonitor -man
smonitor -version

DESCRIPTION

smonitor is a command-line tool for monitoring (checking) various services and other parts of your IT infrastructure. To run it, you need a configuration file that defines which services to check and how to check them. Details on how to create such a configuration file are in the documentation of the Perl module Monitor::Simple; just type:

perldoc Monitor::Simple

OPTIONS

The command-line arguments and options can be specified with a single or a double dash. Most of them can be abbreviated to their shortest unambiguous prefix. They are case-sensitive.

-cfg <config-file>

Specifies which configuration file to use (read: which services to check). By default, the file monitor-simple-cfg.xml is used.

-service <service-name> [<service-name>...]

By default, smonitor checks all services specified in the configuration file. This parameter restricts the check to the named services. For example:

smonitor -cfg my.cfg -service synonia mrs

-outfile <file>

Specifies the file to which the report is written. By default, the report is written to standard output (but see also the possible combinations with the -onlyerr option).

A note about notifications: the -outfile parameter has nothing to do with notifications. Notifications are messages about the status of individual services; they are defined (if at all) in the configuration file, which also specifies where to send them and how to format them. The only smonitor parameter that influences notifications is -nonotif, which disables all of them.

-onlyerr

This option influences what will be reported on the standard output (STDOUT). The overall behaviour depends on the combination of -outfile and -onlyerr parameters:

-outfile <file>    -onlyerr    what will be done
-----------------------------------------------------
yes                no         all output to <file>

yes                yes        all output to <file>
                              + errors also on STDOUT

no                 no         all output to STDOUT

no                 yes        only errors to STDOUT

The variety of output destinations makes it possible to run smonitor as a "cron" job (a scheduled job) and to decide when the scheduling system reports the results. Just remember that the reports sent by the scheduling system are not the same as the notifications defined in the configuration file: these two ways of reporting the status of services are independent, and both can be used at the same time.
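
The four combinations in the table above can be sketched as a small routing function. This is only an illustration in Python, not code taken from smonitor: the function name route_report and the rule that any non-zero status counts as an error are assumptions made for this sketch.

```python
def route_report(lines, outfile_given, onlyerr):
    """Return (to_file, to_stdout) for a list of (status, text) pairs.

    Assumption for this sketch: a non-zero status marks an error line.
    """
    everything = [text for _status, text in lines]
    errors = [text for status, text in lines if status != 0]

    if outfile_given:
        # The file always receives the full report; STDOUT receives
        # errors only when -onlyerr is also given.
        return everything, (errors if onlyerr else [])

    # Without -outfile nothing goes to a file; STDOUT receives either
    # everything or, with -onlyerr, the errors alone.
    return [], (errors if onlyerr else everything)
```

With -outfile and -onlyerr together, a scheduler that mails only non-empty STDOUT would stay silent while all services are fine, and the full report would still be kept in the file.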

-format human | tsv | html

How the report will be formatted. Default is human:

DATE                           SERVICE                 STATUS  MESSAGE
Tue Sep 27 10:40:15 2011       Memory Check                 2  Memory CRITICAL - 91.7% (1601124 kB) used
Tue Sep 27 10:40:15 2011       Current timestamp            0  Tue Sep 27 10:40:15 2011
Tue Sep 27 10:40:15 2011       Born To Be Killed            2  Plugin 'Monitor/Simple/plugins/born-to-be-killed.pl' died with signal 9

Tue Sep 27 10:40:15 2011       Synonia Bad Params           2  500 Can't connect to Xdb.cbrc.kaust.edu.sa:80...

The tsv format is TAB-separated output, without any header line. The html format creates a simple HTML page with the report.
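
Because the tsv format has no header line, a consumer has to supply the column names itself. The following Python sketch assumes that the tsv columns appear in the same order as in the human format (date, service, status, message) and that the status is an integer; neither is stated explicitly here, so verify both against real output. The name parse_tsv_report is hypothetical.

```python
import csv
import io


def parse_tsv_report(text):
    """Parse a TAB-separated report into a list of dictionaries.

    Assumed column order (not confirmed by this manual):
    date, service, status, message.
    """
    rows = []
    reader = csv.reader(io.StringIO(text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    for fields in reader:
        if len(fields) < 4:
            continue  # skip blank or incomplete lines
        rows.append({
            "date": fields[0],
            "service": fields[1],
            "status": int(fields[2]),
            # Re-join in case the message itself contains TABs.
            "message": "\t".join(fields[3:]),
        })
    return rows
```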

-cssurl <url>

This is used only together with -format html. It specifies the URL of a CSS stylesheet that can change the look and feel of the HTML report page. See the source of the page for the names of the CSS classes.

-npp <integer>

A rarely used, rather technical parameter: it specifies the maximum number of parallel checks. Because each check is done by a new process, npp stands for "Number of Parallel Processes". Default is 10.
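
The idea of capping the number of simultaneously running check processes can be illustrated with the following Python sketch. It is not how smonitor itself is implemented (smonitor is written in Perl); the function name run_checks and the demo commands are assumptions, and only the default of 10 comes from the text above.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_checks(commands, npp=10):
    """Run each command as a child process, at most `npp` at a time.

    Returns the exit codes in the same order as `commands`.
    """
    def run_one(argv):
        # Each check is a separate process; the worker thread only waits.
        return subprocess.run(argv, stdout=subprocess.DEVNULL,
                              stderr=subprocess.DEVNULL).returncode

    # The pool size bounds how many child processes exist at once.
    with ThreadPoolExecutor(max_workers=npp) as pool:
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # Two stand-in "checks" that simply exit with a given status.
    demo = [[sys.executable, "-c", "raise SystemExit(%d)" % code]
            for code in (0, 2)]
    print(run_checks(demo, npp=2))
```

A lower limit reduces the load on the monitoring host at the cost of a longer total run when many services are configured.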

-nonotif

This option disables all notifications (as they are defined in the configuration file). It is useful, for example, when you are testing a new configuration file and do not wish to send emails, etc. about it.

Logging options

In addition to the report about the status of services (parameters -outfile and -onlyerr) and to the notifications (defined in the configuration file), there is also a logging mechanism that helps to trace in more detail how the checking is done. The logging is controlled by a few logging parameters, all of which have reasonable default values.

-logfile <logfile>

Where to put log records. By default, it appends records to the file smonitor.log (which is created if it does not exist yet). You can also specify STDOUT as the logfile:

-logfile STDOUT

-loglevel debug | info | warn | error | fatal

Each log record has its level of importance (five possible levels: from debug to fatal). This parameter tells which log records (read: records of what importance) will be created. A level also includes all the more severe levels listed after it. For example, level warn includes warn, error and fatal messages. Default level is info.
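
The threshold rule can be stated as a short Python sketch. The level names and the default come from the text above; the function name is_logged is hypothetical, and the actual filtering in smonitor is done by its Perl logging library, not by this code.

```python
# Levels ordered from least to most severe, as listed for -loglevel.
LEVELS = ["debug", "info", "warn", "error", "fatal"]


def is_logged(record_level, threshold="info"):
    """Return True if a record of `record_level` passes the threshold."""
    return LEVELS.index(record_level) >= LEVELS.index(threshold)
```

For example, with the threshold set to warn, a record at level info is dropped, while records at levels warn, error and fatal are written.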

-debug

It is the same as -loglevel debug.

-logformat <string>

It specifies how to format log records. Default format is "%d (%r) %p> %m%n", which produces log records that look like this:

2011/09/27 12:18:50 (97)  INFO> --- Checking started ---
2011/09/27 12:18:50 (100) DEBUG> Started: Monitor/Simple/plugins/check_mem.pl -u -w 55 -c 80
2011/09/27 12:18:50 (100) DEBUG> Started: Monitor/Simple/plugins/check-url.pl -cfg configs/simple-example-cfg.xml -service pubmed -logfile a.log -loglevel debug
2011/09/27 12:18:50 (30)  DEBUG> Invoking HTTP HEAD: http://www.ncbi.nlm.nih.gov/pubmed/
2011/09/27 12:18:51 (760) INFO> Done: pubmed 0 OK
2011/09/27 12:18:51 (971) INFO> --- Checking finished [0.872014999389648 s] ---

The columns in this example are: date (%d), number of milliseconds from the moment smonitor was started (%r), log level (%p) and log message (%m). More details about formats are in the documentation of the Perl module Log::Log4perl.
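
Log records in the layout of the sample above can be split back into their columns with a regular expression. The Python sketch below assumes that the records look exactly like the sample lines (date, milliseconds in parentheses, level followed by ">", then the message); a custom -logformat would need a different pattern. The name parse_log_line is hypothetical.

```python
import re

# Matches e.g. "2011/09/27 12:18:50 (97)  INFO> --- Checking started ---"
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"\((?P<ms>\d+)\)\s+"
    r"(?P<level>[A-Z]+)> "
    r"(?P<message>.*)$"
)


def parse_log_line(line):
    """Return a dict with date, ms, level and message, or None."""
    match = LOG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    return {
        "date": match["date"],
        "ms": int(match["ms"]),
        "level": match["level"],
        "message": match["message"],
    }
```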

-showcfg

It prints the name of the configuration file in use and its content, and then exits. It is mainly useful for debugging.

-lf

It prints the currently available formats (the values recognizable by the -format parameter) and exits:

$> smonitor -lf
html    Formatted as an HTML document
human   Easier readable by humans
tsv     TAB-separated (good for machines)

General options

-h

Prints a brief usage message and exits.

-help

Prints a brief usage message with options and exits.

-man

Prints a full usage message and exits.

-version

Prints the version and exits.

AUTHOR

Martin Senger <martin.senger@gmail.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2011 by Martin Senger, KAUST (King Abdullah University of Science and Technology). All Rights Reserved.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.