NAME

slpolice - Warn and renice top cpu hogs

SYNOPSIS

slpolice [ --help ] [ --port=port ] [ --dhost=host ] [ --warn_any ] [ --version ]

DESCRIPTION

slpolice determines the top CPU users across a cluster of hosts. If a process has accumulated over 1 hour of CPU time and its nice value is not 10, slpolice renices the process to 19 and sends mail.

Mail is sent to the user whose process is reniced. If --warn_any is used, mail is also sent whenever the CPU limit is exceeded, even if the nice value is already 10.
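The policy described above can be sketched as a small decision function. This is only an illustration of the documented rules, not slpolice's actual code: the names decide and Action are hypothetical, and the sketch takes the text literally in treating a nice value of exactly 10 as "already niced".

```python
from dataclasses import dataclass
from typing import Optional

CPU_LIMIT_SECONDS = 3600  # "over 1 hour of cpu time", per the description
ALREADY_NICED = 10        # nice value the description treats as already niced
RENICE_TARGET = 19        # nice value an offending process is moved to


@dataclass
class Action:
    renice_to: Optional[int]  # new nice value, or None to leave it alone
    send_mail: bool           # whether the owning user gets mail


def decide(cpu_seconds: float, nice: int, warn_any: bool = False) -> Action:
    """Hypothetical helper: apply the documented policy to one process."""
    if cpu_seconds <= CPU_LIMIT_SECONDS:
        # Under the CPU limit: nothing to do.
        return Action(renice_to=None, send_mail=False)
    if nice != ALREADY_NICED:
        # Over the limit and not yet niced: renice and notify the user.
        return Action(renice_to=RENICE_TARGET, send_mail=True)
    # Over the limit but already niced: mail only when --warn_any is given.
    return Action(renice_to=None, send_mail=warn_any)
```

For example, decide(7200, 0) asks for a renice to 19 plus mail, while decide(7200, 10) does nothing unless warn_any is set.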

Usually slpolice is run with a crontab entry similar to:

5 7 * * * /usr/local/bin/slpolice --warn_any >/dev/null 2>&1
5 8-21 * * * /usr/local/bin/slpolice >/dev/null 2>&1

This sends a warning in the first hour that a process exceeds the CPU limit, and a reminder every day at 7 am. No checks run at night, so long overnight jobs do not receive warnings.
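To make the effect of the two crontab entries concrete, the following sketch maps an hour of the day to the kind of run that happens at five minutes past that hour. The function name mode_for_hour and its return strings are hypothetical and exist only for illustration.

```python
from typing import Optional


def mode_for_hour(hour: int) -> Optional[str]:
    """Hypothetical helper mirroring the two crontab entries above.

    Returns which slpolice invocation fires at minute 5 of the given
    hour (0-23), or None if nothing is scheduled.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be in the range 0-23")
    if hour == 7:
        return "warn_any"  # "5 7 * * *": daily reminder run with --warn_any
    if 8 <= hour <= 21:
        return "normal"    # "5 8-21 * * *": hourly check during the day
    return None            # hours 22-23 and 0-6: no checks at night
```

Under this schedule a job started after the 21:05 run is not checked until 07:05 the next morning.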

ARGUMENTS

--help

Displays this message and program version and exits.

--port <portnumber>

Specifies the port number that slchoosed uses.

--dhost <hostname>

Specifies the host name that slchoosed uses. May be specified multiple times to specify backup hosts. Defaults to the SLCHOOSED_HOST environment variable, which contains colon-separated host names.

--warn_any

Specifies that any job with over an hour of CPU time should produce a warning, even if the job is already niced.

--version

Displays program version and exits.
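As an illustration of how the --dhost default might be resolved, the sketch below splits SLCHOOSED_HOST on colons when no --dhost values are given. The function name slchoosed_hosts is hypothetical, and the assumptions that command-line values fully replace the environment variable and that empty entries are skipped are not confirmed by this page.

```python
import os
from typing import List, Mapping, Optional, Sequence


def slchoosed_hosts(dhost_args: Sequence[str],
                    environ: Optional[Mapping[str, str]] = None) -> List[str]:
    """Hypothetical helper: return candidate slchoosed hosts in order."""
    env = os.environ if environ is None else environ
    if dhost_args:
        # Assumption: explicit --dhost values take precedence, in the
        # order given, with later ones acting as backups.
        return list(dhost_args)
    # Fall back to the colon-separated environment variable,
    # dropping empty entries such as those from a trailing colon.
    value = env.get("SLCHOOSED_HOST", "")
    return [host for host in value.split(":") if host]
```

With SLCHOOSED_HOST set to "alpha:beta" and no --dhost arguments, this yields alpha as the primary host and beta as the backup.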

SEE ALSO

nicercizerd, Schedule::Load

This program is most valuable when used with the nicercizerd program, or on an operating system where nice 19 processes get only leftover CPU resources. It requires a program called nice19, which is a version of nice that is setgid root and renices a job to 19. nice19 comes with nicercizerd.

DISTRIBUTION

This package is distributed via CPAN.

Nicercizerd is available from http://www.ultranet.com/~wsnyder/veripool.

AUTHORS

Wilson Snyder <wsnyder@world.std.com>
