NAME

slpolice - Warn and renice top CPU hogs

SYNOPSIS

slpolice [ --help ] [ --port=port ] [ --dhost=host ] [ --cpu_min=minutes ] [ --renice_min=minutes ] [ --reserved_min=minutes ] [ --version ]

DESCRIPTION

slpolice determines the top CPU users across a cluster of hosts. It sends mail to the owner of any process that has accumulated more than a specified amount of CPU time.

It also sends mail if a user has held a host reservation for longer than a specified time.

Usually slpolice is run with a crontab entry similar to:

5 8-21 * * * /usr/local/bin/slpolice --cpu_min 120 --reserved_min 120 >/dev/null 2>&1

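This runs at five minutes past each hour from 08:05 through 21:05 and warns about processes that have used more than 2 hours of CPU time and hosts that have been reserved for more than 2 hours. It does not check at night, so long overnight jobs do not receive warnings.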
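To make the schedule concrete, the following minimal sketch lists the times of day at which the crontab entry above fires. The function name cron_run_times is hypothetical and used only for illustration; it simply enumerates minute 5 of hours 8 through 21.

```shell
# List the times of day at which the crontab schedule "5 8-21 * * *" fires:
# minute 5 of every hour from 8 through 21, every day.
# cron_run_times is a hypothetical helper, not part of slpolice.
cron_run_times() {
    hour=8
    while [ "$hour" -le 21 ]; do
        printf '%02d:05\n' "$hour"
        hour=$((hour + 1))
    done
}

cron_run_times
```

The first check is at 08:05 and the last at 21:05, so a job started in the evening is not examined again until the next morning.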

ARGUMENTS

--help

Displays this message and program version and exits.

--port <portnumber>

Specifies the port number that slchoosed uses.

--dhost <hostname>

Specifies the host name that slchoosed uses. May be given multiple times to specify backup hosts. Defaults to the SLCHOOSED_HOST environment variable, which contains colon-separated host names.
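As an illustration of the colon-separated format, the sketch below expands a host list into the repeated --dhost form. The helper name dhost_args and the host names alpha and beta are hypothetical, and the assumption that the two forms are equivalent is inferred from the description above rather than from the slpolice source.

```shell
# Hypothetical helper (not part of slpolice): expand a colon-separated
# host list into "--dhost <host>" pairs, one pair per line.
# Empty entries (for example from "a::b" or a trailing colon) are skipped.
dhost_args() {
    list=$1
    while [ -n "$list" ]; do
        host=${list%%:*}              # text before the first colon
        if [ -n "$host" ]; then
            printf '%s %s\n' --dhost "$host"
        fi
        case $list in
            *:*) list=${list#*:} ;;   # drop the first entry and its colon
            *)   list="" ;;           # last entry consumed
        esac
    done
}

dhost_args "alpha:beta"
```

Under that assumption, setting SLCHOOSED_HOST to alpha:beta would correspond to passing --dhost alpha --dhost beta on the command line.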

--cpu_min

Number of CPU minutes a job must accumulate before it is reported to the user. Defaults to 0, which is off.

--renice_min

Number of minutes after which a high-CPU process whose nice value is not 10 is reniced to 19. Defaults to 0, which is off.

--reserved_min

Number of minutes a host may be reserved before reporting it to the user. Defaults to 0, which is off.
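The three minute thresholds above share the convention that 0 disables the check. The sketch below shows one way such a test could be written. The function name threshold_hit is hypothetical, and treating a value exactly equal to the limit as a hit is an assumption, since the text does not say whether the comparison is inclusive.

```shell
# Hypothetical helper (not slpolice's actual code): succeed when a
# threshold is enabled (non-zero) and the observed minutes have reached it.
# Assumption: reaching the limit exactly counts as a hit.
threshold_hit() {
    minutes=$1
    limit=$2
    [ "$limit" -gt 0 ] && [ "$minutes" -ge "$limit" ]
}

threshold_hit 150 120 && echo "150 minutes, limit 120: report"
threshold_hit 150 0   || echo "150 minutes, limit 0: check is off"
```

With --cpu_min 120, a process with 150 CPU minutes would be reported under this reading, while leaving the option at its default of 0 would never trigger a report.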

--version

Displays program version and exits.

SEE ALSO

nicercizerd, Schedule::Load

This program is most valuable when used with the nicercizerd program, or with an operating system where nice 19 processes get only leftover CPU resources. It requires a program called nice19, a version of nice that is setgid root and renices a job to 19. nice19 comes with nicercizerd.

DISTRIBUTION

This package is distributed via CPAN.

Nicercizerd is available from http://veripool.com.

AUTHORS

Wilson Snyder <wsnyder@wsnyder.org>