NAME

AnyEvent::Watchdog::Util - watchdog control and process management

SYNOPSIS

    use AnyEvent::Watchdog::Util;

DESCRIPTION

This module can control the watchdog started by using AnyEvent::Watchdog in your main program, but it has useful functionality even when not running under the watchdog at all, such as program exit hooks.

VARIABLES/FUNCTIONS

The module supports the following variables and functions:

AnyEvent::Watchdog::Util::enabled

Return true when the program is running under the regime of AnyEvent::Watchdog, false otherwise.

    AnyEvent::Watchdog::Util::enabled
       or die "watchdog not enabled...";
    AnyEvent::Watchdog::Util::restart;

Note that a defined but false return value means that AnyEvent::Watchdog is running, but you are in the watchdog process itself. In that case you have probably done something very wrong.

AnyEvent::Watchdog::Util::restart_in [$timeout]

Tells the supervisor to restart the process when it exits (enable autorestart), or forcefully after $timeout seconds (minimum 1, maximum 255, default 60).

This function disables the heartbeat, if it was enabled. Also, after calling this function the watchdog will ignore any further requests until the program has restarted.

Good to call before you intend to exit, in case your clean-up handling gets stuck.
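As a minimal sketch of that pattern, the following arms the supervisor before starting clean-up. The names save_state and shutdown_for_restart are hypothetical placeholders, and the 30 second timeout is an arbitrary choice:

```perl
use strict;
use warnings;
use AnyEvent::Watchdog::Util;

# Hypothetical placeholder for the program's real clean-up work
# (flushing state, closing connections and so on).
sub save_state { return 1 }

# Hypothetical helper: arm the supervisor, clean up, then exit.
sub shutdown_for_restart {
    # Arm the supervisor first: restart when we exit, or forcefully
    # after 30 seconds if the clean-up below never finishes.
    AnyEvent::Watchdog::Util::restart_in(30);

    save_state();
    exit 0;
}
```

Calling restart_in before the clean-up, rather than after it, is what protects against a hung clean-up step.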

AnyEvent::Watchdog::Util::restart [$timeout]

Just like restart_in, but also calls exit 0. This makes it the ideal way to force a restart.

AnyEvent::Watchdog::Util::autorestart [$boolean]
use AnyEvent::Watchdog autorestart => $boolean

Enables or disables autorestart (initially disabled; if $boolean is omitted, autorestart is enabled). By default, the supervisor will exit if the program exits or dies in any way. When autorestart behaviour is enabled, the supervisor will instead try to restart the program after it dies.

Note that the supervisor will never autorestart when the child died with SIGINT or SIGTERM.

AnyEvent::Watchdog::Util::heartbeat [$interval]
use AnyEvent::Watchdog heartbeat => $interval

Tells the supervisor to automatically kill the program if it doesn't react for $interval seconds (minimum 1, maximum 255, default 60), then installs an AnyEvent timer that sends a regular heartbeat to the supervisor twice as often.

Exit behaviour isn't changed, so if you want a restart instead of an exit, you have to call autorestart.

The heartbeat frequency can be changed as often as you want; an interval of 0 disables the heartbeat check again.
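A minimal sketch of combining the two calls, so that an unresponsive program is killed and then restarted rather than left dead. The helper name enable_supervision is hypothetical, and the sketch assumes the main program loads AnyEvent::Watchdog so that a supervisor exists:

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::Watchdog::Util;

# Hypothetical helper: ask for both a liveness check and a restart.
sub enable_supervision {
    my ($interval) = @_;

    # Without this, a program killed by the supervisor stays dead.
    AnyEvent::Watchdog::Util::autorestart(1);

    # Kill the program if no heartbeat arrives for $interval seconds.
    AnyEvent::Watchdog::Util::heartbeat($interval);

    return;
}

enable_supervision(30);
```

Because the heartbeat is sent from an AnyEvent timer, the program has to keep running its event loop (for example by waiting on a condition variable) for the heartbeats to be delivered.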

AnyEvent::Watchdog::Util::on_exit { BLOCK; shift->() }

Installs an exit hook that is executed when the program is about to exit, while event processing is still active to some extent.

The hook should do whatever it needs to do (close active connections, disable listeners, write state, free resources etc.). When it is done, it should call the code reference that has been passed to it.

This means you can install event handlers and return from the block, and the program will not exit until the callback is invoked.
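A minimal sketch of such a deferred hook. The timer is only a stand-in for real asynchronous clean-up (such as waiting for pending writes to be flushed), and the half-second delay is an arbitrary assumption:

```perl
use strict;
use warnings;
use AnyEvent;
use AnyEvent::Watchdog::Util;

AnyEvent::Watchdog::Util::on_exit {
    my $done = shift;   # code reference to call once clean-up has finished

    # Return from the block immediately; the program keeps running
    # until the timer callback invokes $done.
    my $w; $w = AnyEvent->timer(after => 0.5, cb => sub {
        undef $w;       # release the watcher
        $done->();      # tell on_exit that this hook is finished
    });
};
```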

Exiting "the right way" is surprisingly difficult. This is what on_exit does:

It installs watchers for SIGTERM, SIGINT, SIGXCPU and SIGXFSZ, as well as an END block (the END block is actually registered in AnyEvent::Watchdog, if possible, so it executes as late as possible). The signal handlers remember the signal and then call exit, invoking the END callback.

The END block then checks for an exit code of 255, in which case nothing happens (255 is the exit code that results from a program error), otherwise it runs all on_exit hooks and waits for their completion using the event loop.

After all on_exit hooks have finished, the program will either be exited with the relevant status code (if exit was the cause for the program exit), or it will reset the signal handler, unblock the signal and kill itself with the signal, to ensure that the exit status is correct.
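The last step, dying by the original signal so that the parent sees the correct status, can be sketched in plain Perl as follows. This illustrates the general technique only and is not the module's actual code; the helper name die_by_signal is hypothetical:

```perl
use strict;
use warnings;
use Config;
use POSIX ();

# Map signal names (TERM, INT, ...) to their numbers.
my %signum;
@signum{ split ' ', $Config{sig_name} } = split ' ', $Config{sig_num};

# Hypothetical helper: terminate the current process with the named
# signal, so the parent observes death by signal, not a plain exit.
sub die_by_signal {
    my ($signame) = @_;

    # Restore the default disposition so no Perl handler intercepts it.
    $SIG{$signame} = 'DEFAULT';

    # Make sure the signal is not blocked.
    POSIX::sigprocmask(POSIX::SIG_UNBLOCK,
                       POSIX::SigSet->new($signum{$signame}));

    # Deliver the signal to ourselves.
    kill $signame => $$;
}
```

A parent process that waits for a child terminated this way finds the signal number in the low seven bits of $?, just as if the signal had never been caught.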

If the program is running under the watchdog, and autorestart is enabled, then the heartbeat is disabled and the watchdog is told that the program wishes to exit within 60 seconds, after which it will be forcefully killed.

All of this should ensure that on_exit hooks are only executed when the program is in a sane state and data structures are still intact. This only works when the program does not install its own TERM (etc.) watchers, of course, as there is no control over them.

There is currently no way to unregister on_exit hooks.

SEE ALSO

AnyEvent.

AUTHOR

Marc Lehmann <schmorp@schmorp.de>
http://home.schmorp.de/