NAME
Helios::Configuration - Helios configuration parameter reference
DESCRIPTION
The Helios system defines a large number of configuration parameters. Some of these affect the operation of the helios.pl daemon, others affect worker processes, and some can affect both. Aside from these reserved parameter names, the configuration parameters your Helios service uses are largely up to you.
Helios configuration parameters that affect worker process launching and management are usually in ALL CAPS. This helps set them apart from other application-level parameters.
helios.pl PARAMETERS
Collective Database Configuration Parameters
These are the most important parameters in helios.ini. They must be declared in the [global] section. Without them, helios.pl will be unable to connect to the collective database and will fail to start.
dsn
dsn=dbi:Oracle:SHARDEV
The dsn parameter is the DBI datasource name of the Helios collective database.
user
user=scott
The user parameter is the user to use when connecting to the Helios collective database.
password
password=tiger
The password parameter is the password to use when connecting to the Helios collective database.
options
options=private_option=>'string',private_option2=>'another string'
The options parameter supplies special DBI options for the connection to the Helios collective database. This parameter is normally unnecessary, but it is available for users who need to pass special parameters in their database connections.
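Because helios.pl fails to start when the connection parameters are missing, a pre-flight check on helios.ini can catch the problem earlier. The following is a minimal sketch, not part of Helios: the helper name missing_global_params is hypothetical, and it uses Python's configparser rather than the parser Helios itself uses, so it only approximates how the file is read.

```python
import configparser

# Hypothetical helper (not a Helios API): the three connection parameters
# that the text above says must be declared in the [global] section.
REQUIRED_GLOBAL = ("dsn", "user", "password")


def missing_global_params(ini_text):
    """Return the required [global] parameters absent from ini_text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep parameter names case-sensitive
    parser.read_string(ini_text)
    if not parser.has_section("global"):
        return list(REQUIRED_GLOBAL)
    return [name for name in REQUIRED_GLOBAL
            if not parser.has_option("global", name)]


SAMPLE = """\
[global]
dsn=dbi:Oracle:SHARDEV
user=scott
password=tiger
"""

if __name__ == "__main__":
    print(missing_global_params(SAMPLE))
```

An empty list means all three parameters are present; the optional options parameter is deliberately not checked.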
Other [global] Parameters
pid_path
pid_path=/home/helios/run
Sets the path where the helios.pl daemon will write its PID files. This should be an absolute path to a directory writable by the Helios user (the user the helios.pl daemon will run as). Each helios.pl service daemon will write a PID file incorporating the name of the service class it has loaded into this directory.
If this directory does not exist or is not writable by the Helios user, the helios.pl daemon will fail to start.
DEFAULT: /var/run/helios
registration_interval
The number of seconds to wait before a helios.pl service daemon "checks in" to the collective database. Periodically helios.pl will update a table in the Helios collective database for monitoring purposes. This allows the Helios::Panoptes admin console to provide the Collective Admin view, and enables Panoptes and other utilities to see if a helios.pl service daemon has crashed or has encountered some other type of error. The default 60 seconds should be fine for most purposes, but can be increased to reduce database load if necessary.
DEFAULT: 60
Service-specific Tuning Parameters for helios.pl
There are several parameters useful for tuning the helios.pl service daemon to work better with your Helios service. Helios and the helios.pl daemon default to behavior that should work well for processing jobs that last a short amount of time (generally 30 seconds or less). If your jobs consistently last longer than a minute, or can potentially put a strain on resources like a database or a file server, you may wish to adjust the following parameters.
These parameters are not dynamic and should be set in the Helios conf file, either in the [global] section or a section matching your service's name.
master_launch_interval
master_launch_interval=5
The master_launch_interval is the amount of time in seconds helios.pl waits after launching workers before it attempts to launch workers again. Normally the default of 1 second is fine, but if you need to slow how quickly new worker processes are started, you can increase this number.
DEFAULT: 1
zero_launch_interval
zero_launch_interval=30
The zero_launch_interval is the amount of time in seconds helios.pl waits to launch workers again after the MAX_WORKERS limit has been reached. Once helios.pl launches the MAX_WORKERS number of workers, it will not launch more even if there are available jobs in the queue. If a particular service's jobs usually take longer than the default of 10 seconds, or you are using OVERDRIVE mode so your worker processes persist until no more jobs are available, increasing zero_launch_interval may decrease needless database queries. For most cases, the default of 10 seconds should be adequate.
DEFAULT: 10
zero_sleep_interval
zero_sleep_interval=20
The zero_sleep_interval is the amount of time in seconds between checks for available jobs in the job queue when the job queue is empty. If the helios.pl daemon determines there are no available jobs for a service, it sleeps zero_sleep_interval seconds and then checks again. If there are available jobs, it starts to launch workers; if there are still none, it sleeps another zero_sleep_interval seconds and checks again. This can cause jobs to "sit" in the queue for some seconds before workers are launched to service them. If you do not have enough jobs consistently entering the job queue to keep workers running in OVERDRIVE mode, decreasing this number will make helios.pl more responsive by launching workers for your jobs sooner (at the expense of extra repeated queries of the job queue in the database). If your jobs can wait in the job queue for a while and you do not have many entering the system, increasing this number can reduce the number of needless queries to your database.
DEFAULT: 10
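The trade-off described for zero_sleep_interval can be made concrete with a little arithmetic. This is an illustrative sketch only: it assumes the daemon found the queue empty at time 0 and then checks at exact multiples of the interval, ignoring query time and launch overhead. The function name is hypothetical.

```python
import math


def seconds_until_noticed(arrival, zero_sleep_interval=10):
    """Delay before a job is seen by the next empty-queue check.

    `arrival` is the number of seconds after an empty-queue check at
    which the job entered the queue. ASSUMPTION: checks happen at
    0, interval, 2 * interval, ... with no other delays.
    """
    next_check = math.ceil(arrival / zero_sleep_interval) * zero_sleep_interval
    return next_check - arrival
```

Under these assumptions a job that arrives one second after a check waits 9 seconds with the default interval and 19 seconds with zero_sleep_interval=20, which is the responsiveness cost of reducing database queries.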
WORKER PROCESS MANAGEMENT
The following configuration parameters affect the management of workers and how they run services and process jobs. These are most typically set in the collective database configuration table for each service, thus they are ALL CAPS to separate them from your services' own configuration parameters. Unlike the parameters in the previous section, these configuration parameters are dynamic and can be changed via Helios::Panoptes, the helios_config_* shell commands, or SQL commands to your collective database.
HOLD
HOLD=1
Puts a Helios service in Hold Mode. All worker processes shut down after finishing the current job, and the helios.pl service daemon ignores available jobs in the job queue. Set HOLD to 0 or delete it from the configuration to cause Helios to return to Normal Mode.
DEFAULT: 0
HALT
HALT=1
Causes a helios.pl service daemon and all its workers to shut down. When HALT is set for a service, worker processes exit after the current job is finished. The helios.pl service daemon waits WORKER_MAX_TTL_WAIT_INTERVAL seconds for workers to finish, then sends any remaining workers a SIGKILL signal to eliminate stragglers. The daemon then removes its registration entry from the collective database and exits.
Warning: BE CAREFUL about setting a HALT for a service for all hosts (hostname='*'). This will shut down all instances of that Helios service in the ENTIRE collective, and the only way to restart them is to log into each host and start them manually. If you need to perform maintenance on hosts in a production Helios collective, you most likely want to HOLD all instances of a service and then HALT each instance individually as needed.
DEFAULT: none (the presence of HALT in the config causes a shutdown regardless of its value)
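HOLD and HALT are evaluated differently: HOLD depends on its value, while HALT takes effect by merely being present. The sketch below models that distinction with a plain dictionary standing in for a service's configuration. The function name is hypothetical, and treating HOLD values other than 1 as Normal Mode is an assumption the text above does not settle.

```python
def service_mode(config):
    """Classify a service's mode from its configuration parameters.

    `config` is a plain dict standing in for the service's config;
    this is a model of the documented rules, not Helios code.
    """
    # HALT shuts the service down regardless of its value.
    if "HALT" in config:
        return "halt"
    # HOLD=1 means Hold Mode; 0 or absent means Normal Mode.
    # ASSUMPTION: any other value is treated as Normal Mode here.
    if str(config.get("HOLD", 0)).strip() == "1":
        return "hold"
    return "normal"
```

Note that under this model a configuration containing HALT=0 still halts the service, so HALT has to be deleted rather than zeroed.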
MAX_WORKERS
MAX_WORKERS=10
Along with OVERDRIVE, MAX_WORKERS is the most powerful configuration parameter in the Helios framework. Setting MAX_WORKERS allows a helios.pl service daemon to launch multiple workers at a time to service jobs, up to the MAX_WORKERS limit.
Normally, when the helios.pl service daemon sees available jobs in the job queue, it starts to launch worker processes to service them. It launches workers gradually, one at a time, to avoid overtaxing resources (and to give the launched workers time to actually run the jobs). If the queue holds at least as many jobs as the MAX_WORKERS value, helios.pl will "blitz" (launch the maximum number of workers) to attempt to run the jobs in the queue as quickly as possible. This "blitzing" feature is controlled by the WORKER_BLITZ_FACTOR parameter, so if you want Helios to blitz workers before there are that many jobs available in the queue, adjust WORKER_BLITZ_FACTOR downward to allow helios.pl to launch more worker processes faster.
DEFAULT: 1
OVERDRIVE
OVERDRIVE=1
Setting OVERDRIVE causes a worker process to persist in memory and continue running jobs from the job queue until all available jobs for the loaded service are exhausted. Coupled with MAX_WORKERS, it allows you to maximize job throughput by eliminating repeated process startup procedures and enabling caching of database connections and other data structures.
Unless your service is designed to run long-running jobs lazily, you almost certainly want to set OVERDRIVE to 1. It is set to 0 by default because indiscriminately running untested, potentially unsafe services can cause unexpected, even disastrous behavior. Make sure your service runs in Normal Mode first, then test it thoroughly in Overdrive Mode before you deploy it.
DEFAULT: 0
WORKER_LAUNCH_PATTERN
WORKER_LAUNCH_PATTERN=prog
- cons
- prog
- oppt
The default for Helios 2.8 is 'cons', but this is likely to change in the future.
DEFAULT: cons (workers are launched one at a time when jobs are available)
PRIORITIZE_JOBS
PRIORITIZE_JOBS=low
DEFAULT: 0 (jobs are pulled from the database at random)
LAZY_CONFIG_UPDATE
LAZY_CONFIG_UPDATE=1
Use LAZY_CONFIG_UPDATE to increase worker process performance by reducing the number of configuration parameter refreshes a worker process performs in Overdrive Mode. In Overdrive Mode, a worker process refreshes the service configuration from the collective database just before it calls the service's run() method. With LAZY_CONFIG_UPDATE set to 1, this configuration refresh is performed only before every 10th job the worker process runs, reducing database queries and thus increasing performance.
NOTE: The configuration refresh is where worker processes pick up HOLD and HALT parameters, so using LAZY_CONFIG_UPDATE will cause worker processes to be less responsive when holding jobs or halting the service. If your service's configuration does not change often, you can activate LAZY_CONFIG_UPDATE and see whether your service experiences a noticeable performance increase.
DEFAULT: 0
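To gauge how many configuration queries LAZY_CONFIG_UPDATE saves, the refresh counts can be compared directly. This is a rough model, not Helios code: the function name is hypothetical, and it assumes the lazy refresh happens before the 1st, 11th, 21st (and so on) job, which is one possible reading of "every 10th job".

```python
def config_refreshes(jobs_run, lazy_config_update=False, lazy_every=10):
    """Count config refreshes one Overdrive worker performs.

    Without LAZY_CONFIG_UPDATE there is one refresh per job.
    ASSUMPTION: with it, the refresh happens before jobs 1, 11, 21, ...
    """
    if not lazy_config_update:
        return jobs_run
    # Integer ceiling division: one refresh per started block of jobs.
    return -(-jobs_run // lazy_every)
```

Under this model a worker that runs 25 jobs performs 25 refreshes normally and 3 with LAZY_CONFIG_UPDATE=1. The same model shows the cost: a HOLD or HALT set mid-block may go unnoticed for up to nine further jobs.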
WORKER_MAX_TTL
WORKER_MAX_TTL=300
The maximum amount of time in seconds to allow a worker process for a service to run. If a worker process continues to run past this threshold, the helios.pl service daemon will assume it has become stuck in some way and will send it a SIGKILL signal (9) to kill it (real world situations have shown softer signals are unreliable in such situations). If you set this and find worker processes not experiencing problems are being unnecessarily killed, you may need to increase the WORKER_MAX_TTL_WAIT_INTERVAL (below).
DEFAULT: none; workers running in Normal Mode run until their job is complete; workers in Overdrive Mode work until no more jobs are available in the job queue.
WORKER_MAX_TTL_WAIT_INTERVAL
Number of seconds a helios.pl service daemon will wait for a worker that has reached its WORKER_MAX_TTL to exit. If a worker process continues running past WORKER_MAX_TTL + WORKER_MAX_TTL_WAIT_INTERVAL seconds, helios.pl will assume the worker process has hung in some way and will send it a SIGKILL (9) signal to kill it.
DEFAULT: 30
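The interaction of WORKER_MAX_TTL and WORKER_MAX_TTL_WAIT_INTERVAL comes down to a single comparison. The sketch below restates it; the function name is hypothetical, and None stands in for an unset WORKER_MAX_TTL.

```python
def should_sigkill(elapsed, worker_max_ttl=None, wait_interval=30):
    """Decide whether a worker has outlived its allowed run time.

    `elapsed` is how long the worker has been running, in seconds.
    WORKER_MAX_TTL has no default, so None means no limit applies.
    """
    if worker_max_ttl is None:
        return False
    # Killed only once it runs past TTL plus the wait interval.
    return elapsed > worker_max_ttl + wait_interval
```

With WORKER_MAX_TTL=300 and the default wait interval of 30, a worker is left alone at 330 seconds and becomes eligible for SIGKILL after that.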
DOWNSHIFT_ON_NONZERO_RUN
DOWNSHIFT_ON_NONZERO_RUN=1
This parameter exists to support certain legacy behaviors for Helios services developed before Helios 2.40. You almost certainly do not need to set this.
DEFAULT: 0 (ignore the return value of the service's run() method)
LOGGING
loggers
loggers=HeliosX::Logger::Syslog,HeliosX::Logger::Log4perl
Specify a comma-separated list of external logging classes to use to log information. Each of the modules listed should implement the Helios::Logger interface class.
Each logger class likely will have its own configuration parameters; see the logger's documentation for the appropriate configuration information.
The Helios internal logger (Helios::Logger::Internal) is automatically added to this list, unless internal_logger (below) is turned off.
DEFAULT: None
internal_logger
internal_logger=off
Whether the Helios internal logger (Helios::Logger::Internal) should be used to log information. The internal logger logs information to a table in the Helios collective database, and is the log system used by the Helios::Panoptes System Log view. If you want to use only an external logging system such as HeliosX::Logger::Log4perl, you can turn off the internal logger completely by setting internal_logger to 0 or 'off'.
DEFAULT: on
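The loggers and internal_logger parameters combine into one effective list of logger classes. The following sketch models that combination with a plain dictionary; the function name is hypothetical, and appending the internal logger at the end of the list is an assumption, since the text above does not say where it is placed.

```python
INTERNAL_LOGGER = "Helios::Logger::Internal"


def effective_loggers(config):
    """Return the logger classes in effect for a configuration dict."""
    raw = config.get("loggers", "")
    names = [name.strip() for name in raw.split(",") if name.strip()]
    # internal_logger defaults to on; 0 or 'off' disables it.
    setting = str(config.get("internal_logger", "on")).strip().lower()
    if setting not in ("0", "off"):
        # ASSUMPTION: list position of the internal logger is unspecified.
        names.append(INTERNAL_LOGGER)
    return names
```

With no logging parameters set, the result is just the internal logger. With internal_logger=off and no loggers declared, the list is empty, so nothing would be logged.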
log_priority_threshold
log_priority_threshold=5
The log level above which the internal logger discards log messages. Specifying a log_priority_threshold will cause log messages of a lower priority (higher numeric value) to be discarded. For example, a log_priority_threshold of 6 (LOG_INFO) will cause log messages with a priority of 7 (LOG_DEBUG) to be discarded.
See the Helios::Logger::Internal documentation for more information on log thresholds.
DEFAULT: undefined (all log messages are logged)
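Because lower priorities carry higher numbers, the threshold comparison is easy to get backwards. This small sketch restates the rule using the two priority values named above; the function name is hypothetical, and None stands in for an undefined log_priority_threshold.

```python
LOG_INFO = 6
LOG_DEBUG = 7


def is_logged(priority, log_priority_threshold=None):
    """Return True if the internal logger keeps a message.

    An undefined threshold (None) keeps every message. Otherwise,
    messages numerically above the threshold are discarded.
    """
    if log_priority_threshold is None:
        return True
    return priority <= log_priority_threshold
```

With log_priority_threshold=6, a LOG_INFO message is kept and a LOG_DEBUG message is discarded, matching the example in the text.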
AUTHOR
Andrew Johnson, <lajandy at cpan dot org>
COPYRIGHT AND LICENSE
Copyright (C) 2012-3 by Logical Helion, LLC.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.0 or, at your option, any later version of Perl 5 you may have available.
WARRANTY
This software comes with no warranty of any kind.