Miscellaneous Settings
Program mode
Timing interval length
Aggressive host checking option
Method to use to determine time between checks
Global host event handler
Global service event handler
Inter-check sleep time
Service check interleave factor
Maximum concurrent checks
Service reaper frequency
Program Mode |
Format: |
program_mode=<a/s> |
Example: |
program_mode=a |
Timing Interval Length |
Format: |
interval_length=<seconds> |
Example: |
interval_length=60 |
This is the number of seconds per "unit interval" used for timing in the scheduling queue,
re-notifications, etc. "Unit intervals" are used in the host configuration file to determine
how often to run a service check, how often to re-notify a contact, etc.
Important: The default value for this is set to 60, which means that a "unit value"
of 1 in the host configuration file will mean 60 seconds (1 minute). I have not really tested other
values for this variable, so proceed at your own risk if you decide to change it!
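The arithmetic behind unit intervals can be sketched as follows. This is only an illustration of the conversion described above; the function name units_to_seconds is hypothetical and does not appear anywhere in NetSaint.

```python
def units_to_seconds(units, interval_length=60):
    """Convert a number of unit intervals into seconds.

    interval_length mirrors the interval_length option (default 60).
    """
    return units * interval_length


# With the default of 60, a check interval of 5 units means 5 minutes.
print(units_to_seconds(5))      # 300
# With interval_length=1, the same value would mean 5 seconds.
print(units_to_seconds(5, 1))   # 5
```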
Aggressive Host Checking Option |
Format: |
use_agressive_host_checking=<0/1> |
Example: |
use_agressive_host_checking=0 |
Beginning with release 0.0.4, NetSaint tries to be a little smarter about how and when it checks the status of hosts. In general, disabling this option will allow NetSaint to make some smarter decisions and check hosts a bit faster. Enabling this option will increase the amount of time required to check hosts, but may improve reliability a bit. If you want to know more about exactly what this option does, search the source code in the netsaint.c file for the string "use_agressive_host_checking" and read some of the comments I've added. Unless you have problems with NetSaint not recognizing that a host recovered, I would suggest not enabling this option.
- 0 = Don't use aggressive host checking (default)
- 1 = Use aggressive host checking
Inter-Check Delay Method |
Format: |
inter_check_delay_method=<n/d/s> |
Example: |
inter_check_delay_method=s |
This option allows you to control how service checks are initially "spread out" in the event queue. Using a "smart" delay calculation (the default) will cause NetSaint to calculate an average check interval and spread initial checks of all services out over that interval, thereby helping to eliminate CPU load spikes. Using no delay is generally not recommended unless you are testing the service check parallelization functionality. Using no delay will cause all service checks to be scheduled for execution at the same time. This means that you will generally have large CPU spikes when the services are all executed in parallel. Values are as follows:
- n = Don't use any delay - schedule all service checks to run immediately (i.e. at the same time!)
- d = Use a "dumb" delay of 1 second between service checks
- s = Use a "smart" delay calculation to spread service checks out evenly (default)
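The three delay methods can be compared with a small sketch. It assumes that the "smart" method divides the average check interval by the number of services, which is one reading of the description above and not necessarily the exact calculation NetSaint performs. The function name initial_offsets and its inputs are hypothetical.

```python
def initial_offsets(check_intervals, method="s"):
    """Return the initial start offset (in seconds) for each service check.

    check_intervals: per-service check intervals in seconds (illustrative input).
    method: 'n' (no delay), 'd' (1 second apart) or 's' (spread evenly).
    """
    count = len(check_intervals)
    if count == 0:
        return []
    if method == "n":
        delay = 0.0
    elif method == "d":
        delay = 1.0
    elif method == "s":
        # Assumption: spread all first checks across one average interval.
        delay = (sum(check_intervals) / count) / count
    else:
        raise ValueError(f"unknown method: {method!r}")
    return [i * delay for i in range(count)]


intervals = [300, 300, 300, 300]   # four services, each checked every 5 minutes
print(initial_offsets(intervals, "n"))   # [0.0, 0.0, 0.0, 0.0]
print(initial_offsets(intervals, "d"))   # [0.0, 1.0, 2.0, 3.0]
print(initial_offsets(intervals, "s"))   # [0.0, 75.0, 150.0, 225.0]
```

With no delay every check lands at offset 0, which is where the CPU spike comes from; the smart method places the last check just short of one full interval.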
Global Host Event Handler Option |
Format: |
global_host_event_handler=<command> |
Example: |
global_host_event_handler=log-host-event-to-db |
This option allows you to specify a host event handler command that is to be run for every host state change. The global event handler is executed immediately prior to the event handler that you have optionally specified in each host definition. The command argument is the short name of a command that you define in your host configuration file. More information on event handlers can be found in the event handler documentation.
Global Service Event Handler Option |
Format: |
global_service_event_handler=<command> |
Example: |
global_service_event_handler=log-service-event-to-db |
This option allows you to specify a service event handler command that is to be run for every service state change. The global event handler is executed immediately prior to the event handler that you have optionally specified in each service definition. The command argument is the short name of a command that you define in your host configuration file. More information on event handlers can be found in the event handler documentation.
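The ordering of global and per-definition event handlers can be sketched like this. The function handlers_to_run and the command name restart-httpd are hypothetical; the sketch only shows the order described above, not how NetSaint actually launches the commands.

```python
def handlers_to_run(global_handler, own_handler):
    """Return handler command names in execution order for one state change."""
    order = []
    if global_handler:      # from global_host_event_handler / global_service_event_handler
        order.append(global_handler)
    if own_handler:         # optional handler from the host or service definition
        order.append(own_handler)
    return order


print(handlers_to_run("log-service-event-to-db", "restart-httpd"))
# ['log-service-event-to-db', 'restart-httpd']
print(handlers_to_run("log-service-event-to-db", None))
# ['log-service-event-to-db']
```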
Inter-Check Sleep Time |
Format: |
sleep_time=<seconds> |
Example: |
sleep_time=1 |
This is the number of seconds that NetSaint will sleep before checking to see if the next service check in the
scheduling queue should be executed. Note that NetSaint will only sleep after it "catches up" with queued
service checks that have fallen behind.
Service Interleave Factor |
Format: |
service_interleave_factor=<s|n> |
Example: |
service_interleave_factor=s |
This variable determines how service checks are interleaved. Interleaving allows for a more even distribution of service checks, reduced load on remote hosts, and faster overall detection of host problems. With the introduction of service check parallelization, remote hosts could get bombarded with checks if interleaving were not implemented. This could cause the service checks to fail or return incorrect results if the remote host were overloaded with processing other service check requests. Setting this value to 1 is equivalent to not interleaving the service checks (this is how versions of NetSaint prior to 0.0.5 worked). Set this value to s (smart) for automatic calculation of the interleave factor unless you have a specific reason to change it. The best way to understand how interleaving works is to watch the status CGI (detailed view) when NetSaint is just starting. You should see that the service check results are spread out as they begin to appear.
- n = A number greater than or equal to 1 that specifies the interleave factor to use. An interleave factor of 1 is equivalent to not interleaving the service checks.
- s = Use a "smart" interleave factor calculation (default)
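The effect of a numeric interleave factor can be illustrated with a sketch. The text does not spell out the scheduling algorithm, so this assumes one plausible reading: services are sorted by host, and the scheduler walks the list taking every n-th entry. The function name interleave and the service names are hypothetical, and the "smart" calculation is not modelled.

```python
def interleave(services, factor):
    """Reorder a host-sorted service list using a fixed interleave factor."""
    if factor < 1:
        raise ValueError("interleave factor must be >= 1")
    ordered = []
    for start in range(factor):
        # Take every factor-th service, beginning at this starting offset.
        ordered.extend(services[start::factor])
    return ordered


services = ["A/ping", "A/http", "B/ping", "B/http", "C/ping", "C/http"]
print(interleave(services, 1))
# ['A/ping', 'A/http', 'B/ping', 'B/http', 'C/ping', 'C/http']
print(interleave(services, 2))
# ['A/ping', 'B/ping', 'C/ping', 'A/http', 'B/http', 'C/http']
```

With a factor of 1 the order is unchanged, so both checks for host A run back to back. With a factor of 2 consecutive checks go to different hosts.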
Maximum Concurrent Service Checks |
Format: |
max_concurrent_checks=<max_checks> |
Example: |
max_concurrent_checks=20 |
This option allows you to specify the maximum number of service checks that can be run in parallel at any given time. Specifying a value of 1 for this variable essentially prevents any service checks from being parallelized. You'll have to modify this value based on the system resources you have available on the machine that runs NetSaint, as it directly affects the maximum load that will be imposed on the system (processor utilization, memory, etc.).
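The idea of capping parallel checks can be sketched with a bounded worker pool. This is an illustration only: it uses Python threads as a stand-in for however NetSaint actually runs checks, and run_checks and fake_check are hypothetical names.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_checks(checks, max_concurrent_checks):
    """Run callables with at most max_concurrent_checks active at once.

    Returns (results, peak), where peak is the highest observed concurrency.
    """
    lock = threading.Lock()
    active = 0
    peak = 0

    def tracked(check):
        nonlocal active, peak
        with lock:
            active += 1
            peak = max(peak, active)
        try:
            return check()
        finally:
            with lock:
                active -= 1

    # The pool size plays the role of max_concurrent_checks.
    with ThreadPoolExecutor(max_workers=max_concurrent_checks) as pool:
        results = list(pool.map(tracked, checks))
    return results, peak


def fake_check():
    time.sleep(0.01)    # stand-in for running a plugin
    return "OK"


results, peak = run_checks([fake_check] * 10, max_concurrent_checks=3)
print(len(results), peak <= 3)   # 10 True
```

Passing max_concurrent_checks=1 makes the checks run strictly one after another, which matches the note above about a value of 1 preventing parallelization.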
Service Reaper Frequency |
Format: |
service_reaper_frequency=<frequency_in_seconds> |
Example: |
service_reaper_frequency=10 |
This option allows you to control the frequency in seconds of service "reaper" events. "Reaper" events process the results from parallelized service checks that have finished executing. These events constitute the core of the monitoring logic in NetSaint.
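A reaper event can be pictured as draining a queue of finished results. The sketch below assumes results are collected in an in-memory queue, which is an illustrative choice and not a description of NetSaint's internals. The function name reap and the sample results are hypothetical. A scheduler would call it once every service_reaper_frequency seconds.

```python
import queue


def reap(finished):
    """Drain every result currently waiting and return them as a list."""
    reaped = []
    while True:
        try:
            reaped.append(finished.get_nowait())
        except queue.Empty:
            return reaped


finished = queue.Queue()
finished.put(("web1", "HTTP", "OK"))
finished.put(("db1", "PING", "CRITICAL"))

print(reap(finished))   # both results, in the order they finished
print(reap(finished))   # [] until more checks finish
```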