Problem
Certain messages from Nagios can contain unmapped state values. When the status and event feeders form updates to Foundation from such messages, the updates fail because the values are not recognized as valid data. The failures can be detected in the framework log. They incur some overhead and can interfere with other concurrent activity at the Java, caching, and database levels.
An additional condition can occur when a Child Server transmits status and events to the Parent. The Parent records the current time as the timestamp, rather than the actual time of the originating service check (for example, a GDMA check reported to the Child). The fault is in the way timestamp sending is specified, which leaves an open case where the Parent is given the current time rather than the time reported for each host or service.
Finally, an empty timestamp in Foundation could cause the feeder to fail repeatedly when that empty value is offered to it.
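The two timestamp conditions above can be pictured with a small sketch. This is not the feeders' actual code (the feeders are Perl scripts). The function name pick_timestamp and its arguments are hypothetical, and falling back to the server's time when the reported timestamp is empty is an assumption made purely for illustration.

```shell
#!/bin/sh
# Illustrative only: choose the timestamp to record for one host or service.
# Usage: pick_timestamp SEND_ACTUAL CHECK_TIME NOW
#   SEND_ACTUAL  "true" or "false" (mirrors send_actual_check_timestamps)
#   CHECK_TIME   epoch seconds reported by the originating check; may be empty
#   NOW          epoch seconds on the receiving server
pick_timestamp() {
    send_actual=$1
    check_time=$2
    now=$3
    if [ "$send_actual" = "true" ] && [ -n "$check_time" ]; then
        # Preserve the time of the original check.
        printf '%s\n' "$check_time"
    else
        # Option disabled, or no usable time was reported: use server time.
        printf '%s\n' "$now"
    fi
}
```

For example, pick_timestamp true 1500000000 "$(date +%s)" prints 1500000000, while passing an empty second argument prints the current time instead of an empty value.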
Solution
This patch repairs all of these conditions. The repairs have been checked in to the trunk source and are available here as a patch to 7.2.1. A later upgrade of a patched 7.2.1 system will not fail because of this patch. Jira issues identified in the repair process include GWMON-13428, GWMON-13429, GWMON-13419, and GWMON-13412.
Installation Steps
- As root user, copy the attached tar to an empty directory on the server to be patched.
- Unpack the tar using the command:
tar xf TB7.2.1-02-nagios-feeders.tgz
- Change into the directory created by untarring:
cd TB7.2.1-02-nagios-feeders
- Run the installer script:
./TB7.2.1-02_install.sh
The install script checks that you are on a 7.2.1 system (required) and that you have not already installed this patch, and confirms that you want to go ahead. It then creates a backup for uninstallation, copies the two feeder Perl scripts into place, and restarts gwservices, which causes a short (3 minute) outage of portal access (monitoring continues undisturbed).
- The timestamp issue requires a setting:
send_actual_check_timestamps = true
which is not in the status-feeder.properties file. The omission is deliberate, for ease of upgrade: when the status feeder does not see the option, it assigns the default value (true), which matches the most common use case. Users who want the value to be false (which forces all timestamps to be whatever time the server decides is correct) must add the line, with the value false, to their local status-feeder.properties and restart gwservices.
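Adding the setting by hand can be scripted. The following is a minimal sketch, not part of the patch: the function name set_check_timestamps is hypothetical, the path to status-feeder.properties is left as an argument because it is not stated here, and mktemp is assumed to be available on the server.

```shell
#!/bin/sh
# Usage: set_check_timestamps FILE VALUE
# Ensure FILE contains exactly one "send_actual_check_timestamps = VALUE" line.
# FILE is the path to your local status-feeder.properties (supplied by you).
set_check_timestamps() {
    file=$1
    value=$2
    key=send_actual_check_timestamps
    case $value in
        true|false) ;;
        *) echo "value must be true or false" >&2; return 1 ;;
    esac
    [ -f "$file" ] || { echo "no such file: $file" >&2; return 1; }
    tmp=$(mktemp) || return 1
    # Copy every line except an existing definition of the key.
    # grep exits 1 when nothing is left to copy, which is not an error here.
    grep -v "^[[:space:]]*$key[[:space:]]*=" "$file" > "$tmp" \
        || [ $? -eq 1 ] || { rm -f "$tmp"; return 1; }
    # Append the requested setting in the same "key = value" form.
    printf '%s = %s\n' "$key" "$value" >> "$tmp"
    # Overwrite in place so the file keeps its owner and permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
}
```

Called as set_check_timestamps /path/to/status-feeder.properties false, this leaves all other lines in the file untouched. gwservices must still be restarted afterwards for the change to take effect.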
Uninstalling
- As root user, navigate to the unpack directory created when installing the patch.
- Run the uninstall command:
./TB7.2.1-02_uninstall.sh
The two installed feeder files will be replaced with the copies saved in the installer's backup. GroundWork gwservices will be restarted, causing a short (3 minute) interruption in portal access (monitoring will continue uninterrupted).