GWME-7.2.1-04 SLA and Downtime Scheduler Update

Description

This technical bulletin is for all GroundWork customers. It provides an update to the SLA processing and dashboards (for enterprise customers) and the downtime scheduler "slareport" database. It is applicable to GroundWork Monitor 7.2.1 only, and requires the application of the GWME-7.2.1-00 Rollup Patch Installer.

UPDATED 11/08/2018: The 2018-10-31 v2 version of this TB contained the original version of the retention scripting, which did not support retention control of audittrail data and did not match the install instructions here. Also, that version did not verify that the patch is being installed over the GWME-7.2.1-00 Rollup Patch. The v2 version has been replaced with a v3 tarball. It now matches the install instructions, supports retention control of audittrail data, verifies that the rollup patch is in place before this SLA patch can be applied, and does proper state reversion if either install or uninstall of the patch fails. If you had either the original version or the v2 version of this SLA patch applied, it is recommended that you uninstall it, verify that you have the rollup patch installed, and then install the v3 version of this SLA patch.

What is in this update?

This patch adds a cron job that removes historical expired downtimes from the database. By default, only downtimes that expired more than 90 days ago are removed, but you have to enable this. SLA audittrail data older than a retention period, which also defaults to 90 days, is deleted as well.

See install steps below.

Prerequisites

This technical bulletin does not depend upon prior technical bulletins. There is no need to schedule a downtime to apply this technical bulletin. There is no need to halt GroundWork Monitor to run the script, although you should be prepared to see the initial running of the script take longer to complete depending on the number of downtime requests you have accumulated in the system.

To see which version and patch (if any) are installed, run the following command as the root or nagios user:

cat /usr/local/groundwork/Info.txt

The output should resemble the example below, which shows the right version and patch. If yours does not, you need to upgrade and patch first; call Support for assistance.

cat /usr/local/groundwork/Info.txt
GroundWork Installation Report
==============================
Components Installed:
PostgreSQL ..... 9.6.9
Monitor    ..... 7.2.1-br494-gw3901
Product Info
=============
name= enterprise
version= 7.2.1
bitrock_build= br494
gw_build= gw3901
installer_name= groundworkenterprise-7.2.1-br494-gw3901-linux-64-installer.run
installation_date= 11/02/2018 21:36 PM
patch_version= 7.2.2
patch_gw_build= gw4042
patch_installation_date= 11/07/2018 10:10 PM
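The manual check above can also be scripted. The following is a minimal sketch, assuming Info.txt always uses the "key= value" layout shown in the sample and that a patch_version line indicates an installed rollup patch. The helper names info_value and check_gw_version are hypothetical and not part of GroundWork Monitor.

```shell
#!/bin/sh
# Sketch only: info_value and check_gw_version are hypothetical helpers.
# Assumes Info.txt uses "key= value" lines as in the sample output.

# Print the value recorded for a key, e.g. "version" -> "7.2.1".
info_value() {
    # $1 = key, $2 = path to Info.txt
    sed -n "s/^$1= *//p" "$2" | head -n 1
}

# Report whether the base version is 7.2.1 and a patch version is recorded.
check_gw_version() {
    info_file=${1:-/usr/local/groundwork/Info.txt}
    version=$(info_value version "$info_file")
    patch=$(info_value patch_version "$info_file")
    if [ "$version" != "7.2.1" ]; then
        echo "unsupported version: ${version:-unknown}"
        return 1
    fi
    if [ -z "$patch" ]; then
        echo "version 7.2.1, no patch recorded"
        return 1
    fi
    echo "version $version, patch $patch"
}

# Usage: check_gw_version            (reads /usr/local/groundwork/Info.txt)
#        check_gw_version /path/to/Info.txt
```

Run against the sample output above, this would report version 7.2.1 with patch 7.2.2. The install script performs its own checks, so this is only a convenience for inspecting a system beforehand.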

Installation Steps

Name: TB7.2.1-04-sla-updates-v3.tgz
Size: 9 kB
Creator: Hans Kriel
Creation Date: Nov 09, 2018 14:36
Comment: MD5: 8589050fbbe89a3814366a542107965b

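Since the attachment lists an MD5 checksum, the downloaded tarball can be compared against it before unpacking. This is a minimal sketch, assuming the standard md5sum utility is available on the server; verify_md5 is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: verify_md5 is a hypothetical helper, not part of the patch.

# Compare a file's MD5 checksum with an expected hex string.
verify_md5() {
    # $1 = file to check, $2 = expected MD5 (hex)
    actual=$(md5sum "$1" | awk '{print $1}')
    if [ "$actual" = "$2" ]; then
        echo "OK"
    else
        echo "MISMATCH: got ${actual:-nothing}"
        return 1
    fi
}

# Usage, once the tarball has been copied to /tmp:
# verify_md5 /tmp/TB7.2.1-04-sla-updates-v3.tgz 8589050fbbe89a3814366a542107965b
```

A mismatch usually means the download is incomplete or is a different version of the tarball.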
  1. As root user, copy the attached tgz to the /tmp directory on the GroundWork server.
  2. Unpack the tar using the command:
    tar zxvf TB7.2.1-04-sla-updates-v3.tgz
    
  3. Change into the directory created by untarring:
    cd TB7.2.1-04-sla-updates
    
  4. Run the installer script, and respond to the prompt:
    ./TB7.2.1-04_install.sh
    

    The install script will test that you are on a 7.2.1 system (which is required), that you have installed the 7.2.1 Rollup Patch, that you have not already installed this patch, and that you want to go ahead.

    A backup will be created for rollback in case the install is interrupted. A new maintenance script and configuration file will be copied into place. The related cron job will be modified so that it does not start while a previous execution is still running. A new cron job will be added to impose retention control on SLA downtime and audit-trail data. Apart from these new files, only the cron job setup is affected.
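The bulletin does not say how the modified cron job avoids overlapping runs. Purely as an illustration of the general technique, and not a description of what the patch actually installs, here is a minimal sketch that uses an atomic mkdir as a lock. The name run_exclusive and the lock path are hypothetical.

```shell
#!/bin/sh
# Sketch only: illustrates one common way to skip a job while a previous
# run is still active. This is an assumption, not the patch's own mechanism.

run_exclusive() {
    # $1 = lock directory, remaining arguments = command to run
    lock_dir=$1
    shift
    # mkdir is atomic: it fails if the directory already exists.
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "previous run still active, skipping"
        return 0
    fi
    "$@"
    status=$?
    rmdir "$lock_dir"
    return $status
}

# Usage (hypothetical lock path):
# run_exclusive /tmp/my_job.lock /path/to/long_running_job
```

A lock of this kind stays behind if the job is killed before it can remove the directory, which is one reason real deployments often prefer a tool that ties the lock to the running process.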

When does the cron job run? At 40 minutes past midnight every night. Here is the line that we add to the nagios crontab.

40 0 * * * /usr/local/groundwork/tools/sla_retention --downtimes --audittrail
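To confirm that the entry is in place after installation, the nagios crontab can be listed and searched. This is a minimal sketch; has_sla_retention_job is a hypothetical helper that reads crontab text on standard input and ignores commented-out lines.

```shell
#!/bin/sh
# Sketch only: has_sla_retention_job is a hypothetical helper.
# In the cron line "40 0 * * *", the fields are minute (40), hour (0),
# then day of month, month, and day of week (all "any").

# Succeed if stdin contains an uncommented line that runs sla_retention.
has_sla_retention_job() {
    awk '!/^[[:space:]]*#/ && /\/usr\/local\/groundwork\/tools\/sla_retention/ { found = 1 }
         END { exit !found }'
}

# Usage, as root:
# crontab -l -u nagios | has_sla_retention_job && echo "retention job present"
```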
Usage

There are no special usage instructions, but if you want to get the benefit of the trimmed database right away, you can follow these steps:

  1. (Optional) Edit the file:
    /usr/local/groundwork/config/sla_retention.conf
    

    and change the days value from the default of 90 to the desired number of days to retain expired downtime and audit trail information.

  2. As the nagios user, run:
    /usr/local/groundwork/tools/sla_retention --downtimes --audittrail
    

    This may take a minute or two.

  3. (Optional) Clean up the database after deleting records:
    1. Log in as root (or another superuser).
    2. Type:
      /usr/local/groundwork/postgresql/bin/psql
      

      and supply the postgres user password.

    3. Enter at the sql prompt:
      \c slareport
      vacuum (full, analyze);
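The interactive cleanup in step 3 can also be run as a single command. This is a minimal sketch, assuming the database is reached as the postgres user, as the password prompt in step 3 suggests. vacuum_slareport and the PSQL override variable are hypothetical names.

```shell
#!/bin/sh
# Sketch only: vacuum_slareport and the PSQL variable are hypothetical.
# Assumes the connection is made as the "postgres" database user.

vacuum_slareport() {
    # Allow the psql path to be overridden; default to the GroundWork copy.
    psql_bin=${PSQL:-/usr/local/groundwork/postgresql/bin/psql}
    # -d selects the slareport database; -c runs one statement and exits.
    "$psql_bin" -U postgres -d slareport -c 'VACUUM (FULL, ANALYZE);'
}

# Usage: vacuum_slareport    (psql prompts for the postgres password)
```

VACUUM FULL takes an exclusive lock on each table while it rewrites it, so a quiet period is a sensible time to run this.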
      
Uninstalling

Uninstalling this patch removes the new files and puts back the original cron job setup. Monitoring in general can continue while this happens. If you had previously modified the frequency of the yiic downtime process cron job, you will want to edit that again after uninstalling this patch.

  1. As the root user, change into the directory you created when unpacking the patch for installation. If that directory no longer exists, first unpack the patch again:
    cd TB7.2.1-04-sla-updates
    
  2. Run the uninstall command:
    ./TB7.2.1-04_uninstall.sh