Background
One of the challenges of monitoring is avoiding receiving too many alarms for a particular incident. While receiving multiple alarms for different services affected by one failure can help you understand the magnitude or impact of an incident, more often they can distract you from focusing on the cause of the problem, and over time, they can lead to alarm fatigue.
You can reduce alarms by setting dependencies between hosts and services. The reduction is especially effective when you set up dependencies between network devices or network links and the hosts that are connected to those devices or links. There are often numerous routers, switches, and firewalls to consider. GroundWork Monitor provides multiple methods for monitoring such network devices and links. One such method is to poll the devices with the Simple Network Management Protocol (SNMP). GroundWork Monitor Enterprise 6.3 provides polling via SNMP using either Nagios plugins or the Cacti network traffic graphing tool.
This article describes how to set up monitoring of links for their up or down status on network devices using Cacti. Once links are being monitored in this way, the alarms can be fed into the Nagios alarm conditioning engine to suppress alarms from hosts and services that depend on these network devices and links being up.
Overview of using Cacti
The approach that is described here is to use Cacti to monitor the SNMP parameter called ifOperStatus for selected interfaces, and then to define alarm thresholds in the Cacti thold plugin which will produce alarms when the selected interfaces change from an up state to a down state. The process is straightforward for those who have familiarized themselves with Cacti configuration tasks. If you have not done so, we recommend signing up for GroundWork's on-demand Cacti training module.
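For readers unfamiliar with where ifOperStatus lives in the SNMP tree, the following minimal sketch builds the per-interface OID from the standard IF-MIB layout. The helper name `column_oid` and the example interface index are illustrative assumptions, not part of Cacti or GroundWork Monitor.

```python
# Base OID of ifEntry in the standard IF-MIB interfaces table.
IF_ENTRY = "1.3.6.1.2.1.2.2.1"

# A few ifEntry columns and their sub-identifiers.
IF_COLUMNS = {
    "ifDescr": 2,
    "ifAdminStatus": 7,
    "ifOperStatus": 8,
}


def column_oid(column, if_index):
    """Return the OID for one column of one interface (hypothetical helper)."""
    return f"{IF_ENTRY}.{IF_COLUMNS[column]}.{if_index}"


# Interface index 3 is only an example value.
print(column_oid("ifOperStatus", 3))  # 1.3.6.1.2.1.2.2.1.8.3
```

Cacti's data query walks this table for you, so you would not normally build these OIDs by hand; the sketch only shows what is being polled.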
Configuration steps
Importing updated data query template
In order for Cacti to know how to monitor the ifOperStatus parameter for an interface, a template must be imported that describes how interfaces and their parameters are laid out in the SNMP space. We have provided an update to an existing data query template that includes this information. After saving the template to your desktop you can import it into Cacti using the GroundWork Monitor portal. Doing so will update the existing SNMP - Interface Statistics data query template.
Note: If you have made changes to this template in your installation of GroundWork Monitor then you may need to re-apply those changes after this procedure.
- Download this updated SNMP - Interface Statistics data query template.
- Log in to the GroundWork Monitor portal and navigate to Advanced > Network Graphing > console > Import Templates (Note that this is the default Advanced view).
- Click the Browse... button next to the Import Template from Local File option and select the data query template you previously downloaded.
- Click the Import button to perform the import operation. If the operation is successful then you should see Import Results as follows:
Adding graphs to devices
Now that the data query template is updated to include the Interface Operational status graph template, you can add this monitoring to your devices. The on-demand Cacti training module walks through the steps of creating graphs for new and existing devices. If you just need a pointer or two, and can learn by trial and error, then follow these steps:
- Navigate to Advanced > Network Graphing > console > Devices.
- Select an existing device by its hyperlinked Description, or Add a new device.
- Check that the SNMP - Interface Statistics Data Query has been added to the Associated Data Queries section and that the Status for the data query shows as a Success.
- Click the Create Graphs for this Host link. You should then see a section titled Data Query [SNMP - Interface Statistics].
- Choose Interface Operational status from the Select a graph type: pull down menu and then check boxes next to the relevant interfaces. Click the create button to finalize this.
Adding a threshold template
With device interfaces now being monitored for their status, you need to set up a threshold template to alarm on a down status. The on-demand Cacti training module walks through the steps of creating threshold templates, but we have briefly documented the approach here:
- Navigate to Advanced > Network Graphing > console > Threshold Templates.
- Click the Add button to define a new template.
- Select Interface - Status for the Data Template and then int_status for the Data Source, and click create.
- Based on the defaults, type 1.5 into the High Threshold setting, and click the Save button. The usual numeric value of ifOperStatus for an UP interface is 1. If the interface goes DOWN, the value is usually 2. Thus, a threshold of 1.5 will be exceeded when the interface changes from UP to DOWN.
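The effect of the 1.5 high threshold can be sketched as follows. The state table comes from the standard IF-MIB definition of ifOperStatus; the function name and the strict greater-than comparison are assumptions made for illustration and are not taken from the thold plugin's source.

```python
# ifOperStatus values as defined in the standard IF-MIB.
IF_OPER_STATUS = {
    1: "up",
    2: "down",
    3: "testing",
    4: "unknown",
    5: "dormant",
    6: "notPresent",
    7: "lowerLayerDown",
}

HIGH_THRESHOLD = 1.5  # the value entered in the threshold template


def breaches_high_threshold(value, high=HIGH_THRESHOLD):
    """Return True when a polled value is above the high threshold.

    Assumption: a strict greater-than comparison. With integer status
    values and a threshold of 1.5 the choice of > or >= does not matter.
    """
    return value > high


for code, name in IF_OPER_STATUS.items():
    state = "ALARM" if breaches_high_threshold(code) else "ok"
    print(f"{code} ({name}): {state}")
```

Only the value 1 (up) stays below the threshold, so under these assumptions any other reported state, not just down, would be treated as a breach.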
Applying a threshold template to a device
Now that the threshold template has been created, you need to assign it to all of the relevant Interface Operational status graphs you created. This is very easy!
- Navigate to Advanced > Network Graphing > console > Devices.
- Click the check box next to the device you created graphs for, and then select Apply Thresholds from the Choose an action: pull down. Click Go. Confirm the operation when prompted.
If you followed all the steps correctly then you will now have thresholds defined for the status of your important network interfaces. If you look at the Advanced > Network Graphing > thold tab and select display of All threshold statuses, you should see something like the following:
Conclusion
GroundWork Monitor Enterprise Edition 6.3 has Cacti integrated such that threshold alarms are fed into the Nagios subsystem as passive service check results. GroundWork provides a service profile with the correct passive service check already defined. Following directions in the built-in Help documentation, you can import that service profile and assign it to all of your network devices to have that passive service check ready for receiving alarms from Cacti (Home > GROUNDWORK PROFILES > Importing Profiles).
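To illustrate what a passive service check result looks like on the Nagios side, here is a minimal sketch that formats one in the standard Nagios external command syntax (PROCESS_SERVICE_CHECK_RESULT). The host name, service name, and output text are hypothetical, and this article does not describe the exact mechanism the GroundWork integration uses to submit results, so treat this as background rather than a description of that integration.

```python
import time

# Standard Nagios service return codes.
NAGIOS_OK = 0
NAGIOS_CRITICAL = 2


def passive_result_line(host, service, return_code, output, timestamp=None):
    """Format a passive service check result as a Nagios external command."""
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return (
        f"[{ts}] PROCESS_SERVICE_CHECK_RESULT;"
        f"{host};{service};{return_code};{output}"
    )


# Hypothetical host and service names, fixed timestamp for reproducibility.
line = passive_result_line(
    "core-switch-1",
    "cacti_threshold_alert",
    NAGIOS_CRITICAL,
    "ifOperStatus 2 exceeds high threshold 1.5",
    timestamp=1700000000,
)
print(line)
```

The service name in such a result must match a passive service defined on the host, which is why the service profile needs to be assigned to your network devices first.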
Once alarms are coming in to the Nagios subsystem for network interface failures, you can set up dependencies between those alarms and other dependent services and hosts. The methods to set up those dependencies are also documented in the built-in Help documentation (Home > USING APPLICATIONS > Configuration > Configuration Scenarios > About Dependencies).
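The idea behind dependency-based alarm suppression can be sketched in a few lines. This is a simplified model, not the Nagios implementation: the `Dependency` class, the `should_notify` function, and all host and service names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dependency:
    master: tuple       # (host, service) that others depend on
    dependent: tuple    # (host, service) whose alarms may be suppressed
    failure_states: frozenset = frozenset({"CRITICAL"})


def should_notify(target, current_states, dependencies):
    """Return False if any master of `target` is in a failure state."""
    for dep in dependencies:
        if dep.dependent != target:
            continue
        if current_states.get(dep.master) in dep.failure_states:
            return False
    return True


# Hypothetical example: a web server reached through a switch uplink.
uplink = ("core-switch-1", "cacti_threshold_alert")
web = ("web-1", "http")
deps = [Dependency(master=uplink, dependent=web)]

print(should_notify(web, {uplink: "CRITICAL", web: "CRITICAL"}, deps))  # False
print(should_notify(web, {uplink: "OK", web: "CRITICAL"}, deps))        # True
```

When the uplink alarm is active, the web server alarm is held back, so the only alarm you receive points at the link failure itself.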