====== Nagios Monitoring - Knürr / Emerson CoolLoop ======
We use [[http://www.emersonnetworkpower.com/en-EMEA/Products/RACKSANDINTEGRATEDCABINETS/RackCooling/Pages/KnurrCoolLoop10to30kWCoolingPower.aspx|Knürr (now Emerson) CoolLoop]] units to chill the 19" equipment in the datacenters. The CoolLoop units come with their own CoolCon controller units for management and monitoring purposes. Similar to the [[2012:10:23:nagios_monitoring_rittal_cmc_lcp|Rittal CMC-TC]], the CoolCon controllers can be queried via SNMP for the status and values of the environmental sensors. The kind and number of sensors depend on the specific configuration that was ordered. They range from simple fan, air temperature and water valve sensors in the basic setup to humidity, additional temperature, water flow, water temperature and electrical current, voltage and energy sensors in the extended setup. To monitor the sensor status and environmental values provided by the CoolCon controller, I wrote a Nagios plugin ''check_knuerr_coolcon.sh''. To run the Nagios plugin, SNMP must be activated on the CoolCon controller unit, and network connections from the Nagios system to the CoolCon controller unit on port UDP/161 must be allowed.
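Before touching the Nagios configuration, it can help to confirm that the controller actually answers SNMP queries. The following is only a minimal sketch and not part of the plugin: it assumes SNMP v2c, the community string ''public'' and the net-snmp ''snmpget'' tool on the Nagios system, none of which are prescribed by the CoolCon setup described here. The function name ''snmp_reachable'' is made up for this example.

```shell
#!/bin/sh
# Probe a CoolCon controller over SNMP before wiring it into Nagios.
# Assumptions (not from the article): SNMP v2c, community "public",
# and the net-snmp "snmpget" tool installed on the Nagios system.

# Allow the snmpget binary to be overridden, e.g. for a dry run.
SNMPGET="${SNMPGET:-snmpget}"

# snmp_reachable HOST [COMMUNITY]
# Returns 0 if the host answers a query for sysDescr.0, non-zero otherwise.
snmp_reachable() {
    host="$1"
    community="${2:-public}"
    # -t 2 -r 1: two-second timeout and one retry.
    # 1.3.6.1.2.1.1.1.0 is the numeric OID of sysDescr.0.
    "$SNMPGET" -v 2c -c "$community" -t 2 -r 1 "$host" 1.3.6.1.2.1.1.1.0 >/dev/null 2>&1
}

# Example call (10.0.0.1 is the sample address used later in this article):
# snmp_reachable 10.0.0.1 public && echo "SNMP reachable" || echo "SNMP unreachable"
```

If the function reports the controller as unreachable, the usual suspects are a disabled SNMP agent on the CoolCon unit, a different community string, or a firewall dropping UDP/161.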
The whole setup for monitoring Knürr / Emerson CoolLoop - and possibly, but untested, also CoolTherm - units with Nagios looks like this:
- Enable SNMP queries on the CoolCon controller unit. Verify that port **UDP/161** on the CoolCon controller unit can be reached from the Nagios system.
- **Optional**: Enable SNMP traps to be sent to the Nagios system on the CoolCon controller unit. This requires SNMPD and SNMPTT to be already set up on the Nagios system. Verify that port **UDP/162** on the Nagios system can be reached from the CoolCon controller unit.
- Download the {{:2014:01:14:check_knuerr_coolcon.sh|Nagios plugin check_knuerr_coolcon.sh}} and place it in the plugins directory of your Nagios system, in this example ''/usr/lib/nagios/plugins/'':
$ mv -i check_knuerr_coolcon.sh /usr/lib/nagios/plugins/
$ chmod 755 /usr/lib/nagios/plugins/check_knuerr_coolcon.sh
- Define the following Nagios commands. In this example this is done in the file ''/etc/nagios-plugins/config/check_coolcon.cfg'':
# check Knuerr CoolCon/CoolLoop energy status
define command {
command_name check_coolcon_energy
command_line $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C energy
}
# check Knuerr CoolCon/CoolLoop fan status
define command {
command_name check_coolcon_fan
command_line $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C fan
}
# check Knuerr CoolCon/CoolLoop humidity status
define command {
command_name check_coolcon_humidity
command_line $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C humidity
}
# check Knuerr CoolCon/CoolLoop temperature status
define command {
command_name check_coolcon_temperature
command_line $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C temperature
}
# check Knuerr CoolCon/CoolLoop valve status
define command {
command_name check_coolcon_valve
command_line $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C valve
}
# check Knuerr CoolCon/CoolLoop waterflow status
define command {
command_name check_coolcon_waterflow
command_line $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C waterflow
}
# check Knuerr CoolCon/CoolLoop watertemperature status
define command {
command_name check_coolcon_watertemperature
command_line $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C watertemperature
}
- Define a group of services in your Nagios configuration to be checked for each CoolLoop system:
# check snmpd
define service {
use generic-service
hostgroup_name coolcon
service_description Check_SNMPDv2
check_command check_snmpdv2
}
# check_coolcon_energy
define service {
use generic-service-pnp
hostgroup_name coolcon
service_description Check_CoolCon_Energy
check_command check_coolcon_energy
}
# check_coolcon_fan
define service {
use generic-service-pnp
hostgroup_name coolcon
service_description Check_CoolCon_Fan
check_command check_coolcon_fan
}
# check_coolcon_humidity
define service {
use generic-service-pnp
hostgroup_name coolcon
service_description Check_CoolCon_Humidity
check_command check_coolcon_humidity
}
# check_coolcon_temperature
define service {
use generic-service-pnp
hostgroup_name coolcon
service_description Check_CoolCon_Temp
check_command check_coolcon_temperature
}
# check_coolcon_valve
define service {
use generic-service-pnp
hostgroup_name coolcon
service_description Check_CoolCon_Valve
check_command check_coolcon_valve
}
# check_coolcon_waterflow
define service {
use generic-service-pnp
hostgroup_name coolcon
service_description Check_CoolCon_Waterflow
check_command check_coolcon_waterflow
}
# check_coolcon_watertemperature
define service {
use generic-service-pnp
hostgroup_name coolcon
service_description Check_CoolCon_WaterTemp
check_command check_coolcon_watertemperature
}
Replace ''generic-service'' with your Nagios service template. Replace ''generic-service-pnp'' with your Nagios service template that has performance data processing enabled.
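- Define a service dependency so that the above checks are only run if the ''Check_SNMPDv2'' check succeeded. Note that the pattern ''Check_CoolCon_.*'' in ''dependent_service_description'' is only treated as a regular expression if ''use_regexp_matching=1'' is set in the main Nagios configuration file: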
# Knuerr CoolCon SNMPD dependencies
define servicedependency {
hostgroup_name coolcon
service_description Check_SNMPDv2
dependent_service_description Check_CoolCon_.*
execution_failure_criteria c,p,u,w
notification_failure_criteria c,p,u,w
}
- Define hosts in your Nagios configuration for each CoolLoop device. In this example it is named ''coolcon1'':
define host {
use coolcon
host_name coolcon1
alias Knuerr CoolCon CoolLoop 1
address 10.0.0.1
parents parent_lan
}
Replace ''coolcon'' with your Nagios host template for the CoolCon controller units. Adjust the ''address'' and ''parents'' parameters according to your environment.
- Define a hostgroup in your Nagios configuration for all CoolLoop devices. In this example it is named ''coolcon''. The above checks are run against each member of the hostgroup:
define hostgroup {
hostgroup_name coolcon
alias Knuerr CoolCon/CoolLoop
members coolcon1
}
- Run a configuration check and if successful reload the Nagios process:
$ /usr/sbin/nagios3 -v /etc/nagios3/nagios.cfg
$ /etc/init.d/nagios3 reload
The new hosts and services should soon show up in the Nagios web interface.
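If a service does not show up as expected, it can be useful to run the plugin by hand for every check type defined above. The following is a rough sketch under a few assumptions: the plugin sits in the plugin directory used in this example, it follows the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), and the host name passed with ''-H'' resolves on the Nagios system. The helper names ''nagios_state'' and ''run_coolcon_checks'' are made up for this illustration.

```shell
#!/bin/sh
# Run every CoolCon check once by hand and print the resulting Nagios state.
# Assumption: the plugin uses the standard Nagios plugin exit codes.

# Plugin location as used in this article; override via the environment.
PLUGIN="${PLUGIN:-/usr/lib/nagios/plugins/check_knuerr_coolcon.sh}"

# nagios_state CODE: map a plugin exit code to the Nagios state name.
nagios_state() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# run_coolcon_checks HOST: call the plugin once per check type.
run_coolcon_checks() {
    host="$1"
    for check in energy fan humidity temperature valve waterflow watertemperature; do
        # Capture the output and the exit code without aborting on failure.
        output=$("$PLUGIN" -H "$host" -C "$check" 2>&1) && rc=0 || rc=$?
        printf '%-17s %-8s %s\n' "$check" "$(nagios_state "$rc")" "$output"
    done
}

# Example call, using the sample host from this article:
# run_coolcon_checks coolcon1
```

Sensors that are not part of the ordered configuration may well report a non-OK state here. In that case, the corresponding service definition can simply be left out.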
If the optional step 2 in the list above was carried out, SNMPTT also needs to be configured to understand the incoming SNMP traps from the CoolCon controller units. This can be achieved by the following steps:
- Request a current version of the CoolCon SNMP MIB file from Knürr / Emerson. In this example it's ''080104140000010a_KNUERR-COOLCON-MIB-V10.mib''. Transfer the file to the Nagios server.
- Convert the SNMP MIB definitions in ''080104140000010a_KNUERR-COOLCON-MIB-V10.mib'' into a format that SNMPTT can understand:
$ /opt/snmptt/snmpttconvertmib --in=MIB/080104140000010a_KNUERR-COOLCON-MIB-V10.mib --out=/opt/snmptt/conf/snmptt.conf.knuerr-coolcon
...
Done
Total translations: 201
Successful translations: 201
Failed translations: 0
- Edit the trap severity according to your requirements, e.g.:
$ vim /opt/snmptt/conf/snmptt.conf.knuerr-coolcon
...
EVENT fans .1.3.6.1.4.1.2769.2.1.5.0.1 "Status Events" Warning
...
- Add the new configuration file to be included in the global SNMPTT configuration and restart the SNMPTT daemon:
$ vim /opt/snmptt/snmptt.ini
...
[TrapFiles]
snmptt_conf_files = < <
- Download the {{:2012:10:02:check_snmp_traps.sh|Nagios plugin check_snmp_traps.sh}} and place it in the plugins directory of your Nagios system, in this example ''/usr/lib/nagios/plugins/'':
$ mv -i check_snmp_traps.sh /usr/lib/nagios/plugins/
$ chmod 755 /usr/lib/nagios/plugins/check_snmp_traps.sh
- Define the following Nagios command to check for SNMP traps in the SNMPTT database. In this example this is done in the file ''/etc/nagios-plugins/config/check_snmp_traps.cfg'':
# check for snmp traps
define command {
command_name check_snmp_traps
command_line $USER1$/check_snmp_traps.sh -H $HOSTNAME$:$HOSTADDRESS$ -u <user> -p <pass> -d <snmptt_db>
}
Replace ''user'', ''pass'' and ''snmptt_db'' with values suitable for your SNMPTT database environment.
- Add another service in your Nagios configuration to be checked for each CoolLoop device:
# check snmptraps
define service {
use generic-service
hostgroup_name coolcon
service_description Check_SNMP_traps
check_command check_snmp_traps
}
- **Optional**: Define a serviceextinfo to display a folder icon next to the ''Check_SNMP_traps'' service check for each CoolLoop device. This icon provides a direct link to the SNMPTT web interface with a filter for the selected host:
define serviceextinfo {
hostgroup_name coolcon
service_description Check_SNMP_traps
notes SNMP Alerts
#notes_url http://<hostname>/nagios3/nagtrap/index.php?hostname=$HOSTNAME$
#notes_url http://<hostname>/nagios3/nsti/index.php?perpage=100&hostname=$HOSTNAME$
}
Uncomment the ''notes_url'' depending on which web interface (nagtrap or nsti) is used. Replace ''hostname'' with the FQDN or IP address of the server running the web interface.
- Run a configuration check and if successful reload the Nagios process:
$ /usr/sbin/nagios3 -v /etc/nagios3/nagios.cfg
$ /etc/init.d/nagios3 reload
- **Optional**: If you're running PNP4Nagios v0.6 or later to graph Nagios performance data, you can use the PNP4Nagios templates in ''pnp4nagios_coolcon.tar.bz2'' to beautify the graphs. Download the PNP4Nagios templates {{:2014:01:14:pnp4nagios_coolcon.tar.bz2|pnp4nagios_coolcon.tar.bz2}} and place them in the PNP4Nagios template directory, in this example ''/usr/share/pnp4nagios/html/templates/'':
$ tar jxf pnp4nagios_coolcon.tar.bz2
$ mv -i check_coolcon_*.php /usr/share/pnp4nagios/html/templates/
$ chmod 644 /usr/share/pnp4nagios/html/templates/check_coolcon_*.php
The following images show examples of what the PNP4Nagios graphs look like for a CoolLoop unit:
{{:2014:01:14:check_coolcon_fan.png?600x|PNP4Nagios graph for the relative fan speed in a CoolLoop device}}\\
{{:2014:01:14:check_coolcon_humidity.png?600x|PNP4Nagios graph for the relative humidity in a CoolLoop device}}\\
{{:2014:01:14:check_coolcon_valve.png?600x|PNP4Nagios graph for the relative valve setting in a CoolLoop device}}\\
{{:2014:01:14:check_coolcon_temperature.png?600x|PNP4Nagios graph for the air temperature on the warm and cold side of a CoolLoop device}}\\
{{:2014:01:14:check_coolcon_waterflow.png?600x|PNP4Nagios graph for the water flow and cooling power in a CoolLoop device}}\\
{{:2014:01:14:check_coolcon_watertemperature.png?600x|PNP4Nagios graph for the water in and out temperature in a CoolLoop device}}
All done, you should now have a complete Nagios-based monitoring solution for your Knürr / Emerson CoolLoop systems.
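If the optional SNMP trap handling was set up, the trap path can be exercised without waiting for a real fault by sending a test trap by hand. The sketch below assumes the net-snmp ''snmptrap'' tool, SNMP v2c and the community string ''public''. The function name ''send_test_trap'' and the target host name in the example call are hypothetical. The default trap OID is the ''fans'' event from the converted SNMPTT configuration shown above. A trap sent this way carries no variable bindings and is logged under the address of the sending system, not under a CoolCon host, so it only proves that traps arrive and are translated.

```shell
#!/bin/sh
# Send a hand-made SNMPv2c test trap to the Nagios system.
# Assumptions (not from the article): net-snmp "snmptrap" is installed
# and the trap receiver accepts the community "public".

# Allow the snmptrap binary to be overridden, e.g. for a dry run.
SNMPTRAP="${SNMPTRAP:-snmptrap}"

# send_test_trap NAGIOS_HOST [COMMUNITY] [TRAP_OID]
send_test_trap() {
    target="$1"
    community="${2:-public}"
    # Default: the "fans" event OID from the converted SNMPTT configuration.
    trap_oid="${3:-.1.3.6.1.4.1.2769.2.1.5.0.1}"
    # The empty string makes snmptrap fill in the current uptime itself.
    "$SNMPTRAP" -v 2c -c "$community" "$target" '' "$trap_oid"
}

# Example call (nagios.example.org is a placeholder for the Nagios system):
# send_test_trap nagios.example.org public
```

After sending the trap, it should appear in the SNMPTT database or log with the severity configured for the event.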