2014-01-14 // Nagios Monitoring - Knürr / Emerson CoolLoop
We use Knürr (now Emerson) CoolLoop units to chill the 19" equipment in the datacenters. The CoolLoop units come with their own CoolCon controller units for management and monitoring purposes. Similar to the Rittal CMC-TC, the CoolCon controllers can be queried via SNMP for the status and values of the environmental sensors. The kind and number of sensors depend on the specific configuration that was ordered. They range from simple fan, air temperature and water valve sensors in the basic setup to humidity, additional temperature, water flow, water temperature and electrical current, voltage and energy sensors in the extended setup. To monitor the sensor status and environmental values provided by the CoolCon controller, I wrote the Nagios plugin check_knuerr_coolcon.sh. To run the plugin, SNMP must be activated on the CoolCon controller unit, and network connections from the Nagios system to the CoolCon controller unit on port UDP/161 must be allowed.
The whole setup for monitoring Knürr / Emerson CoolLoop - and possibly, but untested, also CoolTherm - units with Nagios looks like this:
Enable SNMP queries on the CoolCon controller unit. Verify the port UDP/161 on the CoolCon controller unit can be reached from the Nagios system.
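The reachability check can also be done by hand before any Nagios configuration is touched. The following is a minimal sketch, assuming Net-SNMP's snmpget is installed on the Nagios system and the controller answers SNMP v2c queries with the community string public. Both are assumptions, and the function names are made up for this example:

```shell
# Probe a CoolCon controller via SNMP before setting up the Nagios checks.
# Assumptions: Net-SNMP's snmpget is installed, SNMP v2c is in use and the
# community string is "public" unless a second argument is given.

snmp_reachable() {
    local host="$1"
    local community="${2:-public}"
    # sysDescr.0 (1.3.6.1.2.1.1.1.0) belongs to the standard MIB-II system
    # group, so any SNMP agent should answer it.
    # -t 2 -r 1: two second timeout, one retry.
    snmpget -v 2c -c "$community" -t 2 -r 1 "$host" 1.3.6.1.2.1.1.1.0 >/dev/null 2>&1
}

report_snmp() {
    local host="$1"
    if snmp_reachable "$host" "${2:-public}"; then
        echo "OK: $host answers SNMP queries on UDP/161"
    else
        echo "FAIL: no SNMP answer from $host (check SNMP settings and firewall)"
        return 1
    fi
}
```

Called as report_snmp 10.0.0.1 public from the Nagios system, a FAIL result points at either the SNMP settings on the controller or a firewall in between.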
Optional: Enable SNMP traps to be sent from the CoolCon controller unit to the Nagios system. This requires SNMPD and SNMPTT to be already set up on the Nagios system. Verify that port UDP/162 on the Nagios system can be reached from the CoolCon controller unit.
Download the Nagios plugin check_knuerr_coolcon.sh and place it in the plugins directory of your Nagios system, in this example /usr/lib/nagios/plugins/:
$ mv -i check_knuerr_coolcon.sh /usr/lib/nagios/plugins/
$ chmod 755 /usr/lib/nagios/plugins/check_knuerr_coolcon.sh
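Before wiring the plugin into Nagios, it can be run by hand once per check category to see which sensors the unit actually provides. The sketch below assumes the plugin path from this example and uses only the -H and -C options that appear in the command definitions of this article. The function names are hypothetical. The exit code mapping follows the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL, anything else UNKNOWN):

```shell
# Path taken from the example above; override PLUGIN if yours differs.
PLUGIN="${PLUGIN:-/usr/lib/nagios/plugins/check_knuerr_coolcon.sh}"

# Map a plugin exit code to its Nagios state name.
nagios_state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Run every check category against one host and print the resulting state.
run_coolcon_checks() {
    local host="$1" category rc
    for category in energy fan humidity temperature valve waterflow watertemperature; do
        "$PLUGIN" -H "$host" -C "$category"
        rc=$?
        echo "-> $category: $(nagios_state_name "$rc") (exit code $rc)"
    done
}
```

For example, run_coolcon_checks coolcon1 prints the plugin output for each category followed by the state. Categories whose sensors are not installed in a basic setup can then be left out of the service definitions.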
Define the following Nagios commands. In this example this is done in the file /etc/nagios-plugins/config/check_coolcon.cfg:
# check Knuerr CoolCon/CoolLoop energy status
define command {
    command_name    check_coolcon_energy
    command_line    $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C energy
}
# check Knuerr CoolCon/CoolLoop fan status
define command {
    command_name    check_coolcon_fan
    command_line    $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C fan
}
# check Knuerr CoolCon/CoolLoop humidity status
define command {
    command_name    check_coolcon_humidity
    command_line    $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C humidity
}
# check Knuerr CoolCon/CoolLoop temperature status
define command {
    command_name    check_coolcon_temperature
    command_line    $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C temperature
}
# check Knuerr CoolCon/CoolLoop valve status
define command {
    command_name    check_coolcon_valve
    command_line    $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C valve
}
# check Knuerr CoolCon/CoolLoop waterflow status
define command {
    command_name    check_coolcon_waterflow
    command_line    $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C waterflow
}
# check Knuerr CoolCon/CoolLoop watertemperature status
define command {
    command_name    check_coolcon_watertemperature
    command_line    $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C watertemperature
}
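Since the seven command definitions differ only in the check category, they can also be generated instead of typed. This is just a convenience sketch. The function name gen_coolcon_commands is made up for this example, and the output is meant to match the definitions above:

```shell
# Print one Nagios command definition per CoolCon check category.
gen_coolcon_commands() {
    local category
    for category in energy fan humidity temperature valve waterflow watertemperature; do
        # The format strings are single-quoted, so $USER1$ and $HOSTNAME$
        # reach the output as literal Nagios macros; %s is the category.
        printf '# check Knuerr CoolCon/CoolLoop %s status\n' "$category"
        printf 'define command {\n'
        printf '    command_name    check_coolcon_%s\n' "$category"
        printf '    command_line    $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C %s\n' "$category"
        printf '}\n\n'
    done
}
```

Redirecting the output, e.g. gen_coolcon_commands > /etc/nagios-plugins/config/check_coolcon.cfg, writes the file used in this example. If your units lack some sensors, drop the matching categories from the loop.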
Define a group of services in your Nagios configuration to be checked for each CoolLoop system:
# check snmpd
define service {
    use                     generic-service
    hostgroup_name          coolcon
    service_description     Check_SNMPDv2
    check_command           check_snmpdv2
}
# check_coolcon_energy
define service {
    use                     generic-service-pnp
    hostgroup_name          coolcon
    service_description     Check_CoolCon_Energy
    check_command           check_coolcon_energy
}
# check_coolcon_fan
define service {
    use                     generic-service-pnp
    hostgroup_name          coolcon
    service_description     Check_CoolCon_Fan
    check_command           check_coolcon_fan
}
# check_coolcon_humidity
define service {
    use                     generic-service-pnp
    hostgroup_name          coolcon
    service_description     Check_CoolCon_Humidity
    check_command           check_coolcon_humidity
}
# check_coolcon_temperature
define service {
    use                     generic-service-pnp
    hostgroup_name          coolcon
    service_description     Check_CoolCon_Temp
    check_command           check_coolcon_temperature
}
# check_coolcon_valve
define service {
    use                     generic-service-pnp
    hostgroup_name          coolcon
    service_description     Check_CoolCon_Valve
    check_command           check_coolcon_valve
}
# check_coolcon_waterflow
define service {
    use                     generic-service-pnp
    hostgroup_name          coolcon
    service_description     Check_CoolCon_Waterflow
    check_command           check_coolcon_waterflow
}
# check_coolcon_watertemperature
define service {
    use                     generic-service-pnp
    hostgroup_name          coolcon
    service_description     Check_CoolCon_WaterTemp
    check_command           check_coolcon_watertemperature
}
Replace generic-service with your Nagios service template. Replace generic-service-pnp with your Nagios service template that has performance data processing enabled.
Define a service dependency to run the above checks only if the Check_SNMPDv2 check was run successfully:
# Knuerr CoolCon SNMPD dependencies
define servicedependency {
    hostgroup_name                  coolcon
    service_description             Check_SNMPDv2
    dependent_service_description   Check_CoolCon_.*
    execution_failure_criteria      c,p,u,w
    notification_failure_criteria   c,p,u,w
}
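The dependent_service_description Check_CoolCon_.* is a regular expression. As far as I know, Nagios only expands such patterns in object definitions when use_regexp_matching=1 is set in the main configuration file, so it is worth checking this before relying on the dependency. A small sketch of such a check follows. The default path is the nagios.cfg used in this article and the function name is made up:

```shell
# Return success if regular expression matching is switched on in nagios.cfg.
regexp_matching_enabled() {
    local cfg="${1:-/etc/nagios3/nagios.cfg}"
    # Look for an uncommented "use_regexp_matching=1" line, tolerating
    # whitespace around the key, the equals sign and the value.
    grep -Eq '^[[:space:]]*use_regexp_matching[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$cfg"
}
```

For example, regexp_matching_enabled || echo "set use_regexp_matching=1 in nagios.cfg" prints a hint when the option is missing or disabled. The configuration check at the end of this list should also complain if the pattern cannot be resolved to any service.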
Define hosts in your Nagios configuration for each CoolLoop device. In this example it is named coolcon1:
define host {
    use         coolcon
    host_name   coolcon1
    alias       Knuerr CoolCon CoolLoop 1
    address     10.0.0.1
    parents     parent_lan
}
Replace coolcon with your Nagios host template for the CoolCon controller units. Adjust the address and parents parameters according to your environment.
Define a hostgroup in your Nagios configuration for all CoolLoop devices. In this example it is named coolcon. The above checks are run against each member of the hostgroup:
define hostgroup {
    hostgroup_name  coolcon
    alias           Knuerr CoolCon/CoolLoop
    members         coolcon1
}
Run a configuration check and if successful reload the Nagios process:
$ /usr/sbin/nagios3 -v /etc/nagios3/nagios.cfg
$ /etc/init.d/nagios3 reload
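The "only if successful" part can be enforced by tying the reload to the exit status of the configuration check. A minimal sketch, using the paths from this example as defaults. The variable and function names are made up and can be overridden for other installations:

```shell
# Paths from this example; override them for other installations.
NAGIOS_BIN="${NAGIOS_BIN:-/usr/sbin/nagios3}"
NAGIOS_CFG="${NAGIOS_CFG:-/etc/nagios3/nagios.cfg}"
NAGIOS_INIT="${NAGIOS_INIT:-/etc/init.d/nagios3}"

# Reload Nagios only when the configuration check exits with status 0.
nagios_check_and_reload() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null; then
        "$NAGIOS_INIT" reload
    else
        echo "configuration check failed, not reloading" >&2
        return 1
    fi
}
```

The output of the check is discarded here to keep the function quiet. When it fails, run the verification command by hand to see the actual error messages.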
The new hosts and services should soon show up in the Nagios web interface.
If the optional SNMP trap step (step 2 in the list above) was carried out, SNMPTT must also be configured to understand the incoming SNMP traps from the CoolCon controller units. This can be achieved with the following steps:
Request a current version of the CoolCon SNMP MIB file from Knürr / Emerson. In this example it's 080104140000010a_KNUERR-COOLCON-MIB-V10.mib. Transfer the file 080104140000010a_KNUERR-COOLCON-MIB-V10.mib to the Nagios server.
Convert the SNMP MIB definitions in 080104140000010a_KNUERR-COOLCON-MIB-V10.mib into a format that SNMPTT can understand.
$ /opt/snmptt/snmpttconvertmib --in=MIB/080104140000010a_KNUERR-COOLCON-MIB-V10.mib --out=/opt/snmptt/conf/snmptt.conf.knuerr-coolcon
...
Done

Total translations: 201
Successful translations: 201
Failed translations: 0
Edit the trap severity according to your requirements, e.g.:
$ vim /opt/snmptt/conf/snmptt.conf.knuerr-coolcon
...
EVENT fans .1.3.6.1.4.1.2769.2.1.5.0.1 "Status Events" Warning
...
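With 201 translated events, editing severities by hand gets tedious. Since the severity is the last field of an EVENT line (EVENT name OID "category" severity), it can also be changed with a small script. This is only a sketch. The function name is made up, and it assumes event names without spaces, as in the fans example above:

```shell
# Set the severity of one named event in an SNMPTT configuration file.
# Usage: set_event_severity <conf file> <event name> <severity>
set_event_severity() {
    local conf="$1" event="$2" severity="$3" tmp
    tmp="$(mktemp)" || return 1
    # On the matching EVENT line replace the last field (the severity);
    # every other line is passed through unchanged.
    if awk -v ev="$event" -v sev="$severity" '
            $1 == "EVENT" && $2 == ev { $NF = sev }
            { print }
        ' "$conf" > "$tmp"; then
        # Write back in place so the original file permissions are kept.
        cat "$tmp" > "$conf"
    fi
    rm -f "$tmp"
}
```

For example, set_event_severity /opt/snmptt/conf/snmptt.conf.knuerr-coolcon fans Warning gives the same result as the manual edit shown above. Keeping a copy of the generated file before changing it in bulk is a good idea.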
Add the new configuration file to be included in the global SNMPTT configuration and restart the SNMPTT daemon:
$ vim /opt/snmptt/snmptt.ini
...
[TrapFiles]
snmptt_conf_files = <<END
...
/opt/snmptt/conf/snmptt.conf.knuerr-coolcon
...
END
$ /etc/init.d/snmptt reload
Download the Nagios plugin check_snmp_traps.sh and place it in the plugins directory of your Nagios system, in this example /usr/lib/nagios/plugins/:
$ mv -i check_snmp_traps.sh /usr/lib/nagios/plugins/
$ chmod 755 /usr/lib/nagios/plugins/check_snmp_traps.sh
Define the following Nagios command to check for SNMP traps in the SNMPTT database. In this example this is done in the file /etc/nagios-plugins/config/check_snmp_traps.cfg:
# check for snmp traps
define command {
    command_name    check_snmp_traps
    command_line    $USER1$/check_snmp_traps.sh -H $HOSTNAME$:$HOSTADDRESS$ -u <user> -p <pass> -d <snmptt_db>
}
Replace user, pass and snmptt_db with values suitable for your SNMPTT database environment.
Add another service in your Nagios configuration to be checked for each CoolLoop device:
# check snmptraps
define service {
    use                     generic-service
    hostgroup_name          coolcon
    service_description     Check_SNMP_traps
    check_command           check_snmp_traps
}
Optional: Define a serviceextinfo to display a folder icon next to the Check_SNMP_traps service check for each CoolLoop device. This icon provides a direct link to the SNMPTT web interface with a filter for the selected host:
define serviceextinfo {
    hostgroup_name          coolcon
    service_description     Check_SNMP_traps
    notes                   SNMP Alerts
    #notes_url              http://<hostname>/nagios3/nagtrap/index.php?hostname=$HOSTNAME$
    #notes_url              http://<hostname>/nagios3/nsti/index.php?perpage=100&hostname=$HOSTNAME$
}
Uncomment the notes_url depending on which web interface (nagtrap or nsti) is used. Replace hostname with the FQDN or IP address of the server running the web interface.
Run a configuration check and if successful reload the Nagios process:
$ /usr/sbin/nagios3 -v /etc/nagios3/nagios.cfg
$ /etc/init.d/nagios3 reload
Optional: If you're running PNP4Nagios v0.6 or later to graph Nagios performance data, you can use the PNP4Nagios templates in pnp4nagios_coolcon.tar.bz2 to beautify the graphs. Download the PNP4Nagios templates pnp4nagios_coolcon.tar.bz2 and place them in the PNP4Nagios template directory, in this example /usr/share/pnp4nagios/html/templates/:
$ tar jxf pnp4nagios_coolcon.tar.bz2
$ mv -i check_coolcon_*.php /usr/share/pnp4nagios/html/templates/
$ chmod 644 /usr/share/pnp4nagios/html/templates/check_coolcon_*.php
The following image shows an example of what the PNP4Nagios graphs look like for a CoolLoop unit:
All done, you should now have a complete Nagios-based monitoring solution for your Knürr / Emerson CoolLoop systems.
Comments
Can you also show screenshots of the perfdata output shown in Nagios?