bityard Blog

// Nagios Monitoring - Knürr / Emerson CoolLoop

We use Knürr (now Emerson) CoolLoop units to chill the 19" equipment in the datacenters. The CoolLoop units come with their own CoolCon controller units for management and monitoring purposes. Similar to the Rittal CMC-TC, the CoolCon controllers can be queried via SNMP for the status and values of the environmental sensors. The kind and number of sensors depend on the specific configuration that was ordered. They range from simple fan, air temperature and water valve sensors in the basic setup to humidity, additional temperature, water flow, water temperature and electrical current, voltage and energy sensors in the extended setup. To monitor the sensor status and environmental values provided by the CoolCon controller, I wrote the Nagios plugin check_knuerr_coolcon.sh. To run the plugin, SNMP must be activated on the CoolCon controller unit, and network connections from the Nagios system to the CoolCon controller unit on port UDP/161 must be allowed.
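Before setting up the checks, it can help to confirm that the controller actually answers SNMP queries from the Nagios system. The following is a minimal sketch and not part of the plugin: it assumes the net-snmp command line tools are installed, that the controller speaks SNMP v2c, and that the community string is public. The function name check_snmp_reachable is made up for this example. It queries sysDescr.0 from the standard SNMPv2-MIB, which any SNMP agent should answer.

```shell
#!/bin/sh
# Hypothetical pre-flight check: does the CoolCon controller answer SNMP?
# Assumptions: net-snmp's snmpget is installed, SNMP v2c, community "public".

# sysDescr.0 from the standard SNMPv2-MIB.
SYSDESCR_OID=".1.3.6.1.2.1.1.1.0"

# check_snmp_reachable HOST [COMMUNITY]
# Prints a one-line status and returns a Nagios-style exit code.
check_snmp_reachable() {
    host="$1"
    community="${2:-public}"

    if ! command -v snmpget >/dev/null 2>&1; then
        echo "UNKNOWN - snmpget not found"
        return 3
    fi

    # -t 2 -r 1: two second timeout, one retry; -Oqv: print the value only.
    if out=$(snmpget -v 2c -c "$community" -t 2 -r 1 -Oqv "$host" "$SYSDESCR_OID" 2>&1); then
        echo "OK - $host answered: $out"
        return 0
    fi

    echo "CRITICAL - no SNMP answer from $host: $out"
    return 2
}

# Example call, commented out so the file can be sourced without side effects:
# check_snmp_reachable 10.0.0.1 public
```

If this reports CRITICAL, the usual suspects are a disabled SNMP agent on the controller, a wrong community string, or a firewall blocking UDP/161 between the two systems.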

The whole setup for monitoring Knürr / Emerson CoolLoop - and possibly, but untested, also CoolTherm - units with Nagios looks like this:

  1. Enable SNMP queries on the CoolCon controller unit. Verify the port UDP/161 on the CoolCon controller unit can be reached from the Nagios system.

  2. Optional: Enable SNMP traps to be sent to the Nagios system on the CoolCon controller unit. This requires SNMPD and SNMPTT to be already set up on the Nagios system. Verify the port UDP/162 on the Nagios system can be reached from the CoolCon controller unit.

  3. Download the Nagios plugin check_knuerr_coolcon.sh and place it in the plugins directory of your Nagios system, in this example /usr/lib/nagios/plugins/:

    $ mv -i check_knuerr_coolcon.sh /usr/lib/nagios/plugins/
    $ chmod 755 /usr/lib/nagios/plugins/check_knuerr_coolcon.sh
  4. Define the following Nagios commands. In this example this is done in the file /etc/nagios-plugins/config/check_coolcon.cfg:

    # check Knuerr CoolCon/CoolLoop energy status
    define command {
        command_name    check_coolcon_energy
        command_line    $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C energy
    }
    # check Knuerr CoolCon/CoolLoop fan status
    define command {
        command_name    check_coolcon_fan
        command_line    $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C fan
    }
    # check Knuerr CoolCon/CoolLoop humidity status
    define command {
        command_name    check_coolcon_humidity
        command_line    $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C humidity
    }
    # check Knuerr CoolCon/CoolLoop temperature status
    define command {
        command_name    check_coolcon_temperature
        command_line    $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C temperature
    }
    # check Knuerr CoolCon/CoolLoop valve status
    define command {
        command_name    check_coolcon_valve
        command_line    $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C valve
    }
    # check Knuerr CoolCon/CoolLoop waterflow status
    define command {
        command_name    check_coolcon_waterflow
        command_line    $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C waterflow
    }
    # check Knuerr CoolCon/CoolLoop watertemperature status
    define command {
        command_name    check_coolcon_watertemperature
        command_line    $USER1$/check_knuerr_coolcon.sh -H $HOSTNAME$ -C watertemperature
    }
  5. Define a group of services in your Nagios configuration to be checked for each CoolLoop system:

    # check snmpd
    define service {
        use                     generic-service
        hostgroup_name          coolcon
        service_description     Check_SNMPDv2
        check_command           check_snmpdv2
    }
    # check_coolcon_energy
    define service {
        use                     generic-service-pnp
        hostgroup_name          coolcon
        service_description     Check_CoolCon_Energy
        check_command           check_coolcon_energy
    }
    # check_coolcon_fan
    define service {
        use                     generic-service-pnp
        hostgroup_name          coolcon
        service_description     Check_CoolCon_Fan
        check_command           check_coolcon_fan
    }
    # check_coolcon_humidity
    define service {
        use                     generic-service-pnp
        hostgroup_name          coolcon
        service_description     Check_CoolCon_Humidity
        check_command           check_coolcon_humidity
    }
    # check_coolcon_temperature
    define service {
        use                     generic-service-pnp
        hostgroup_name          coolcon
        service_description     Check_CoolCon_Temp
        check_command           check_coolcon_temperature
    }
    # check_coolcon_valve
    define service {
        use                     generic-service-pnp
        hostgroup_name          coolcon
        service_description     Check_CoolCon_Valve
        check_command           check_coolcon_valve
    }
    # check_coolcon_waterflow
    define service {
        use                     generic-service-pnp
        hostgroup_name          coolcon
        service_description     Check_CoolCon_Waterflow
        check_command           check_coolcon_waterflow
    }
    # check_coolcon_watertemperature
    define service {
        use                     generic-service-pnp
        hostgroup_name          coolcon
        service_description     Check_CoolCon_WaterTemp
        check_command           check_coolcon_watertemperature
    }

    Replace generic-service with your Nagios service template. Replace generic-service-pnp with your Nagios service template that has performance data processing enabled.

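  6. Define a service dependency so that the above checks are only run if the Check_SNMPDv2 check succeeded. The .* pattern in dependent_service_description only works if use_regexp_matching=1 is set in nagios.cfg: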

    # Knuerr CoolCon SNMPD dependencies
    define servicedependency {
        hostgroup_name                  coolcon
        service_description             Check_SNMPDv2
        dependent_service_description   Check_CoolCon_.*
        execution_failure_criteria      c,p,u,w
        notification_failure_criteria   c,p,u,w
    }
  7. Define hosts in your Nagios configuration for each CoolLoop device. In this example it is named coolcon1:

    define host {
        use         coolcon
        host_name   coolcon1
        alias       Knuerr CoolCon CoolLoop 1
        address     10.0.0.1
        parents     parent_lan
    }

    Replace coolcon with your Nagios host template for the CoolCon controller units. Adjust the address and parents parameters according to your environment.

  8. Define a hostgroup in your Nagios configuration for all CoolLoop devices. In this example it is named coolcon. The above checks are run against each member of the hostgroup:

    define hostgroup {
        hostgroup_name  coolcon
        alias           Knuerr CoolCon/CoolLoop
        members         coolcon1
    }
  9. Run a configuration check and if successful reload the Nagios process:

    $ /usr/sbin/nagios3 -v /etc/nagios3/nagios.cfg
    $ /etc/init.d/nagios3 reload

The new hosts and services should soon show up in the Nagios web interface.
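Once the plugin is installed, each check can also be run by hand, which is useful for debugging before Nagios schedules it. The following is a minimal sketch, assuming the plugin path and the -H and -C options from the command definitions above. The helper names state_name and run_all_checks and the PLUGIN variable are invented for this example. Exit codes are mapped according to the Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL); anything else is treated as UNKNOWN here.

```shell
#!/bin/sh
# Sketch: run every CoolCon check once by hand and show the resulting state.
# PLUGIN defaults to the example path used above; override it if needed.
PLUGIN="${PLUGIN:-/usr/lib/nagios/plugins/check_knuerr_coolcon.sh}"

# The seven check types used in the command definitions above.
CHECKS="energy fan humidity temperature valve waterflow watertemperature"

# Map a plugin exit code to the state name of the Nagios plugin convention.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# run_all_checks HOST
# Prints one line per check: check type, state, plugin output.
run_all_checks() {
    host="$1"
    for check in $CHECKS; do
        rc=0
        output=$("$PLUGIN" -H "$host" -C "$check" 2>&1) || rc=$?
        printf '%-17s %-8s %s\n' "$check" "$(state_name "$rc")" "$output"
    done
}

# Example call, commented out so the file can be sourced without side effects:
# run_all_checks coolcon1
```

Sensors that are not part of the ordered configuration will most likely not report OK, so the corresponding service definitions can be left out for those units.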

If you completed the optional step 2 in the above list, SNMPTT also needs to be configured to understand the SNMP traps coming in from the CoolCon controller units. This can be achieved with the following steps:

  1. Request a current version of the CoolCon SNMP MIB file from Knürr / Emerson, in this example 080104140000010a_KNUERR-COOLCON-MIB-V10.mib, and transfer it to the Nagios server.

  2. Convert the SNMP MIB definitions in 080104140000010a_KNUERR-COOLCON-MIB-V10.mib into a format that SNMPTT can understand:

    $ /opt/snmptt/snmpttconvertmib --in=MIB/080104140000010a_KNUERR-COOLCON-MIB-V10.mib --out=/opt/snmptt/conf/snmptt.conf.knuerr-coolcon
    ...
    Done
    
    Total translations:        201
    Successful translations:   201
    Failed translations:       0
  3. Edit the trap severity according to your requirements, e.g.:

    $ vim /opt/snmptt/conf/snmptt.conf.knuerr-coolcon
    
    ...
    EVENT fans .1.3.6.1.4.1.2769.2.1.5.0.1 "Status Events" Warning
    ...
  4. Add the new configuration file to the list of files included by the global SNMPTT configuration and reload the SNMPTT daemon:

    $ vim /opt/snmptt/snmptt.ini
    
    ...
    [TrapFiles]
    snmptt_conf_files = <<END
    ...
    /opt/snmptt/conf/snmptt.conf.knuerr-coolcon
    ...
    END
    
    $ /etc/init.d/snmptt reload
  5. Download the Nagios plugin check_snmp_traps.sh and place it in the plugins directory of your Nagios system, in this example /usr/lib/nagios/plugins/:

    $ mv -i check_snmp_traps.sh /usr/lib/nagios/plugins/
    $ chmod 755 /usr/lib/nagios/plugins/check_snmp_traps.sh
  6. Define the following Nagios command to check for SNMP traps in the SNMPTT database. In this example this is done in the file /etc/nagios-plugins/config/check_snmp_traps.cfg:

    # check for snmp traps
    define command {
        command_name    check_snmp_traps
        command_line    $USER1$/check_snmp_traps.sh -H $HOSTNAME$:$HOSTADDRESS$ -u <user> -p <pass> -d <snmptt_db>
    }

    Replace <user>, <pass> and <snmptt_db> with values suitable for your SNMPTT database environment.

  7. Add another service in your Nagios configuration to be checked for each CoolLoop device:

    # check snmptraps
    define service {
        use                     generic-service
        hostgroup_name          coolcon
        service_description     Check_SNMP_traps
        check_command           check_snmp_traps
    }
  8. Optional: Define a serviceextinfo to display a folder icon next to the Check_SNMP_traps service check for each CoolLoop device. This icon provides a direct link to the SNMPTT web interface with a filter for the selected host:

    define serviceextinfo {
        hostgroup_name          coolcon
        service_description     Check_SNMP_traps
        notes                   SNMP Alerts
        #notes_url               http://<hostname>/nagios3/nagtrap/index.php?hostname=$HOSTNAME$
        #notes_url               http://<hostname>/nagios3/nsti/index.php?perpage=100&hostname=$HOSTNAME$
    }

    Uncomment the notes_url line that matches the web interface in use (nagtrap or nsti). Replace <hostname> with the FQDN or IP address of the server running the web interface.

  9. Run a configuration check and if successful reload the Nagios process:

    $ /usr/sbin/nagios3 -v /etc/nagios3/nagios.cfg
    $ /etc/init.d/nagios3 reload
  10. Optional: If you're running PNP4Nagios v0.6 or later to graph Nagios performance data, you can use the PNP4Nagios templates in pnp4nagios_coolcon.tar.bz2 to beautify the graphs. Download the archive, unpack it and place the templates it contains in the PNP4Nagios template directory, in this example /usr/share/pnp4nagios/html/templates/:

    $ tar jxf pnp4nagios_coolcon.tar.bz2
    $ mv -i check_coolcon_*.php /usr/share/pnp4nagios/html/templates/
    $ chmod 644 /usr/share/pnp4nagios/html/templates/check_coolcon_*.php

    The following graphs show examples of the PNP4Nagios output for a CoolLoop unit:

    [Image: PNP4Nagios graph for the relative fan speed in a CoolLoop device]
    [Image: PNP4Nagios graph for the relative humidity in a CoolLoop device]
    [Image: PNP4Nagios graph for the relative valve setting in a CoolLoop device]
    [Image: PNP4Nagios graph for the air temperature on the warm and cold side of a CoolLoop device]
    [Image: PNP4Nagios graph for the water flow and cooling power in a CoolLoop device]
    [Image: PNP4Nagios graph for the water in and out temperature in a CoolLoop device]
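
The severity edit in step 3 of the list above can also be scripted when many events need the same treatment. The following is a minimal sketch, assuming each event definition is a single line of the form shown in step 3, with the severity as the last field. The function name set_severity is invented for this example. It reads the configuration on standard input and writes the modified version to standard output, so the original file is left untouched.

```shell
#!/bin/sh
# Sketch: change the severity of one EVENT line in an SNMPTT configuration.
# Assumption: the line looks like
#   EVENT <name> <oid> "<category>" <severity>
# with the severity as the last whitespace-separated field.

# set_severity EVENT_NAME SEVERITY < input > output
set_severity() {
    awk -v name="$1" -v sev="$2" '
        # Only touch EVENT lines whose second field is the requested name.
        $1 == "EVENT" && $2 == name {
            # Replace the last field, keeping the rest of the line as is.
            sub(/[^ \t]+[ \t]*$/, sev)
        }
        { print }
    '
}

# Example call, commented out so the file can be sourced without side effects:
# set_severity fans Critical \
#     < /opt/snmptt/conf/snmptt.conf.knuerr-coolcon \
#     > /tmp/snmptt.conf.knuerr-coolcon.new
```

Review the result with diff before copying it over the original file, then reload SNMPTT as in step 4.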

All done, you should now have a complete Nagios-based monitoring solution for your Knürr / Emerson CoolLoop systems.
