bityard Blog

// Nagios Monitoring - Rittal CMC-TC with LCP

We use Rittal LCP (Liquid Cooling Package) units to chill the 19" racks and equipment in the datacenters. The LCPs come with their own Rittal CMC-TC units for management and monitoring purposes. With check_rittal_health there is already a Nagios plugin to monitor Rittal CMC-TC units. Unfortunately this plugin didn't cover the LCPs, which come with a plethora of built-in sensors. The existing plugin also didn't allow setting individual monitoring thresholds. Therefore I modified the existing plugin to accommodate our needs. The modified version can be downloaded here: check_rittal_health.pl.

The whole setup for monitoring Rittal CMC-TC and LCP with Nagios looks like this:

  1. Configure your CMC-TC unit, see the manual here: CMC-TC Basic CMC DK 7320.111 - Montage, Installation und Bedienung. The essential settings are the network configuration, a user for SNMPv3 access, and an SNMP trap receiver (your Nagios server running SNMPTT). Optional, but highly recommended, are setting an NTP server, changing the default user passwords, and disabling insecure services (Telnet, FTP, HTTP).

  2. Download the Nagios plugin check_rittal_health.pl and place it in the plugins directory of your Nagios system, in this example /usr/lib/nagios/plugins/:

    $ mv -i check_rittal_health.pl /usr/lib/nagios/plugins/
    $ chmod 755 /usr/lib/nagios/plugins/check_rittal_health.pl
    
  3. Define the following Nagios commands. In this example this is done in the file /etc/nagios-plugins/config/check_cmc.cfg:

    # check Rittal CMC status
    define command {
        command_name    check_cmc_status
        command_line    $USER1$/check_rittal_health.pl --hostname $HOSTNAME$ --protocol $ARG1$ --username $ARG2$ --authpassword $ARG3$ --customthresholds "$ARG4$"
    }

  4. Verify that a generic check command for a running SNMPD is already present in your Nagios configuration. If not, add a new check command like this:

    define command {
        command_name    check_snmpdv3
        command_line    $USER1$/check_snmp -H $HOSTADDRESS$ -o .1.3.6.1.2.1.1.3.0 -P 3 -t 30 -L $ARG1$ -U $ARG2$ -A $ARG3$
    }

    Verify that a generic check command for an SSH service is already present in your Nagios configuration. If not, add a new check command like this:

    # 'check_ssh' command definition
    define command {
        command_name    check_ssh
        command_line    /usr/lib/nagios/plugins/check_ssh -t 20 '$HOSTADDRESS$'
    }

    Verify that a generic check command for an HTTPS service is already present in your Nagios configuration. If not, add a new check command like this:

    # 'check_https_port_uri' command definition
    define command {
        command_name    check_https_port_uri
        command_line    /usr/lib/nagios/plugins/check_http --ssl -I '$HOSTADDRESS$' -p '$ARG1$' -u '$ARG2$'
    }

  5. Define a group of services in your Nagios configuration to be checked for each CMC-TC device:

    # check host alive
    define service {
        use                     generic-service-pnp
        hostgroup_name          cmc
        service_description     Check_host_alive
        check_command           check-host-alive
    }
    
    # check sshd
    define service {
        use                     generic-service
        hostgroup_name          cmc
        service_description     Check_SSH
        check_command           check_ssh
    }
    
    # check snmpd
    define service {
        use                     generic-service
        hostgroup_name          cmc
        service_description     Check_SNMPDv3
        check_command           check_snmpdv3!authNoPriv!<user>!<pass>
    }
    
    # check httpd
    define service {
        use                     generic-service-pnp
        hostgroup_name          cmc
        service_description     Check_service_https
        check_command           check_https_port_uri!443!/
    }
    
    # check Rittal CMC status
    define service {
        use                     generic-service-pnp
        servicegroups           snmpchecks
        hostgroup_name          cmc
        service_description     Check_CMC_Status
        check_command           check_cmc_status!3!<user>!<pass>!airTemp,15:32,10:35\;coolingCapacity,0:10000,0:15000\;events,0:1,0:2\;fan,450:2000,400:2500\;temp,15:30,10:35\;waterFlow,0:70,0:100\;waterTemp,10:25,5:30
    }

    Replace generic-service-pnp with your Nagios service template that has performance data processing enabled. Replace <user> and <pass> with the user credentials configured on the CMC-TC devices for SNMPv3 access. Adjust the sensor threshold settings according to your requirements, see the output of check_rittal_health.pl -h for an explanation of the threshold settings format. The semicolons between the sensor entries are escaped as \; because an unescaped ; starts a comment in Nagios object configuration files.

  6. Define a service dependency so that the Check_CMC_Status check only runs if the Check_SNMPDv3 check succeeded:

    # Rittal CMC SNMPD dependencies
    define servicedependency {
        hostgroup_name                  cmc
        service_description             Check_SNMPDv3
        dependent_service_description   Check_CMC_Status
        execution_failure_criteria      c,p,u,w
        notification_failure_criteria   c,p,u,w
    }

  7. Define hosts in your Nagios configuration for each CMC-TC device. In this example it is named cmc-host1:

    define host {
        use         cmc
        host_name   cmc-host1
        alias       Rittal CMC LCP
        address     10.0.0.1
        parents     parent_lan
    }

    Replace cmc with your Nagios host template for the CMC-TC devices. Adjust the address and parents parameters according to your environment.

  8. Define a hostgroup in your Nagios configuration for all CMC-TC devices. In this example it is named cmc. The above checks are run against each member of the hostgroup:

    define hostgroup {
        hostgroup_name  cmc
        alias           Rittal CMC
        members         cmc-host1
    }

  9. Run a configuration check and, if successful, reload the Nagios process:

    $ /usr/sbin/nagios3 -v /etc/nagios3/nagios.cfg
    $ /etc/init.d/nagios3 reload
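
The two commands in step 9 can also be chained so that the reload only happens when the configuration check passes. The following is a minimal sketch and not part of the original setup: the function name reload_if_valid and its two optional parameters are made up for illustration, the defaults are the paths used above, and it assumes that the configuration check exits with a non-zero status when it finds errors.

```shell
#!/bin/sh
# Hypothetical helper: reload Nagios only if the configuration check passes.
# $1 = command that verifies the configuration (default: the check from step 9)
# $2 = command that reloads the daemon (default: the init script from step 9)
reload_if_valid() {
    check_cmd=${1:-"/usr/sbin/nagios3 -v /etc/nagios3/nagios.cfg"}
    reload_cmd=${2:-"/etc/init.d/nagios3 reload"}

    # The variables are deliberately left unquoted, so that the shell
    # splits each command string into the command and its arguments.
    if $check_cmd >/dev/null; then
        $reload_cmd
    else
        echo "configuration check failed, not reloading" >&2
        return 1
    fi
}
```

Called without arguments (as a user allowed to reload Nagios), the function runs the check and the reload from step 9. The output of the check is discarded here, so run the check by hand to see the details if the function reports a failure.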
    

The new hosts and services should soon show up in the Nagios web interface.
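
To double-check a threshold string before putting it into the service definition from step 5, it can help to print it in a more readable form. The sketch below is only an illustration: the function name print_thresholds is made up, and the field layout (sensor name, then what appears to be the warning range, then the critical range) is an assumption inferred from the example values in step 5. The output of check_rittal_health.pl -h remains the authoritative description of the format.

```shell
#!/bin/sh
# Hypothetical helper: print one line per sensor from a threshold string.
# Assumed format of each entry (inferred from the example in step 5, not
# taken from the plugin source):
#   <sensor>,<warn_low>:<warn_high>,<crit_low>:<crit_high>
print_thresholds() {
    # Put every ";"-separated entry on its own line, then split the
    # fields of each entry on "," and ":".
    printf '%s\n' "$1" | tr ';' '\n' |
    while IFS=',:' read -r sensor wlow whigh clow chigh; do
        [ -n "$sensor" ] || continue
        printf '%s warning=%s..%s critical=%s..%s\n' \
            "$sensor" "$wlow" "$whigh" "$clow" "$chigh"
    done
}

# The backslashes used in the Nagios configuration are not needed here,
# because the string is single-quoted for the shell.
print_thresholds 'airTemp,15:32,10:35;fan,450:2000,400:2500'
```

For the two entries in the example call this prints "airTemp warning=15..32 critical=10..35" and "fan warning=450..2000 critical=400..2500", one per line.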

If the Nagios server is running SNMPTT and was configured as an SNMP trap receiver in step 1 above, SNMPTT also needs to be configured to understand the incoming SNMP traps from the CMC-TC devices. This can be achieved with the following steps:

  1. Convert the Rittal SNMP MIB definitions in CMC-TC_MIB_v1.1h.txt into a format that SNMPTT can understand.

    $ /opt/snmptt/snmpttconvertmib --in=MIB/CMC-TC_MIB_v1.1h.txt --out=/opt/snmptt/conf/snmptt.conf.rittal-cmc-tc
    
    ...
    Done
    
    Total translations:        10
    Successful translations:   10
    Failed translations:       0
    
  2. The trap severity settings should be pretty reasonable by default, but you can edit them according to your requirements with:

    $ vim /opt/snmptt/conf/snmptt.conf.rittal-cmc-tc
    
  3. Add the new configuration file to the list of files included by the global SNMPTT configuration and reload the SNMPTT daemon:

    $ vim /opt/snmptt/snmptt.ini
    
    ...
    [TrapFiles]
    snmptt_conf_files = <<END
    ...
    /opt/snmptt/conf/snmptt.conf.rittal-cmc-tc
    ...
    END
    
    $ /etc/init.d/snmptt reload
    
  4. Download the Nagios plugin check_snmp_traps.sh and place it in the plugins directory of your Nagios system, in this example /usr/lib/nagios/plugins/:

    $ mv -i check_snmp_traps.sh /usr/lib/nagios/plugins/
    $ chmod 755 /usr/lib/nagios/plugins/check_snmp_traps.sh
    
  5. Define the following Nagios command to check for SNMP traps in the SNMPTT database. In this example this is done in the file /etc/nagios-plugins/config/check_snmp_traps.cfg:

    # check for snmp traps
    define command {
        command_name    check_snmp_traps
        command_line    $USER1$/check_snmp_traps.sh -H $HOSTNAME$:$HOSTADDRESS$ -u <user> -p <pass> -d <snmptt_db>
    }

    Replace <user>, <pass> and <snmptt_db> with values suitable for your SNMPTT database environment.

  6. Add another service in your Nagios configuration to be checked for each CMC device:

    # check snmptraps
    define service {
        use                     generic-service
        hostgroup_name          cmc
        service_description     Check_SNMP_traps
        check_command           check_snmp_traps
    }

  7. Optional: Define a serviceextinfo to display a folder icon next to the Check_SNMP_traps service check for each CMC device. This icon provides a direct link to the SNMPTT web interface with a filter for the selected host:

    define serviceextinfo {
        hostgroup_name          cmc
        service_description     Check_SNMP_traps
        notes                   SNMP Alerts
        #notes_url               http://<hostname>/nagios3/nagtrap/index.php?hostname=$HOSTNAME$
        #notes_url               http://<hostname>/nagios3/nsti/index.php?perpage=100&hostname=$HOSTNAME$
    }

    Uncomment the notes_url line that matches the web interface in use (nagtrap or nsti). Replace <hostname> with the FQDN or IP address of the server running the web interface.

  8. Run a configuration check and, if successful, reload the Nagios process:

    $ /usr/sbin/nagios3 -v /etc/nagios3/nagios.cfg
    $ /etc/init.d/nagios3 reload
    
  9. Optional: If you're running PNP4Nagios v0.6 or later to graph Nagios performance data, you can use the check_cmc_status.php PNP4Nagios template to beautify the graphs. Download the PNP4Nagios template check_cmc_status.php and place it in the PNP4Nagios template directory, in this example /usr/share/pnp4nagios/html/templates/:

    $ mv -i check_cmc_status.php /usr/share/pnp4nagios/html/templates/
    $ chmod 644 /usr/share/pnp4nagios/html/templates/check_cmc_status.php
    

    The following image shows an example of what the PNP4Nagios graphs look like for a Rittal CMC-TC with an LCP-T3+ unit:

    PNP4Nagios graphs for a Rittal CMC-TC with an LCP-T3+ unit
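
If traps from the CMC-TC devices do not show up as expected, one thing worth checking is whether the converted trap file really ended up inside the snmptt_conf_files block edited in step 3 of the SNMPTT setup. The following sketch shows one way to test this. The function name conf_is_listed is made up for illustration, and the code assumes the block uses the same "<<END" ... "END" layout as in the excerpt from step 3.

```shell
#!/bin/sh
# Hypothetical helper: succeed if a trap file is listed inside the
# snmptt_conf_files block of an snmptt.ini file.
# $1 = path to snmptt.ini, $2 = path of the trap configuration file
conf_is_listed() {
    awk -v want="$2" '
        # start of the block
        /^[ \t]*snmptt_conf_files[ \t]*=[ \t]*<<END/ { inside = 1; next }
        # end of the block
        inside && /^[ \t]*END[ \t]*$/ { inside = 0 }
        inside {
            line = $0
            gsub(/^[ \t]+|[ \t]+$/, "", line)   # trim surrounding whitespace
            if (line == want) found = 1
        }
        END { exit(found ? 0 : 1) }
    ' "$1"
}
```

With the paths used in this article, a call such as conf_is_listed /opt/snmptt/snmptt.ini /opt/snmptt/conf/snmptt.conf.rittal-cmc-tc returns success only if the file is listed, so it can be combined with && in front of the SNMPTT reload command.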

All done, you should now have a complete Nagios-based monitoring solution for your Rittal CMC-TC and LCP devices.

Comments

Andreas Ott
No. 1 @ 2017/04/28 11:52

Hello, I stumbled upon this article while searching for a way to monitor our Rittal LCP with Nagios.

I implemented the checks and services, but actually I receive an error:

(nagios) CMC CRITICAL - snmpwalk returns no mib status name (rittal-cmc-tc-mib), wrong device

Could you help me?

Frank Fegert
No. 2 @ 2017/05/16 08:42

Hello Andreas,

sorry for the delayed reply!

I suspect the issue you're experiencing is either with a dependency of the check plugin or - since the check plugin is rather old - with a changed SNMP OID tree in newer Rittal CMC firmware versions.

Unfortunately I no longer have access to Rittal CMC systems with an attached LCP, so I cannot debug this myself. However, I can try to assist you in finding and fixing this issue.

Can you please run the check plugin with the “-vvv” option and a “snmpwalk -On …” against your Rittal CMC and send me the output of both commands?

Thanks & best regards,

Frank
