bityard Blog

// Nagios Monitoring - Infortrend EonStor (Update)

After gaining some operational experience with the previously introduced Nagios plugin for Infortrend EonStor and EonStor DS storage arrays, some issues became apparent which needed to be addressed in an update. The updated version of the Nagios plugin check_infortrend.sh fixes the following issues:

  1. Newer array firmware versions use different severity strings for event log entries. The Nagios plugin has been adapted to be able to cope with these.

  2. Due to a limitation in the array firmware, event log entries available via SNMP are not cleared when the corresponding event log entries are cleared via RAIDWatch or SANWatch. They are only cleared during a controller reboot. A support case with Infortrend confirmed this and it is currently under investigation whether this feature could be added for future releases of the array firmware.

    In the meantime a workaround for this behaviour has been added to the Nagios plugin. To avoid constant alarms about events that have already been dealt with, the date of the event is now taken into consideration along with its severity. By default only events from the past 60 minutes are evaluated. The default can be overridden with the new “-t <minutes>” command line option of the Nagios plugin, e.g. “-t 120” evaluates events from the past 2 hours. You might need to adjust the definition of the check_infortrend_events command to suit your environment, e.g.:

    # check Infortrend ESDS event status
    define command {
        command_name    check_infortrend_events
        command_line    $USER1$/check_infortrend.sh -H $HOSTNAME$ -C events -t <minutes>
    }

    There are several pitfalls here though:

    • Make sure the timeframe aligns with the Nagios configuration options max_check_attempts, normal_check_interval and retry_check_interval, so that non-normal events can actually trigger an alarm. Generally speaking, the cumulative time span resulting from these three options for your Check_IFT_Events service check must be smaller than the timeframe given in the definition of the check_infortrend_events command.

    • Make sure you set the date and time on the array to the correct values or preferably use an SNTP server. Also, make sure the Nagios server and the array use the same timezone, preferably UTC in both cases.
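
To make the first pitfall more concrete, the following sketch estimates the worst-case time a new event needs to reach a hard state and compares it with the “-t” window. The formula (one normal_check_interval until the first failing check, followed by max_check_attempts - 1 retries) is only one reading of the cumulated timeframe mentioned above. The function names and example values are made up for illustration, and the Nagios default interval_length of 60 seconds is assumed, so all values are minutes.

```shell
#!/bin/sh
# Worst-case minutes until an event can reach a hard state:
# one normal interval until the first failing check, then
# (max_check_attempts - 1) retries. Assumes interval_length=60.
worst_case_minutes() {
    normal_interval=$1
    retry_interval=$2
    max_attempts=$3
    echo $(( normal_interval + (max_attempts - 1) * retry_interval ))
}

# Succeeds (exit status 0) if the -t window is larger than the worst case.
# Arguments: window normal_check_interval retry_check_interval max_check_attempts
window_is_sufficient() {
    window=$1
    needed=$(worst_case_minutes "$2" "$3" "$4")
    [ "$window" -gt "$needed" ]
}

# Hypothetical example: check every 10 minutes, retry every 5 minutes,
# 3 attempts, plugin called with "-t 60".
if window_is_sufficient 60 10 5 3; then
    echo "OK: -t 60 covers the worst case of $(worst_case_minutes 10 5 3) minutes"
else
    echo "WARNING: increase -t or shorten the check intervals"
fi
```

With these example values the worst case is 10 + 2 * 5 = 20 minutes, so the default window of 60 minutes would be sufficient.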

  3. Not exactly an issue with the Nagios plugin per se, but related to the monitoring of Infortrend storage arrays in general, is an adapted SNMPTT configuration. Since all SNMP traps generated from events of Infortrend arrays are indiscriminately reported with the same OID, a bit more extensive parsing of the SNMP traps needs to occur. In order to achieve this, the following entries should be used in the snmptt.conf.infortrend SNMPTT configuration file:

    /opt/snmptt/conf/snmptt.conf.infortrend
    EVENT iftEventText .1.3.6.1.4.1.1714.0.1 "Status Events" Critical
    FORMAT The description of the event $*
    MATCH $*: ( \[Alert Condition\])
    MATCH $*: ( \[Critical\])
    MATCH $*: ( \[Critical Error\])
    MATCH $*: ( \[Error\])
    SDESC
    The description of the event
    Variables:
    EDESC
     
    EVENT iftEventText .1.3.6.1.4.1.1714.0.1 "Status Events" Warning
    FORMAT The description of the event $*
    MATCH $*: ( \[Warning Condition\])
    MATCH $*: ( \[Warning\])
    SDESC
    The description of the event
    Variables:
    EDESC
     
    EVENT iftEventText .1.3.6.1.4.1.1714.0.1 "Status Events" Normal
    FORMAT The description of the event $*
    MATCH $*: ( \[Information\])
    MATCH $*: ( \[Notification\])
    SDESC
    The description of the event
    Variables:
    EDESC

    and the SNMPTT daemon should be restarted.
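
Before deploying the entries above, it can help to dry-run a few event texts against the same severity strings. The following is a rough sketch that mirrors the MATCH rules with a shell case statement. The function name classify_event and the sample text are hypothetical, and the patterns are simplified to plain substring matches, so this approximates the rules without reproducing SNMPTT's regular expression matching.

```shell
#!/bin/sh
# Map an event text to the SNMPTT severity of the matching EVENT block above.
# Simplification: plain substring match on the bracketed severity string.
classify_event() {
    case "$1" in
        *"[Alert Condition]"*|*"[Critical]"*|*"[Critical Error]"*|*"[Error]"*)
            echo "Critical" ;;
        *"[Warning Condition]"*|*"[Warning]"*)
            echo "Warning" ;;
        *"[Information]"*|*"[Notification]"*)
            echo "Normal" ;;
        *)
            # No EVENT block above would match this text
            echo "Unmatched" ;;
    esac
}

# Hypothetical sample text, not taken from a real array
classify_event "Controller: [Warning] hypothetical sample text"   # prints: Warning
```

An event text that ends up as "Unmatched" would hint at a severity string that a newer firmware version introduced and that still needs its own MATCH line.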

If you happen to find any additional issues with monitoring your Infortrend systems with this Nagios plugin, please feel free to leave a comment or drop me a note via email.

// Nagios Monitoring - Infortrend EonStor

Please be sure to also read the update Nagios Monitoring - Infortrend EonStor (Update) to this blog post.

We use several Infortrend EonStor and EonStor DS storage arrays as low-cost, bulk storage units in our datacenters. With check_infortrend.pl, check_infortrend and check_ift_{dev|hdd|ld}.pl there are already several Nagios plugins to monitor Infortrend EonStor storage arrays. Since I wanted a low-overhead, shell-based plugin with support for performance data, I decided to write my own version, check_infortrend.sh. In order to run the Nagios plugin, SNMP must be activated on the Infortrend storage array and a network connection from the Nagios system to the Infortrend device on port UDP/161 must be allowed.

The whole setup looks like this:

  1. Enable SNMP queries on the Infortrend storage array. Login via Telnet or SSH and navigate to:

    -> view and edit Configuration parameters
       -> Communication Parameters
          -> Network Protocol Support
             -> SNMP - Disabled
                -> Enable SNMP Protocol?
                   -> Yes

    Verify the port UDP/161 on the Infortrend device can be reached from the Nagios system.
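
One possible way to do this verification is to query a standard MIB-II object from the Nagios system. The sketch below assumes that the net-snmp snmpget tool is installed and that the array answers SNMP v2c requests for the community "public". Neither the protocol version nor the community string is given in this post, so adjust both to your array's settings. The function name snmp_reachable is made up for this example.

```shell
#!/bin/sh
# Query sysDescr.0 (1.3.6.1.2.1.1.1.0, a standard MIB-II object) to confirm
# that the device answers SNMP requests on UDP/161.
# Assumptions: net-snmp snmpget, SNMP v2c, community defaults to "public".
snmp_reachable() {
    host=$1
    community=${2:-public}
    # -t 2: wait 2 seconds per attempt, -r 1: retry once
    snmpget -v 2c -c "$community" -t 2 -r 1 "$host" 1.3.6.1.2.1.1.1.0 >/dev/null 2>&1
}
```

Usage, with esds1 being the example host name used further below: snmp_reachable esds1 && echo "SNMP reachable" || echo "SNMP not reachable".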

  2. Optional: Enable SNMP traps to be sent to the Nagios system on the Infortrend storage array. This requires SNMPD and SNMPTT to be already set up on the Nagios system. Verify the port UDP/162 on the Nagios system can be reached from the Infortrend device.

  3. Download the Nagios plugin check_infortrend.sh and place it in the plugins directory of your Nagios system, in this example /usr/lib/nagios/plugins/:

    $ mv -i check_infortrend.sh /usr/lib/nagios/plugins/
    $ chmod 755 /usr/lib/nagios/plugins/check_infortrend.sh
    
  4. Define the following Nagios commands. In this example this is done in the file /etc/nagios-plugins/config/check_infortrend.cfg:

    # check Infortrend ESDS cache status
    define command {
        command_name    check_infortrend_cache
        command_line    $USER1$/check_infortrend.sh -H $HOSTNAME$ -C cache
    }
    # check Infortrend ESDS controller status
    define command {
        command_name    check_infortrend_controller
        command_line    $USER1$/check_infortrend.sh -H $HOSTNAME$ -C controller
    }
    # check Infortrend ESDS disk status
    define command {
        command_name    check_infortrend_disk
        command_line    $USER1$/check_infortrend.sh -H $HOSTNAME$ -C disk
    }
    # check Infortrend ESDS logicaldrive status
    define command {
        command_name    check_infortrend_logicaldrive
        command_line    $USER1$/check_infortrend.sh -H $HOSTNAME$ -C logicaldrive
    }
    # check Infortrend ESDS logicalunit status
    define command {
        command_name    check_infortrend_logicalunit
        command_line    $USER1$/check_infortrend.sh -H $HOSTNAME$ -C logicalunit
    }
    # check Infortrend ESDS event status
    define command {
        command_name    check_infortrend_events
        command_line    $USER1$/check_infortrend.sh -H $HOSTNAME$ -C events
    }

  5. Define a group of services in your Nagios configuration to be checked for each Infortrend system:

    # check snmpd
    define service {
        use                     generic-service
        hostgroup_name          infortrend
        service_description     Check_SNMPD
        check_command           check_snmpd
    }
    # check_infortrend_cache
    define service {
        use                     generic-service-pnp
        hostgroup_name          infortrend
        service_description     Check_IFT_Cache
        check_command           check_infortrend_cache
    }
    # check_infortrend_controller
    define service {
        use                     generic-service
        hostgroup_name          infortrend
        service_description     Check_IFT_Controller
        check_command           check_infortrend_controller
    }
    # check_infortrend_disk
    define service {
        use                     generic-service-pnp
        hostgroup_name          infortrend
        service_description     Check_IFT_Disk
        check_command           check_infortrend_disk
    }
    # check_infortrend_logicaldrive
    define service {
        use                     generic-service-pnp
        hostgroup_name          infortrend
        service_description     Check_IFT_LogicalDrive
        check_command           check_infortrend_logicaldrive
    }
    # check_infortrend_logicalunit
    define service {
        use                     generic-service-pnp
        hostgroup_name          infortrend
        service_description     Check_IFT_LogicalUnit
        check_command           check_infortrend_logicalunit
    }
    # check_infortrend_events
    define service {
        use                     generic-service
        hostgroup_name          infortrend
        service_description     Check_IFT_Events
        check_command           check_infortrend_events
    }

    Replace generic-service with your Nagios service template. Replace generic-service-pnp with your Nagios service template that has performance data processing enabled.

  6. Define a service dependency to run the above checks only if the Check_SNMPD check was successful:

    # Infortrend SNMPD dependencies
    define servicedependency {
        hostgroup_name                  infortrend
        service_description             Check_SNMPD
        dependent_service_description   Check_IFT_.*
        execution_failure_criteria      c,p,u,w
        notification_failure_criteria   c,p,u,w
    }
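
The dependent_service_description above uses the pattern Check_IFT_.*. As far as I know, Nagios only treats such values as regular expressions when use_regexp_matching=1 is set in the main configuration file, so it may be worth confirming this before relying on the dependency. The following sketch checks for that setting. The function name is made up for this example, and the path in the usage note is the one used in the configuration check further below.

```shell
#!/bin/sh
# Succeeds (exit status 0) if the given Nagios main configuration file
# contains an active (not commented out) "use_regexp_matching=1" line.
regexp_matching_enabled() {
    grep -Eq '^[[:space:]]*use_regexp_matching[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$1"
}
```

Usage: regexp_matching_enabled /etc/nagios3/nagios.cfg || echo "use_regexp_matching is not enabled".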

  7. Define hosts in your Nagios configuration for each Infortrend device. In this example it is named esds1:

    define host {
        use         disk
        host_name   esds1
        alias       Infortrend Disk Storage 1
        address     10.0.0.1
        parents     parent_lan
    }

    Replace disk with your Nagios host template for storage devices. Adjust the address and parents parameters according to your environment.

  8. Define a hostgroup in your Nagios configuration for all Infortrend devices. In this example it is named infortrend. The above checks are run against each member of the hostgroup:

    define hostgroup {
        hostgroup_name  infortrend
        alias           Infortrend Disk Storages
        members         esds1
    }

  9. Run a configuration check and if successful reload the Nagios process:

    $ /usr/sbin/nagios3 -v /etc/nagios3/nagios.cfg
    $ /etc/init.d/nagios3 reload
    

The new hosts and services should soon show up in the Nagios web interface.
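
The configuration check and the reload from the last step can also be chained, so that a broken configuration never reaches the running Nagios process. This is only a small sketch. The paths are the ones from this example, while the function name and the NAGIOS_* variables are made up here and can be overridden from the environment.

```shell
#!/bin/sh
# Paths as used in this example; override via the environment if needed.
NAGIOS_BIN=${NAGIOS_BIN:-/usr/sbin/nagios3}
NAGIOS_CFG=${NAGIOS_CFG:-/etc/nagios3/nagios.cfg}
NAGIOS_INIT=${NAGIOS_INIT:-/etc/init.d/nagios3}

# Reload Nagios only if the configuration check passes.
validate_and_reload() {
    if "$NAGIOS_BIN" -v "$NAGIOS_CFG" >/dev/null; then
        "$NAGIOS_INIT" reload
    else
        echo "Configuration check failed, not reloading" >&2
        return 1
    fi
}
```

Calling validate_and_reload hides the output of the configuration check, so when it fails, run the check by hand as shown above to see the actual error messages.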

If the optional step number 2 in the above list was done, SNMPTT also needs to be configured to be able to understand the incoming SNMP traps from Infortrend devices. This can be achieved by the following steps:

  1. Request a current version of the Infortrend SNMP MIB file from Infortrend support. In this example it's IFT_MIB_v1.40A02.mib. Transfer the file IFT_MIB_v1.40A02.mib to the Nagios server.

  2. Convert the SNMP MIB definitions in IFT_MIB_v1.40A02.mib into a format that SNMPTT can understand.

    $ /opt/snmptt/snmpttconvertmib --in=MIB/IFT_MIB_v1.40A02.mib --out=/opt/snmptt/conf/snmptt.conf.infortrend
    ...
    Done
    
    Total translations:        1
    Successful translations:   1
    Failed translations:       0
    
  3. Edit the trap severity according to your requirements, e.g.:

    $ vim /opt/snmptt/conf/snmptt.conf.infortrend
    
    ...
    EVENT iftEventText .1.3.6.1.4.1.1714.2.1.1 "Status Events" Warning
    ...
    
  4. Add the new configuration file to be included in the global SNMPTT configuration and restart the SNMPTT daemon:

    $ vim /opt/snmptt/snmptt.ini
    
    ...
    [TrapFiles]
    snmptt_conf_files = <<END
    ...
    /opt/snmptt/conf/snmptt.conf.infortrend
    ...
    END
    
    $ /etc/init.d/snmptt reload
    
  5. Download the Nagios plugin check_snmp_traps.sh and place it in the plugins directory of your Nagios system, in this example /usr/lib/nagios/plugins/:

    $ mv -i check_snmp_traps.sh /usr/lib/nagios/plugins/
    $ chmod 755 /usr/lib/nagios/plugins/check_snmp_traps.sh
    
  6. Define the following Nagios command to check for SNMP traps in the SNMPTT database. In this example this is done in the file /etc/nagios-plugins/config/check_snmp_traps.cfg:

    # check for snmp traps
    define command {
        command_name    check_snmp_traps
        command_line    $USER1$/check_snmp_traps.sh -H $HOSTNAME$:$HOSTADDRESS$ -u <user> -p <pass> -d <snmptt_db>
    }

    Replace user, pass and snmptt_db with values suitable for your SNMPTT database environment.

  7. Add another service in your Nagios configuration to be checked for each Infortrend device:

    # check snmptraps
    define service {
        use                     generic-service
        hostgroup_name          infortrend
        service_description     Check_SNMP_traps
        check_command           check_snmp_traps
    }

  8. Optional: Define a serviceextinfo to display a folder icon next to the Check_SNMP_traps service check for each Infortrend device. This icon provides a direct link to the SNMPTT web interface with a filter for the selected host:

    define serviceextinfo {
        hostgroup_name          infortrend
        service_description     Check_SNMP_traps
        notes                   SNMP Alerts
        #notes_url               http://<hostname>/nagios3/nagtrap/index.php?hostname=$HOSTNAME$
        #notes_url               http://<hostname>/nagios3/nsti/index.php?perpage=100&hostname=$HOSTNAME$
    }

    Uncomment the notes_url depending on which web interface (nagtrap or nsti) is used. Replace hostname with the FQDN or IP address of the server running the web interface.

  9. Run a configuration check and if successful reload the Nagios process:

    $ /usr/sbin/nagios3 -v /etc/nagios3/nagios.cfg
    $ /etc/init.d/nagios3 reload
    
  10. Optional: If you're running PNP4Nagios v0.6 or later to graph Nagios performance data, you can use PNP4Nagios templates to beautify the graphs. Download the templates check_infortrend_cache.php, check_infortrend_disk.php, check_infortrend_logicaldrive.php and check_infortrend_logicalunit.php and place them in the PNP4Nagios template directory, in this example /usr/share/pnp4nagios/html/templates/:

    $ mv -i check_infortrend_*.php /usr/share/pnp4nagios/html/templates/
    $ chmod 644 /usr/share/pnp4nagios/html/templates/check_infortrend_*.php
    

    The following images show examples of what the PNP4Nagios graphs look like for an Infortrend EonStor unit:

    PNP4Nagios graph for the cache usage in an Infortrend EonStor unit
    PNP4Nagios graph for the physical disk throughput in an Infortrend EonStor unit
    PNP4Nagios graphs for the logical disk throughput usage in an Infortrend EonStor unit
    PNP4Nagios graph for a temperature sensor in an Infortrend EonStor unit
    PNP4Nagios graph for a voltage sensor in an Infortrend EonStor unit

All done, you should now have a complete Nagios-based monitoring solution for your Infortrend EonStor systems.
