2014-07-22 // Nagios Monitoring - Infortrend EonStor (Update)
After gaining some operational experience with the previously introduced Nagios plugin for Infortrend EonStor and EonStor DS storage arrays, some issues became apparent which needed to be addressed in an update. The updated version of the Nagios plugin check_infortrend.sh fixes the following issues:
Newer array firmware versions use different severity strings for event log entries. The Nagios plugin has been adapted to be able to cope with these.
Due to a limitation in the array firmware, event log entries available via SNMP are not cleared when the corresponding event log entries are cleared via RAIDWatch or SANWatch. They are only cleared during a controller reboot. A support case with Infortrend confirmed this and it is currently under investigation whether this feature could be added for future releases of the array firmware.
In the meantime a workaround for this behaviour has been added to the Nagios plugin. To avoid constant alarms about events that have already been dealt with, the date of the event is now taken into consideration along with its severity. By default only events from the past 60 minutes are evaluated. This time window can be changed with the new “-t <minutes>” command line option of the Nagios plugin, e.g. “-t 120” evaluates events from the past 2 hours. You might need to adjust the definition of the check_infortrend_events command to suit your environment, e.g.:

    # check Infortrend ESDS event status
    define command {
        command_name    check_infortrend_events
        command_line    $USER1$/check_infortrend.sh -H $HOSTNAME$ -C events -t <minutes>
    }
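To illustrate the idea behind the time window, here is a minimal shell sketch of such a recency test. It is not the code from check_infortrend.sh; the function name event_is_recent, the use of epoch seconds and the sample timestamps are assumptions made for this example only.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if an event lies within the time window.
# $1 = event time, $2 = window in minutes, $3 = current time (optional).
# Both times are given in seconds since the epoch.
event_is_recent() {
    event_epoch=$1
    window_minutes=$2
    now_epoch=${3:-$(date +%s)}
    age_seconds=$(( now_epoch - event_epoch ))
    # Events dated in the future (clock skew) count as recent here.
    [ "$age_seconds" -le $(( window_minutes * 60 )) ]
}

# Example with fixed, made-up timestamps: an event that is 90 minutes old.
now=1406030400
event=$(( now - 90 * 60 ))
for minutes in 60 120; do
    if event_is_recent "$event" "$minutes" "$now"; then
        echo "-t $minutes: event is evaluated"
    else
        echo "-t $minutes: event is ignored"
    fi
done
```

With the default of 60 minutes the sample event is ignored, with “-t 120” it is evaluated.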
There are several pitfalls here though:
Make sure the timeframe aligns with your Nagios configuration options max_check_attempts, normal_check_interval and retry_check_interval to actually allow non-normal events to trigger an alarm. Generally speaking, the cumulative timeframe of the three Nagios configuration options of your Check_IFT_Events service check must be smaller than the timeframe given in the definition of the check_infortrend_events command.
Make sure you set the date and time on the array to the correct values or preferably use an SNTP server. Also, make sure the Nagios server and the array use the same timezone, preferably UTC in both cases.
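As a rough worked example for the first pitfall, the following shell sketch adds up the three options and compares the result with the “-t” window. The formula is only one reading of how the options accumulate (one regular check interval, then the retries needed to reach a hard state) and assumes the Nagios default interval_length of 60 seconds, so that all values are minutes. The function names and the sample values are made up for this sketch.

```shell
#!/bin/sh
# Assumed worst case in minutes between an event being logged and Nagios
# reaching a hard state: one regular interval until the first failing
# check, followed by (max_check_attempts - 1) retries.
# $1 = normal_check_interval, $2 = retry_check_interval, $3 = max_check_attempts
worst_case_minutes() {
    echo $(( $1 + ($3 - 1) * $2 ))
}

# Succeeds if the -t window ($4, in minutes) is larger than the worst case.
window_is_sufficient() {
    [ "$(worst_case_minutes "$1" "$2" "$3")" -lt "$4" ]
}

# Two made-up service settings, checked against the default of 60 minutes.
for settings in "5 1 3" "30 15 4"; do
    set -- $settings
    total=$(worst_case_minutes "$1" "$2" "$3")
    if window_is_sufficient "$1" "$2" "$3" 60; then
        echo "interval=$1 retry=$2 attempts=$3: $total minutes, fits into -t 60"
    else
        echo "interval=$1 retry=$2 attempts=$3: $total minutes, too long for -t 60"
    fi
done
```

Under these assumptions the first setting needs 7 minutes and is fine with the default, while the second one needs 75 minutes and would require a larger “-t” value.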
Not exactly an issue with the Nagios plugin per se, but related to the monitoring of Infortrend storage arrays in general, is an adapted SNMPTT configuration. Since all SNMP traps generated from events of Infortrend arrays are indiscriminately reported with the same OID, a bit more extensive parsing of the SNMP traps needs to occur. In order to achieve this, the following entries should be used in the SNMPTT configuration file /opt/snmptt/conf/snmptt.conf.infortrend:

    EVENT iftEventText .1.3.6.1.4.1.1714.0.1 "Status Events" Critical
    FORMAT The description of the event $*
    MATCH $*: ( \[Alert Condition\])
    MATCH $*: ( \[Critical\])
    MATCH $*: ( \[Critical Error\])
    MATCH $*: ( \[Error\])
    SDESC
    The description of the event
    Variables:
    EDESC

    EVENT iftEventText .1.3.6.1.4.1.1714.0.1 "Status Events" Warning
    FORMAT The description of the event $*
    MATCH $*: ( \[Warning Condition\])
    MATCH $*: ( \[Warning\])
    SDESC
    The description of the event
    Variables:
    EDESC

    EVENT iftEventText .1.3.6.1.4.1.1714.0.1 "Status Events" Normal
    FORMAT The description of the event $*
    MATCH $*: ( \[Information\])
    MATCH $*: ( \[Notification\])
    SDESC
    The description of the event
    Variables:
    EDESC

Afterwards the SNMPTT daemon should be restarted.
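The effect of these MATCH rules can be pictured with a small shell sketch that maps a trap text to a severity based on the bracketed severity string. This is a simplification and not how SNMPTT works internally: the function name trap_severity, the fallback value "Unknown" and the sample trap texts are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the severity that the configuration above
# would assign to a trap text, judged only by the bracketed string.
trap_severity() {
    case $1 in
        *"[Alert Condition]"*|*"[Critical]"*|*"[Critical Error]"*|*"[Error]"*)
            echo "Critical" ;;
        *"[Warning Condition]"*|*"[Warning]"*)
            echo "Warning" ;;
        *"[Information]"*|*"[Notification]"*)
            echo "Normal" ;;
        *)
            # No rule matched; "Unknown" is an assumption of this sketch.
            echo "Unknown" ;;
    esac
}

# Made-up trap texts, not taken from a real array.
trap_severity "sample text: [Critical Error] made-up message"
trap_severity "sample text: [Warning] made-up message"
trap_severity "sample text: [Notification] made-up message"
```

If a firmware release introduces yet another severity string, it would fall through to the last branch here, which corresponds to adding another MATCH line to the appropriate EVENT entry.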
If you happen to find any additional issues with monitoring your Infortrend systems with this Nagios plugin, please feel free to leave a comment or drop me a note via email.
2014-01-12 // Nagios Monitoring - Infortrend EonStor
We use several Infortrend EonStor and EonStor DS storage arrays as low-cost, bulk storage units in our datacenters. With check_infortrend.pl, check_infortrend and check_ift_{dev|hdd|ld}.pl there are already several Nagios plugins to monitor Infortrend EonStor storage arrays. Since I wanted a low-overhead, shell-based plugin with support for performance data, I decided to write my own version, check_infortrend.sh. In order to run the Nagios plugin, you need to have SNMP activated on the Infortrend storage array, and a network connection from the Nagios system to the Infortrend device on port UDP/161 must be allowed.
The whole setup looks like this:
Enable SNMP queries on the Infortrend storage array. Login via Telnet or SSH and navigate to:
-> view and edit Configuration parameters -> Communication Parameters -> Network Protocol Support -> SNMP - Disabled -> Enable SNMP Protocol? -> Yes
Verify the port UDP/161 on the Infortrend device can be reached from the Nagios system.
Optional: Enable SNMP traps to be sent to the Nagios system on the Infortrend storage array. This requires SNMPD and SNMPTT to be already set up on the Nagios system. Verify the port UDP/162 on the Nagios system can be reached from the Infortrend device.
Download the Nagios plugin check_infortrend.sh and place it in the plugins directory of your Nagios system, in this example /usr/lib/nagios/plugins/:

    $ mv -i check_infortrend.sh /usr/lib/nagios/plugins/
    $ chmod 755 /usr/lib/nagios/plugins/check_infortrend.sh
Define the following Nagios commands. In this example this is done in the file /etc/nagios-plugins/config/check_infortrend.cfg:

    # check Infortrend ESDS cache status
    define command {
        command_name    check_infortrend_cache
        command_line    $USER1$/check_infortrend.sh -H $HOSTNAME$ -C cache
    }

    # check Infortrend ESDS controller status
    define command {
        command_name    check_infortrend_controller
        command_line    $USER1$/check_infortrend.sh -H $HOSTNAME$ -C controller
    }

    # check Infortrend ESDS disk status
    define command {
        command_name    check_infortrend_disk
        command_line    $USER1$/check_infortrend.sh -H $HOSTNAME$ -C disk
    }

    # check Infortrend ESDS logicaldrive status
    define command {
        command_name    check_infortrend_logicaldrive
        command_line    $USER1$/check_infortrend.sh -H $HOSTNAME$ -C logicaldrive
    }

    # check Infortrend ESDS logicalunit status
    define command {
        command_name    check_infortrend_logicalunit
        command_line    $USER1$/check_infortrend.sh -H $HOSTNAME$ -C logicalunit
    }

    # check Infortrend ESDS event status
    define command {
        command_name    check_infortrend_events
        command_line    $USER1$/check_infortrend.sh -H $HOSTNAME$ -C events
    }
Define a group of services in your Nagios configuration to be checked for each Infortrend system:

    # check snmpd
    define service {
        use                     generic-service
        hostgroup_name          infortrend
        service_description     Check_SNMPD
        check_command           check_snmpd
    }

    # check_infortrend_cache
    define service {
        use                     generic-service-pnp
        hostgroup_name          infortrend
        service_description     Check_IFT_Cache
        check_command           check_infortrend_cache
    }

    # check_infortrend_controller
    define service {
        use                     generic-service
        hostgroup_name          infortrend
        service_description     Check_IFT_Controller
        check_command           check_infortrend_controller
    }

    # check_infortrend_disk
    define service {
        use                     generic-service-pnp
        hostgroup_name          infortrend
        service_description     Check_IFT_Disk
        check_command           check_infortrend_disk
    }

    # check_infortrend_logicaldrive
    define service {
        use                     generic-service-pnp
        hostgroup_name          infortrend
        service_description     Check_IFT_LogicalDrive
        check_command           check_infortrend_logicaldrive
    }

    # check_infortrend_logicalunit
    define service {
        use                     generic-service-pnp
        hostgroup_name          infortrend
        service_description     Check_IFT_LogicalUnit
        check_command           check_infortrend_logicalunit
    }

    # check_infortrend_events
    define service {
        use                     generic-service
        hostgroup_name          infortrend
        service_description     Check_IFT_Events
        check_command           check_infortrend_events
    }
Replace generic-service with your Nagios service template. Replace generic-service-pnp with your Nagios service template that has performance data processing enabled.
Define a service dependency to run the above checks only if the Check_SNMPD check was run successfully:

    # Infortrend SNMPD dependencies
    define servicedependency {
        hostgroup_name                  infortrend
        service_description             Check_SNMPD
        dependent_service_description   Check_IFT_.*
        execution_failure_criteria      c,p,u,w
        notification_failure_criteria   c,p,u,w
    }
Define hosts in your Nagios configuration for each Infortrend device. In this example it is named esds1:

    define host {
        use         disk
        host_name   esds1
        alias       Infortrend Disk Storage 1
        address     10.0.0.1
        parents     parent_lan
    }
Replace disk with your Nagios host template for storage devices. Adjust the address and parents parameters according to your environment.
Define a hostgroup in your Nagios configuration for all Infortrend devices. In this example it is named infortrend. The above checks are run against each member of the hostgroup:

    define hostgroup {
        hostgroup_name  infortrend
        alias           Infortrend Disk Storages
        members         esds1
    }
Run a configuration check and if successful reload the Nagios process:

    $ /usr/sbin/nagios3 -v /etc/nagios3/nagios.cfg
    $ /etc/init.d/nagios3 reload
The new hosts and services should soon show up in the Nagios web interface.
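If the services do not show up as expected, it can help to run the plugin by hand with the same options as in the command definitions. The following shell sketch loops over all check types for one host. The helper names state_name and run_all_checks are made up for this example, and the mapping of exit codes to states follows the general Nagios plugin convention rather than anything specific to check_infortrend.sh.

```shell
#!/bin/sh
# Map a plugin exit code to the conventional Nagios state name.
state_name() {
    case $1 in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Hypothetical helper: run every check type from the command definitions
# against one host and print one line of state and plugin output per check.
# $1 = path to the plugin, $2 = host name or address
run_all_checks() {
    for check in cache controller disk logicaldrive logicalunit events; do
        output=$("$1" -H "$2" -C "$check" 2>&1)
        rc=$?
        echo "$check: $(state_name "$rc") - $output"
    done
}

# Usage, with the paths and the host name from this example setup:
#   run_all_checks /usr/lib/nagios/plugins/check_infortrend.sh esds1
```

Run this as the user the Nagios process runs as, so that permission problems show up in the same way as they would for Nagios itself.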
If the optional step number 2 in the above list was done, SNMPTT also needs to be configured to be able to understand the incoming SNMP traps from Infortrend devices. This can be achieved by the following steps:
Request a current version of the Infortrend SNMP MIB file from Infortrend support. In this example it's IFT_MIB_v1.40A02.mib. Transfer the file IFT_MIB_v1.40A02.mib to the Nagios server.
Convert the SNMP MIB definitions in IFT_MIB_v1.40A02.mib into a format that SNMPTT can understand:

    $ /opt/snmptt/snmpttconvertmib --in=MIB/IFT_MIB_v1.40A02.mib --out=/opt/snmptt/conf/snmptt.conf.infortrend
    ...
    Done
    Total translations: 1
    Successful translations: 1
    Failed translations: 0
Edit the trap severity according to your requirements, e.g.:

    $ vim /opt/snmptt/conf/snmptt.conf.infortrend
    ...
    EVENT iftEventText .1.3.6.1.4.1.1714.2.1.1 "Status Events" Warning
    ...
Add the new configuration file to be included in the global SNMPTT configuration and restart the SNMPTT daemon:

    $ vim /opt/snmptt/snmptt.ini
    ...
    [TrapFiles]
    snmptt_conf_files = <<END
    ...
    /etc/snmptt/conf.d/snmptt.conf.infortrend
    ...
    END

    $ /etc/init.d/snmptt reload
Download the Nagios plugin check_snmp_traps.sh and place it in the plugins directory of your Nagios system, in this example /usr/lib/nagios/plugins/:

    $ mv -i check_snmp_traps.sh /usr/lib/nagios/plugins/
    $ chmod 755 /usr/lib/nagios/plugins/check_snmp_traps.sh
Define the following Nagios command to check for SNMP traps in the SNMPTT database. In this example this is done in the file /etc/nagios-plugins/config/check_snmp_traps.cfg:

    # check for snmp traps
    define command {
        command_name    check_snmp_traps
        command_line    $USER1$/check_snmp_traps.sh -H $HOSTNAME$:$HOSTADDRESS$ -u <user> -p <pass> -d <snmptt_db>
    }
Replace user, pass and snmptt_db with values suitable for your SNMPTT database environment.
Add another service in your Nagios configuration to be checked for each Infortrend device:

    # check snmptraps
    define service {
        use                     generic-service
        hostgroup_name          infortrend
        service_description     Check_SNMP_traps
        check_command           check_snmp_traps
    }
Optional: Define a serviceextinfo to display a folder icon next to the Check_SNMP_traps service check for each Infortrend device. This icon provides a direct link to the SNMPTT web interface with a filter for the selected host:

    define serviceextinfo {
        hostgroup_name          infortrend
        service_description     Check_SNMP_traps
        notes                   SNMP Alerts
        #notes_url              http://<hostname>/nagios3/nagtrap/index.php?hostname=$HOSTNAME$
        #notes_url              http://<hostname>/nagios3/nsti/index.php?perpage=100&hostname=$HOSTNAME$
    }
Uncomment the notes_url depending on which web interface (nagtrap or nsti) is used. Replace hostname with the FQDN or IP address of the server running the web interface.
Run a configuration check and if successful reload the Nagios process:

    $ /usr/sbin/nagios3 -v /etc/nagios3/nagios.cfg
    $ /etc/init.d/nagios3 reload
Optional: If you're running PNP4Nagios v0.6 or later to graph Nagios performance data, you can use the check_infortrend_cache.php, check_infortrend_disk.php, check_infortrend_logicaldrive.php and check_infortrend_logicalunit.php PNP4Nagios templates to beautify the graphs. Download the four templates and place them in the PNP4Nagios template directory, in this example /usr/share/pnp4nagios/html/templates/:

    $ mv -i check_infortrend_*.php /usr/share/pnp4nagios/html/templates/
    $ chmod 644 /usr/share/pnp4nagios/html/templates/check_infortrend_*.php
The following image shows an example of what the PNP4Nagios graphs look like for an Infortrend EonStor unit:
All done, you should now have a complete Nagios-based monitoring solution for your Infortrend EonStor systems.