Differences
This shows you the differences between two versions of the page.
2012:10:02:nagios_monitoring_emc_clariion [2012/10/02 11:58] – created Frank Fegert | 2012:10:02:nagios_monitoring_emc_clariion [2013/11/21 17:13] (current) – Frank Fegert | ||
---|---|---|---|
Line 3: | Line 3: | ||
Some time ago I wrote a - rather crude - Nagios plugin to monitor EMC Clariion storage arrays, specifically the CX4-120 model. The plugin isn't very pretty, but it'll do in a pinch ;-) In order to run it, you need to have the command line tools '' | Some time ago I wrote a - rather crude - Nagios plugin to monitor EMC Clariion storage arrays, specifically the CX4-120 model. The plugin isn't very pretty, but it'll do in a pinch ;-) In order to run it, you need to have the command line tools '' | ||
- | Since the Nagios server in my setup runs on Debian/PPC on an IBM Power LPAR and there is no native Linux/PPC version of '' | + | Since the Nagios server in my setup runs on Debian/PPC on an IBM Power LPAR and there is no native Linux/PPC version of '' |
The whole setup looks like this: | The whole setup looks like this: | ||
Line 24: | Line 24: | ||
</ | </ | ||
Verify the port **UDP/161** on the Clariion device can be reached from the Nagios system. < | Verify the port **UDP/161** on the Clariion device can be reached from the Nagios system. < | ||
- | - **Optional**: | + | - **Optional**: |
< | < | ||
-> Monitors tab | -> Monitors tab | ||
Line 34: | Line 34: | ||
-> <Select the events you're interested in> | -> <Select the events you're interested in> | ||
-> SNMP tab | -> SNMP tab | ||
- | -> <Enter IP of the Nagios system and the SNMPDs community string | + | -> <Enter IP of the Nagios system and the SNMPDs community string> |
-> Apply or OK | -> Apply or OK | ||
-> SP A | -> SP A | ||
Line 48: | Line 48: | ||
</ | </ | ||
Verify the port **UDP/162** on the Nagios system can be reached from the Clariion devices. < | Verify the port **UDP/162** on the Nagios system can be reached from the Clariion devices. < | ||
- | - Install '' | + | - Install '' |
- Download the {{: | - Download the {{: | ||
- | <code> | + | <cli> |
- | mv -i check_cx.sh / | + | $ mv -i check_cx.sh / |
- | chmod 755 / | + | $ chmod 755 / |
- | </code> < | + | </cli> < |
- Adjust the plugin settings according to your environment. Edit the following variable assignments: | - Adjust the plugin settings according to your environment. Edit the following variable assignments: | ||
< | < | ||
Line 96: | Line 96: | ||
- Define a group of services in your Nagios configuration to be checked for each Clariion device: | - Define a group of services in your Nagios configuration to be checked for each Clariion device: | ||
< | < | ||
- | # check snmptraps | ||
- | define service { | ||
- | use | ||
- | hostgroup_name | ||
- | service_description | ||
- | check_command | ||
- | } | ||
# check snmpd | # check snmpd | ||
define service { | define service { | ||
Line 145: | Line 138: | ||
check_command | check_command | ||
} | } | ||
- | </ | + | </ |
+ | Replace '' | ||
- Define a service dependency to run the check '' | - Define a service dependency to run the check '' | ||
< | < | ||
Line 173: | Line 167: | ||
parents | parents | ||
} | } | ||
- | </ | + | </ |
+ | Replace '' | ||
- Define a hostgroup in your Nagios configuration for all Clariion devices. In this example it is named '' | - Define a hostgroup in your Nagios configuration for all Clariion devices. In this example it is named '' | ||
< | < | ||
Line 183: | Line 178: | ||
</ | </ | ||
- Run a configuration check and, if successful, reload the Nagios process: | - Run a configuration check and, if successful, reload the Nagios process: | ||
+ | <cli> | ||
+ | $ / | ||
+ | $ / | ||
+ | </ | ||
+ | |||
+ | The new hosts and services should soon show up in the Nagios web interface. | ||
+ | |||
+ | If optional step 2 in the list above was completed, SNMPTT must also be configured to understand the incoming SNMP traps from the Clariion devices. The following steps achieve this: | ||
+ | - Convert the EMC Clariion SNMP MIB definitions in '' | ||
+ | <cli> | ||
+ | $ / | ||
+ | |||
+ | ... | ||
+ | Done | ||
+ | |||
+ | Total translations: | ||
+ | Successful translations: | ||
+ | Failed translations: | ||
+ | </ | ||
+ | - Edit the trap severity according to your requirements, | ||
+ | <cli> | ||
+ | $ vim / | ||
+ | |||
+ | ... | ||
+ | EVENT EventMonitorTrapWarn .1.3.6.1.4.1.1981.0.4 " | ||
+ | ... | ||
+ | EVENT EventMonitorTrapFault .1.3.6.1.4.1.1981.0.6 " | ||
+ | ... | ||
+ | </ | ||
+ | - **Optional**: | ||
+ | <code diff snmptt.conf.emc-clariion> | ||
+ | diff -u snmptt.conf.emc-clariion_1 snmptt.conf.emc-clariion | ||
+ | --- snmptt.conf.emc-clariion.orig | ||
+ | +++ snmptt.conf.emc-clariion | ||
+ | @@ -54,8 +54,31 @@ | ||
+ | # | ||
+ | # | ||
+ | # | ||
+ | +EVENT EventMonitorTrapError .1.3.6.1.4.1.1981.0.5 " | ||
+ | +FORMAT An Error EventMonitorTrap is generated in $* | ||
+ | +MATCH MODE=and | ||
+ | +MATCH $*: !(( Power [AB] : Faulted|Disk Array Enclosure .Bus [0-9] Enclosure [0-9]. is faulted)) | ||
+ | +MATCH $X: !(0(2: | ||
+ | +SDESC | ||
+ | +An Error EventMonitorTrap is generated in | ||
+ | +response to a user-specified event. | ||
+ | +Details can be found in Variables data. | ||
+ | +Variables: | ||
+ | + 1: hostName | ||
+ | + 2: deviceID | ||
+ | + 3: eventID | ||
+ | + 4: eventText | ||
+ | + 5: storageSystem | ||
+ | +EDESC | ||
+ | +# | ||
+ | +# Filter and ignore the following events | ||
+ | +# 02:50 - 03:15 Navisphere Power Supply Checks | ||
+ | +# | ||
+ | EVENT EventMonitorTrapError .1.3.6.1.4.1.1981.0.5 " | ||
+ | | ||
+ | +MATCH MODE=and | ||
+ | +MATCH $*: (( Power [AB] : Faulted|Disk Array Enclosure .Bus [0-9] Enclosure [0-9]. is faulted)) | ||
+ | +MATCH $X: (0(2: | ||
+ | SDESC | ||
+ | An Error EventMonitorTrap is generated in | ||
+ | | ||
+ | </ | ||
+ | |||
+ | The reason for this is that the Clariion performs a power supply check every Friday around 3:00 am. This triggers an SNMP trap to be sent, even if the power supplies check out fine. In my opinion this behaviour is defective, but a case opened on this issue showed that EMC tends to think otherwise. Since there was very little hope for EMC to come to at least some sense, I just applied the above patch to the SNMPTT configuration file. It basically lowers the severity for all " | ||
+ | - Add the new configuration file to be included in the global SNMPTT configuration and restart the SNMPTT daemon: | ||
+ | <cli> | ||
+ | $ vim / | ||
+ | |||
+ | ... | ||
+ | [TrapFiles] | ||
+ | snmptt_conf_files = <<END | ||
+ | ... | ||
+ | / | ||
+ | ... | ||
+ | END | ||
+ | |||
+ | $ / | ||
+ | </ | ||
+ | - Download the {{: | ||
+ | <cli> | ||
+ | $ mv -i check_snmp_traps.sh / | ||
+ | $ chmod 755 / | ||
+ | </ | ||
+ | - Define the following Nagios command to check for SNMP traps in the SNMPTT database. In this example this is done in the file ''/ | ||
< | < | ||
- | / | + | # check for snmp traps |
- | /etc/init.d/nagios3 reload | + | define command{ |
+ | command_name | ||
+ | command_line | ||
+ | } | ||
+ | </code> | ||
+ | Replace '' | ||
+ | - Add another service in your Nagios configuration to be checked for each Clariion device: | ||
+ | < | ||
+ | # check snmptraps | ||
+ | define service { | ||
+ | use | ||
+ | hostgroup_name | ||
+ | service_description | ||
+ | check_command | ||
+ | } | ||
</ | </ | ||
+ | - **Optional**: | ||
+ | < | ||
+ | define serviceextinfo { | ||
+ | hostgroup_name | ||
+ | service_description | ||
+ | notes SNMP Alerts | ||
+ | # | ||
+ | # | ||
+ | } | ||
+ | </ | ||
+ | Uncomment the '' | ||
+ | - Run a configuration check and, if successful, reload the Nagios process: | ||
+ | <cli> | ||
+ | $ / | ||
+ | $ / | ||
+ | </ | ||
- | The new hosts and services | + | All done, you should |
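
The time-window filter from the optional SNMPTT patch can be illustrated with a small shell sketch. SNMPTT performs this matching itself through the ''MATCH'' directives in its configuration file; the function below is not part of SNMPTT or of either plugin. The name ''in_psu_check_window'' is hypothetical, and the window boundaries (02:50 to 03:15, inclusive) are taken from the comment in the patch above. It only shows the underlying logic: a trap time stamp inside the window is a candidate for being treated as the routine power supply check.

```shell
#!/bin/sh
# Hypothetical helper for illustration only; not part of SNMPTT or the plugins.
# Returns 0 (true) if the given HH:MM time lies within 02:50 - 03:15 inclusive,
# the window named in the patch comment for the power supply checks.
in_psu_check_window() {
    hh=${1%%:*}   # part before the colon
    mm=${1##*:}   # part after the colon
    # Strip one leading zero so that values such as "08" are not read as octal.
    hh=${hh#0}
    mm=${mm#0}
    # Convert to minutes since midnight; empty values (from "0" or "00") count as 0.
    minutes=$(( ${hh:-0} * 60 + ${mm:-0} ))
    # 02:50 is minute 170, 03:15 is minute 195.
    [ "$minutes" -ge 170 ] && [ "$minutes" -le 195 ]
}
```

A call such as ''in_psu_check_window 03:00 && echo "inside window"'' would print the message, while a time such as ''14:55'' would not. The actual patch additionally matches on the trap text (power supply and enclosure fault messages), which this sketch leaves out.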