====== Cacti Monitoring Templates for TMS RamSan-630 and RamSan-810 ======

This is an update to the previous post about [[/blog/2012/06/10/cacti_tms_ramsan630|Cacti Monitoring Templates and Nagios Plugin for TMS RamSan-630]]. With the two new RamSan-810 systems we got and the new firmware releases available for our existing RamSan-630, an update to the previously introduced Cacti templates and Nagios plugins seemed to be in order. The good news is that the new Cacti templates can still be used with older firmware versions; the graphs that depend on newer performance counters will simply remain empty. I suspect they'll work for all 6x0, 7x0 and 8x0 models.

More good news: the RamSan-630 and the RamSan-810 have basically the same SNMP MIB:

  * {{:2013:01:27:TMS_RamSAN-630_v5.4.2.mib|TMS RamSan-630 MIB (Firmware v5.4.2)}}
  * {{:2013:01:27:TMS_RamSAN-630_v5.4.8.mib|TMS RamSan-630 MIB (Firmware v5.4.8)}}
  * {{:2013:01:27:TMS_RamSAN-810_v5.5.2.mib|TMS RamSan-810 MIB (Firmware v5.5.2)}}

There are just some nomenclature differences with regard to the product name, so the same Cacti templates can be used for either RamSan-630 or RamSan-810 systems. For historic reasons the string "TMS RamSan-630" still appears in several template names.

As the release notes for current firmware versions mention, several new SNMP counters have been added:

  * **Release 5.4.6 - May 17, 2012**: [N 23014] SNMP MIB now includes a new table for flashcard information.
  * **Release 5.4.5 - May 2, 2012**: [N 23014] SNMP MIB now includes interface stats for transfer latency and DMA command sizes.
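To see which objects a newer firmware adds, the object names defined in two MIB files can be compared directly instead of reading a line-by-line diff. The following is only a minimal sketch: the helper names ''object_names'' and ''added_objects'' are made up for illustration, the regular expression assumes each definition starts on its own line in the form ''name OBJECT-TYPE'', and the two sample strings are hand-written stubs, not excerpts from the real MIBs.

```python
import re

# Matches "<descriptor> OBJECT-TYPE" at the start of a line. SMI object
# descriptors begin with a lowercase letter, which also keeps the
# IMPORTS clause ("... , OBJECT-TYPE, ...") from matching.
OBJECT_TYPE_RE = re.compile(
    r"^[ \t]*([a-z][A-Za-z0-9-]*)[ \t]+OBJECT-TYPE\b", re.MULTILINE
)


def object_names(mib_text):
    """Return the OBJECT-TYPE descriptors of a MIB in order of appearance."""
    return OBJECT_TYPE_RE.findall(mib_text)


def added_objects(old_mib_text, new_mib_text):
    """Return descriptors present in the new MIB but not in the old one."""
    old = set(object_names(old_mib_text))
    return [name for name in object_names(new_mib_text) if name not in old]


# Hand-written stubs for illustration only (NOT real MIB content).
OLD_SAMPLE = """
fcRXFrames OBJECT-TYPE
    -- body omitted
fcCacheHit OBJECT-TYPE
    -- body omitted
"""

NEW_SAMPLE = """
fcRXFrames OBJECT-TYPE
    -- body omitted
fcCacheHit OBJECT-TYPE
    -- body omitted
fcReadAvgLatency OBJECT-TYPE
    -- body omitted
flashHealthPercent OBJECT-TYPE
    -- body omitted
"""

print(added_objects(OLD_SAMPLE, NEW_SAMPLE))
```

For the real files, the text of each MIB could be read with ''pathlib.Path(...).read_text()'' and passed to ''added_objects''. Comparing names rather than raw lines ignores changes in descriptions or whitespace, which is usually what is wanted when looking for new counters.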
A diff of the two RamSan-630 MIBs mentioned above shows the new SNMP counters:

  * ''fcReadAvgLatency'', ''fcWriteAvgLatency'', ''fcReadMaxLatency'', ''fcWriteMaxLatency''
  * ''fcReadSampleLow'', ''fcReadSampleMed'', ''fcReadSampleHigh''
  * ''fcWriteSampleLow'', ''fcWriteSampleMed'', ''fcWriteSampleHigh''
  * ''fcscsi4k'', ''fcscsi8k'', ''fcscsi16k'', ''fcscsi32k'', ''fcscsi64k'', ''fcscsi128k'', ''fcscsi256k''
  * ''fcRMWCount''
  * ''flashTableIndex'', ''flashObject'', ''flashTableState'', ''flashHealthState'', ''flashHealthPercent'', ''flashSizeMiB''

With a little bit of reading through the MIB and comparing the new SNMP counters to the corresponding performance counters in the RamSan web interface, the following metrics were added to the Cacti templates:

  * FC port average and maximum read and write latency, measured in microseconds.

Example RamSan-630 Average and Maximum Read/Write Latency on Port fc-1a: {{:2013:01:27:tms_ramsan630_latency.png|Example RamSan-630 Average and Maximum Read/Write Latency on Port fc-1a}}

Example RamSan-810 Average and Maximum Read/Write Latency on Port fc-1a: {{:2013:01:27:tms_ramsan810_latency.png|Example RamSan-810 Average and Maximum Read/Write Latency on Port fc-1a}}

  * FC port SCSI command count, grouped by SCSI command size.

Example RamSan-630 SCSI Command Count on Port fc-1a: {{:2013:01:27:tms_ramsan630_scsi_cmd_count.png|Example RamSan-630 SCSI Command Count on Port fc-1a}}

Example RamSan-810 SCSI Command Count on Port fc-1a: {{:2013:01:27:tms_ramsan810_scsi_cmd_count.png|Example RamSan-810 SCSI Command Count on Port fc-1a}}

  * FC port SCSI command latency, grouped by latency class (low, medium, high).

Example RamSan-630 SCSI Command Latency on Port fc-1a: {{:2013:01:27:tms_ramsan630_scsi_cmd_latency.png|Example RamSan-630 SCSI Command Latency on Port fc-1a}}

Example RamSan-810 SCSI Command Latency on Port fc-1a: {{:2013:01:27:tms_ramsan810_scsi_cmd_latency.png|Example RamSan-810 SCSI Command Latency on Port fc-1a}}

  * FC port read-modify-write command count (although the counter seems to remain at the maximum value for a 32-bit signed integer all the time).
  * Flashcard health percentage (good vs. failed flash cells).

Example Health Status of Flashcard flashcard-1: {{:2013:01:27:tms_ramsan630_flashcard_health.png|Example Health Status of Flashcard flashcard-1}}

  * Flashcard size.

There still seem to be some issues with both the existing and the new SNMP counters. For example, the ''fcCacheHit'', ''fcCacheMiss'' and ''fcCacheLookup'' counters always remain at zero. The ''fcRXFrames'' counter always stays at the same value (2147483647), which is the maximum for a 32-bit signed integer and could suggest a counter overflow. The ''fcWriteSample*'' counters also seem to remain at zero, even though the corresponding performance counters in the RamSan web interface show steady growth.

Since there are still some performance counters left that are only accessible via the web interface, there's still some room for improvement. I hope that with the acquisition by IBM we'll see some more interesting changes in the future.

The Nagios plugins and the updated Cacti templates can be downloaded here: {{:2013:01:27:tms_ramsan-630_cacti_nagios.tar.bz2|Nagios Plugin and Cacti Templates}}.
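Counters that behave like the problem cases described above (always zero, or pinned at 2147483647) can be flagged automatically by comparing two successive polls. The following is a rough sketch under a few assumptions: the function name ''flag_suspicious'' is made up, the polled values are assumed to be already available as plain dictionaries (how they are fetched via SNMP is out of scope here), and the latency numbers in the sample data are invented. Only the value 2147483647 for ''fcRXFrames'' and the zero value for ''fcCacheHit'' are taken from the observations above.

```python
INT32_MAX = 2**31 - 1  # 2147483647, the largest 32-bit signed integer


def flag_suspicious(previous, current):
    """Compare two polls (dicts of counter name -> value) and return a dict
    of counter name -> reason for counters that look stuck."""
    flags = {}
    for name, value in current.items():
        if name not in previous:
            continue  # no earlier sample to compare against
        if value == INT32_MAX and previous[name] == INT32_MAX:
            flags[name] = "pinned at INT32_MAX"
        elif value == 0 and previous[name] == 0:
            flags[name] = "stuck at zero"
    return flags


# Sample polls; the latency values are invented for illustration.
previous_poll = {"fcCacheHit": 0, "fcRXFrames": 2147483647, "fcReadAvgLatency": 180}
current_poll = {"fcCacheHit": 0, "fcRXFrames": 2147483647, "fcReadAvgLatency": 210}

for name, reason in flag_suspicious(previous_poll, current_poll).items():
    print(f"{name}: {reason}")
```

A counter that is zero in two polls may of course simply be idle, so a flag from such a check is a hint to compare against the web interface, not proof of a firmware bug.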