bityard Blog

// Check_MK Monitoring - Open-iSCSI

The Open-iSCSI project provides a high-performance, transport independent, implementation of RFC 3720 iSCSI for Linux. It allows remote access to SCSI targets via TCP/IP over several different transport technologies. This article introduces a new Check_MK service check to monitor the status of Open-iSCSI sessions as well as the monitoring of several statistical metrics on Open-iSCSI sessions and iSCSI hardware initiator hosts.

For the impatient and TL;DR here is the Check_MK package of the Open-iSCSI monitoring checks:

Open-iSCSI monitoring checks (Compatible with Check_MK versions 1.2.8 and later)

The sources are to be found in my Check_MK repository on GitHub


The Check_MK service check to monitor Open-iSCSI consists of two major parts, an agent plugin and three check plugins.

The first part, a Check_MK agent plugin named open-iscsi, is a simple Bash shell script. It calls the Open-iSCSI administration tool iscsiadm in order to retrieve a list of currently active iSCSI sessions. The exact call to iscsiadm to retrieve the session list is:

/usr/bin/iscsiadm -m session -P 1

If there are any active iSCSI sessions, the open-iscsi agent plugin also tries to collect several statistics for each iSCSI session. This is done by another call to iscsiadm for each iSCSI Session ${SID}, which is shown in the following example:

/usr/bin/iscsiadm -m session -r ${SID} -s

Unfortunately, the iSCSI session statistics are currently only supported for Open-iSCSI software initiators or dependent hardware iSCSI initiators like the Broadcom BCM577xx or BCM578xx adapters which are covered by the bnx2i kernel module. See Debugging Segfaults in Open-iSCSIs iscsiuio on Intel Broadwell and Backporting Open-iSCSI to Debian 8 "Jessie" for additional information on those dependent hardware iSCSI initiators.

For hardware iSCSI initiators, like the QLogic 4000 and QLogic 8200 Series network adapters and iSCSI HBAs, which provide a full iSCSI offload engine (iSOE) implementation in the adapters firmware, there is currently no support for iSCSI session statistics. Instead, the open-iscsi agent plugin collects several global statistics on each iSOE host ${HST} which is covered by the qla4xxx kernel module with the command shown in the following example:

/usr/bin/iscsiadm -m host -H ${HST} -C stats

The output of the above commands is parsed and reformated by the agent plugin for easier processing in the check plugins. The following example shows the agent plugin output for a system with two BCM578xx dependent hardware iSCSI initiators:

<<<open-iscsi_sessions>>>
bnx2i 10.0.3.4:3260,1 iqn.2001-05.com.equallogic:8-da6616-807572d50-5080000001758a32-<ISCSI-ALIAS> bnx2i.f8:ca:b8:7d:bf:2d eth2 10.0.3.52 LOGGED_IN LOGGED_IN NO_CHANGE
bnx2i 10.0.3.4:3260,1 iqn.2001-05.com.equallogic:8-da6616-807572d50-5080000001758a32-<ISCSI-ALIAS> bnx2i.f8:ca:b8:7d:c2:34 eth3 10.0.3.53 LOGGED_IN LOGGED_IN NO_CHANGE

<<<open-iscsi_session_stats>>>
[session stats f8:ca:b8:7d:bf:2d iqn.2001-05.com.equallogic:8-da6616-807572d50-5080000001758a32-<ISCSI-ALIAS>]
txdata_octets: 40960
rxdata_octets: 461171313
noptx_pdus: 0
scsicmd_pdus: 153967
tmfcmd_pdus: 0
login_pdus: 0
text_pdus: 0
dataout_pdus: 0
logout_pdus: 0
snack_pdus: 0
noprx_pdus: 0
scsirsp_pdus: 153967
tmfrsp_pdus: 0
textrsp_pdus: 0
datain_pdus: 112420
logoutrsp_pdus: 0
r2t_pdus: 0
async_pdus: 0
rjt_pdus: 0
digest_err: 0
timeout_err: 0

[session stats f8:ca:b8:7d:c2:34 iqn.2001-05.com.equallogic:8-da6616-807572d50-5080000001758a32-<ISCSI-ALIAS>]
txdata_octets: 16384
rxdata_octets: 255666052
noptx_pdus: 0
scsicmd_pdus: 84312
tmfcmd_pdus: 0
login_pdus: 0
text_pdus: 0
dataout_pdus: 0
logout_pdus: 0
snack_pdus: 0
noprx_pdus: 0
scsirsp_pdus: 84312
tmfrsp_pdus: 0
textrsp_pdus: 0
datain_pdus: 62418
logoutrsp_pdus: 0
r2t_pdus: 0
async_pdus: 0
rjt_pdus: 0
digest_err: 0
timeout_err: 0

The next example shows the agent plugin output for a system with two QLogic 8200 Series hardware iSCSI initiators:

<<<open-iscsi_sessions>>>
qla4xxx 10.0.3.4:3260,1 iqn.2001-05.com.equallogic:8-da6616-57e572d50-80e0000001458a32-v-sto2-tst-000001 qla4xxx.f8:ca:b8:7d:c1:7d.ipv4.0 none 10.0.3.50 LOGGED_IN Unknown Unknown
qla4xxx 10.0.3.4:3260,1 iqn.2001-05.com.equallogic:8-da6616-57e572d50-80e0000001458a32-v-sto2-tst-000001 qla4xxx.f8:ca:b8:7d:c1:7e.ipv4.0 none 10.0.3.51 LOGGED_IN Unknown Unknown

<<<open-iscsi_host_stats>>>
[host stats f8:ca:b8:7d:c1:7d iqn.2000-04.com.qlogic:isp8214.000e1e3574ac.4]
mactx_frames: 563454
mactx_bytes: 52389948
mactx_multicast_frames: 877513
mactx_broadcast_frames: 0
mactx_pause_frames: 0
mactx_control_frames: 0
mactx_deferral: 0
mactx_excess_deferral: 0
mactx_late_collision: 0
mactx_abort: 0
mactx_single_collision: 0
mactx_multiple_collision: 0
mactx_collision: 0
mactx_frames_dropped: 0
mactx_jumbo_frames: 0
macrx_frames: 1573455
macrx_bytes: 440845678
macrx_unknown_control_frames: 0
macrx_pause_frames: 0
macrx_control_frames: 0
macrx_dribble: 0
macrx_frame_length_error: 0
macrx_jabber: 0
macrx_carrier_sense_error: 0
macrx_frame_discarded: 0
macrx_frames_dropped: 1755017
mac_crc_error: 0
mac_encoding_error: 0
macrx_length_error_large: 0
macrx_length_error_small: 0
macrx_multicast_frames: 0
macrx_broadcast_frames: 0
iptx_packets: 508160
iptx_bytes: 29474232
iptx_fragments: 0
iprx_packets: 401785
iprx_bytes: 354673156
iprx_fragments: 0
ip_datagram_reassembly: 0
ip_invalid_address_error: 0
ip_error_packets: 0
ip_fragrx_overlap: 0
ip_fragrx_outoforder: 0
ip_datagram_reassembly_timeout: 0
ipv6tx_packets: 0
ipv6tx_bytes: 0
ipv6tx_fragments: 0
ipv6rx_packets: 0
ipv6rx_bytes: 0
ipv6rx_fragments: 0
ipv6_datagram_reassembly: 0
ipv6_invalid_address_error: 0
ipv6_error_packets: 0
ipv6_fragrx_overlap: 0
ipv6_fragrx_outoforder: 0
ipv6_datagram_reassembly_timeout: 0
tcptx_segments: 508160
tcptx_bytes: 19310736
tcprx_segments: 401785
tcprx_byte: 346637456
tcp_duplicate_ack_retx: 1
tcp_retx_timer_expired: 1
tcprx_duplicate_ack: 0
tcprx_pure_ackr: 0
tcptx_delayed_ack: 106449
tcptx_pure_ack: 106489
tcprx_segment_error: 0
tcprx_segment_outoforder: 0
tcprx_window_probe: 0
tcprx_window_update: 695915
tcptx_window_probe_persist: 0
ecc_error_correction: 0
iscsi_pdu_tx: 401697
iscsi_data_bytes_tx: 29225
iscsi_pdu_rx: 401697
iscsi_data_bytes_rx: 327355963
iscsi_io_completed: 101
iscsi_unexpected_io_rx: 0
iscsi_format_error: 0
iscsi_hdr_digest_error: 0
iscsi_data_digest_error: 0
iscsi_sequence_error: 0

[host stats f8:ca:b8:7d:c1:7e iqn.2000-04.com.qlogic:isp8214.000e1e3574ad.5]
mactx_frames: 563608
mactx_bytes: 52411412
mactx_multicast_frames: 877517
mactx_broadcast_frames: 0
mactx_pause_frames: 0
mactx_control_frames: 0
mactx_deferral: 0
mactx_excess_deferral: 0
mactx_late_collision: 0
mactx_abort: 0
mactx_single_collision: 0
mactx_multiple_collision: 0
mactx_collision: 0
mactx_frames_dropped: 0
mactx_jumbo_frames: 0
macrx_frames: 1573572
macrx_bytes: 441630442
macrx_unknown_control_frames: 0
macrx_pause_frames: 0
macrx_control_frames: 0
macrx_dribble: 0
macrx_frame_length_error: 0
macrx_jabber: 0
macrx_carrier_sense_error: 0
macrx_frame_discarded: 0
macrx_frames_dropped: 1755017
mac_crc_error: 0
mac_encoding_error: 0
macrx_length_error_large: 0
macrx_length_error_small: 0
macrx_multicast_frames: 0
macrx_broadcast_frames: 0
iptx_packets: 508310
iptx_bytes: 29490504
iptx_fragments: 0
iprx_packets: 401925
iprx_bytes: 355436636
iprx_fragments: 0
ip_datagram_reassembly: 0
ip_invalid_address_error: 0
ip_error_packets: 0
ip_fragrx_overlap: 0
ip_fragrx_outoforder: 0
ip_datagram_reassembly_timeout: 0
ipv6tx_packets: 0
ipv6tx_bytes: 0
ipv6tx_fragments: 0
ipv6rx_packets: 0
ipv6rx_bytes: 0
ipv6rx_fragments: 0
ipv6_datagram_reassembly: 0
ipv6_invalid_address_error: 0
ipv6_error_packets: 0
ipv6_fragrx_overlap: 0
ipv6_fragrx_outoforder: 0
ipv6_datagram_reassembly_timeout: 0
tcptx_segments: 508310
tcptx_bytes: 19323952
tcprx_segments: 401925
tcprx_byte: 347398136
tcp_duplicate_ack_retx: 2
tcp_retx_timer_expired: 4
tcprx_duplicate_ack: 0
tcprx_pure_ackr: 0
tcptx_delayed_ack: 106466
tcptx_pure_ack: 106543
tcprx_segment_error: 0
tcprx_segment_outoforder: 0
tcprx_window_probe: 0
tcprx_window_update: 696035
tcptx_window_probe_persist: 0
ecc_error_correction: 0
iscsi_pdu_tx: 401787
iscsi_data_bytes_tx: 37970
iscsi_pdu_rx: 401791
iscsi_data_bytes_rx: 328112050
iscsi_io_completed: 127
iscsi_unexpected_io_rx: 0
iscsi_format_error: 0
iscsi_hdr_digest_error: 0
iscsi_data_digest_error: 0
iscsi_sequence_error: 0

Although a simple Bash shell script, the agent plugin open-iscsi has several dependencies which need to be installed in order for the agent plugin to work properly. Namely those are the commands iscsiadm, sed, tr and egrep. On Debian based systems, the necessary packages can be installed with the following command:

root@host:~# apt-get install coreutils grep open-iscsi sed

The second part of the Check_MK service check for Open-iSCSI provides the necessary check logic through individual inventory and check functions. This is implemented in the three Check_MK check plugins open-iscsi_sessions, open-iscsi_host_stats and open-iscsi_session_stats, which will be discussed separately in the following sections.

Open-iSCSI Session Status

The check plugin open-iscsi_sessions is responsible for the monitoring of individual iSCSI sessions and their internal session states. Upon inventory this check plugin creates a service check for each pair of iSCSI network interface name and IQN of the iSCSI target volume. Unlike the iSCSI session ID, which changes over time (e.g. after iSCSI logout and login), this pair uniquely identifies a iSCSI session on a host. During normal check execution, the list of currently active iSCSI sessions on a host is compared to the list of active iSCSI sessions gathered during inventory on that host. If a session is missing or if the session has an erroneous internal state, an alarm is raised accordingly.

For all types of initiators – software, dependent hardware and hardware – there is the state session_state which can take on the following values:

ISCSI_STATE_FREE
ISCSI_STATE_LOGGED_IN
ISCSI_STATE_FAILED
ISCSI_STATE_TERMINATE
ISCSI_STATE_IN_RECOVERY
ISCSI_STATE_RECOVERY_FAILED
ISCSI_STATE_LOGGING_OUT

An alarm is raised if the session is in any state other than ISCSI_STATE_LOGGED_IN. For software and dependent hardware initiators there are two additional states – connection_state and internal_state. The state connection_state can take on the values:

FREE
TRANSPORT WAIT
IN LOGIN
LOGGED IN
IN LOGOUT
LOGOUT REQUESTED
CLEANUP WAIT

and internal_state can take on the values:

NO CHANGE
CLEANUP       
REOPEN
REDIRECT

In addition to the above session_state, an alarm is raised if the connection_state is in any other state than LOGGED IN and internal_state is in any other state than NO CHANGE.

No performance data is currently reported by this check.

Open-iSCSI Hosts Statistics

The check plugin open-iscsi_host_stats is responsible for the monitoring of the global statistics on a iSOE host. Upon inventory this check plugin creates a service check for each pair of MAC address and iSCSI network interface name. During normal check execution, an extensive list of statistics – see the above example output of the Check_MK agent plugin – is determined for each inventorized item. If the rate of one of the statistics values is above the configured warning and critical threshold values, an alarm is raised accordingly. For all statistics, performance data is reported by the check.

With the additional WATO plugin open-iscsi_host_stats.py it is possible to configure the warning and critical levels through the WATO WebUI and thus override the default values. The default values for all statistics are a rate of zero (0) units per second for both warning and critical thresholds. The configuration options for the iSOE host statistics levels can be found in the WATO WebUI under:

-> Host & Service Parameters
   -> Parameters for discovered services
      -> Storage, Filesystems and Files
         -> Open-iSCSI Host Statistics
            -> Create Rule in Folder ...
               -> The levels for the Open-iSCSI host statistics values
                  [x] The levels for the number of transmitted MAC/Layer2 frames on an iSOE host.
                  [x] The levels for the number of received MAC/Layer2 frames on an iSOE host.
                  [x] The levels for the number of transmitted MAC/Layer2 bytes on an iSOE host.
                  [x] The levels for the number of received MAC/Layer2 bytes on an iSOE host.
                  [x] The levels for the number of transmitted MAC/Layer2 multicast frames on an iSOE host.
                  [x] The levels for the number of received MAC/Layer2 multicast frames on an iSOE host.
                  [x] The levels for the number of transmitted MAC/Layer2 broadcast frames on an iSOE host.
                  [x] The levels for the number of received MAC/Layer2 broadcast frames on an iSOE host.
                  [x] The levels for the number of transmitted MAC/Layer2 pause frames on an iSOE host.
                  [x] The levels for the number of received MAC/Layer2 pause frames on an iSOE host.
                  [x] The levels for the number of transmitted MAC/Layer2 control frames on an iSOE host.
                  [x] The levels for the number of received MAC/Layer2 control frames on an iSOE host.
                  [x] The levels for the number of transmitted MAC/Layer2 dropped frames on an iSOE host.
                  [x] The levels for the number of received MAC/Layer2 dropped frames on an iSOE host.
                  [x] The levels for the number of transmitted MAC/Layer2 deferral frames on an iSOE host.
                  [x] The levels for the number of transmitted MAC/Layer2 deferral frames on an iSOE host.
                  [x] The levels for the number of transmitted MAC/Layer2 abort frames on an iSOE host.
                  [x] The levels for the number of transmitted MAC/Layer2 jumbo frames on an iSOE host.
                  [x] The levels for the number of MAC/Layer2 late transmit collisions on an iSOE host.
                  [x] The levels for the number of MAC/Layer2 single transmit collisions on an iSOE host.
                  [x] The levels for the number of MAC/Layer2 multiple transmit collisions on an iSOE host.
                  [x] The levels for the number of MAC/Layer2 collisions on an iSOE host.
                  [x] The levels for the number of received MAC/Layer2 control frames on an iSOE host.
                  [x] The levels for the number of received MAC/Layer2 dribble on an iSOE host.
                  [x] The levels for the number of received MAC/Layer2 frame length errors on an iSOE host.
                  [x] The levels for the number of discarded received MAC/Layer2 frames on an iSOE host.
                  [x] The levels for the number of received MAC/Layer2 jabber on an iSOE host.
                  [x] The levels for the number of received MAC/Layer2 carrier sense errors on an iSOE host.
                  [x] The levels for the number of received MAC/Layer2 CRC errors on an iSOE host.
                  [x] The levels for the number of received MAC/Layer2 encoding errors on an iSOE host.
                  [x] The levels for the number of received MAC/Layer2 length too large errors on an iSOE host.
                  [x] The levels for the number of received MAC/Layer2 length too small errors on an iSOE host.
                  [x] The levels for the number of transmitted IP packets on an iSOE host.
                  [x] The levels for the number of received IP packets on an iSOE host.
                  [x] The levels for the number of transmitted IP bytes on an iSOE host.
                  [x] The levels for the number of received IP bytes on an iSOE host.
                  [x] The levels for the number of transmitted IP fragments on an iSOE host.
                  [x] The levels for the number of received IP fragments on an iSOE host.
                  [x] The levels for the number of IP datagram reassemblies on an iSOE host.
                  [x] The levels for the number of IP invalid address errors on an iSOE host.
                  [x] The levels for the number of IP packet errors on an iSOE host.
                  [x] The levels for the number of IP fragmentation overlaps on an iSOE host.
                  [x] The levels for the number of IP fragmentation out-of-order on an iSOE host.
                  [x] The levels for the number of IP datagram reassembly timeouts on an iSOE host.
                  [x] The levels for the number of transmitted IPv6 packets on an iSOE host.
                  [x] The levels for the number of received IPv6 packets on an iSOE host.
                  [x] The levels for the number of transmitted IPv6 bytes on an iSOE host.
                  [x] The levels for the number of received IPv6 bytes on an iSOE host.
                  [x] The levels for the number of transmitted IPv6 fragments on an iSOE host.
                  [x] The levels for the number of received IPv6 fragments on an iSOE host.
                  [x] The levels for the number of IPv6 datagram reassemblies on an iSOE host.
                  [x] The levels for the number of IPv6 invalid address errors on an iSOE host.
                  [x] The levels for the number of IPv6 packet errors on an iSOE host.
                  [x] The levels for the number of IPv6 fragmentation overlaps on an iSOE host.
                  [x] The levels for the number of IPv6 fragmentation out-of-order on an iSOE host.
                  [x] The levels for the number of IPv6 datagram reassembly timeouts on an iSOE host.
                  [x] The levels for the number of transmitted TCP segments on an iSOE host.
                  [x] The levels for the number of received TCP segments on an iSOE host.
                  [x] The levels for the number of transmitted TCP bytes on an iSOE host.
                  [x] The levels for the number of received TCP bytes on an iSOE host.
                  [x] The levels for the number of duplicate TCP ACK retransmits on an iSOE host.
                  [x] The levels for the number of received TCP retransmit timer expiries on an iSOE host.
                  [x] The levels for the number of received TCP duplicate ACKs on an iSOE host.
                  [x] The levels for the number of received TCP pure ACKs on an iSOE host.
                  [x] The levels for the number of transmitted TCP delayed ACKs on an iSOE host.
                  [x] The levels for the number of transmitted TCP pure ACKs on an iSOE host.
                  [x] The levels for the number of received TCP segment errors on an iSOE host.
                  [x] The levels for the number of received TCP segment out-of-order on an iSOE host.
                  [x] The levels for the number of received TCP window probe on an iSOE host.
                  [x] The levels for the number of received TCP window update on an iSOE host.
                  [x] The levels for the number of transmitted TCP window probe persist on an iSOE host.
                  [x] The levels for the number of transmitted iSCSI PDUs on an iSOE host.
                  [x] The levels for the number of received iSCSI PDUs on an iSOE host.
                  [x] The levels for the number of transmitted iSCSI Bytes on an iSOE host.
                  [x] The levels for the number of received iSCSI Bytes on an iSOE host.
                  [x] The levels for the number of iSCSI I/Os completed on an iSOE host.
                  [x] The levels for the number of iSCSI unexpected I/Os on an iSOE host.
                  [x] The levels for the number of iSCSI format errors on an iSOE host.
                  [x] The levels for the number of iSCSI header digest (CRC) errors on an iSOE host.
                  [x] The levels for the number of iSCSI data digest (CRC) errors on an iSOE host.
                  [x] The levels for the number of iSCSI sequence errors on an iSOE host.
                  [x] The levels for the number of ECC error corrections on an iSOE host.

The following image shows a status output example from the WATO WebUI with several open-iscsi_sessions (iSCSI Session Status) and open-iscsi_host_stats (iSCSI Host Stats) service checks over two QLogic 8200 Series hardware iSCSI initiators:

Status output example for open-iscsi_sessions and open-iscsi_host_stats service checks over QLogic 8200 Series hardware iSCSI initiators

This example shows six iSCSI Session Status service check items, which are pairs of iSCSI network interface names and – here anonymized – IQNs of the iSCSI target volumes. For each item the current session_state – in this example LOGGED_IN – is shown. There are also two iSCSI Host Stats service check items in the example, which are pairs of MAC addresses and iSCSI network interface names. For each of those items the current throughput rate on the MAC, IP/IPv6, TCP and iSCSI protocol layer is shown. The throughput rate on the MAC protocol layer is also visualized in the Perf-O-Meter, received traffic growing from the middle to the left, transmitted traffic growing from the middle to the right.

The following three images show examples of the PNP4Nagios graphs for the open-iscsi_host_stats (iSCSI Host Stats) service check.

Example PNP4Nagios graphs for a open-iscsi_host_stats service check (MAC Frames, Traffic, MAC Errors)

The middle graph shows a combined view of the throughput rate for received and transmitted traffic on the different MAC, IP/IPv6, TCP and iSCSI protocol layers. The upper graph shows the throughput rate for various frame types on the MAC protocol layer. The lower graph shows the rate for various error frame types on the MAC protocol layer.

Example PNP4Nagios graphs for a open-iscsi_host_stats service check (IP Packets and Fragments, IP Errors, TCP Segments)

The upper graph shows the throughput rate for received and transmitted traffic on the IP/IPv6 protocol layer. The middle graph shows the rate for various error packet types on the IP/IPv6 protocol layer. The lower graph shows the throughput rate for received and transmitted traffic on the TCP protocol layer.

Example PNP4Nagios graphs for a open-iscsi_host_stats service check (TCP Errors, ECC Error Correction, iSCSI PDUs, iSCSI Errors)

The first graph shows the rate for various protocol control and error segment types on the TCP protocol layer. The second graph shows the rate of ECC error corrections that occured on the QLogic 8200 Series hardware iSCSI initiator. The third graph shows the throughput rate for received and transmitted traffic on the iSCSI protocol layer. The fourth and last graph shows the rate for various control and error PDUs on the iSCSI protocol layer.

Open-iSCSI Session Statistics

The check plugin open-iscsi_session_stats is responsible for the monitoring of the statistics on individual iSCSI sessions. Upon inventory this check plugin creates a service check for each pair of MAC address of the network interface and IQN of the iSCSI target volume. During normal check execution, an extensive list of statistics – see the above example output of the Check_MK agent plugin – is collected for each inventorized item. If the rate of one of the statistics values is above the configured warning and critical threshold values, an alarm is raised accordingly. For all statistics, performance data is reported by the check.

With the additional WATO plugin open-iscsi_session_stats.py it is possible to configure the warning and critical levels through the WATO WebUI and thus override the default values. The default values for all statistics are a rate of zero (0) units per second for both warning and critical thresholds. The configuration options for the iSCSI session statistics levels can be found in the WATO WebUI under:

-> Host & Service Parameters
   -> Parameters for discovered services
      -> Storage, Filesystems and Files
         -> Open-iSCSI Session Statistics
            -> Create Rule in Folder ...
               -> The levels for the Open-iSCSI session statistics values
                  [x] The levels for the number of transmitted bytes in an Open-iSCSI session
                  [x] The levels for the number of received bytes in an Open-iSCSI session
                  [x] The levels for the number of digest (CRC) errors in an Open-iSCSI session
                  [x] The levels for the number of timeout errors in an Open-iSCSI session
                  [x] The levels for the number of transmitted NOP commands in an Open-iSCSI session
                  [x] The levels for the number of received NOP commands in an Open-iSCSI session
                  [x] The levels for the number of transmitted SCSI command requests in an Open-iSCSI session
                  [x] The levels for the number of received SCSI command reponses in an Open-iSCSI session
                  [x] The levels for the number of transmitted task management function commands in an Open-iSCSI session
                  [x] The levels for the number of received task management function responses in an Open-iSCSI session
                  [x] The levels for the number of transmitted login requests in an Open-iSCSI session
                  [x] The levels for the number of transmitted logout requests in an Open-iSCSI session
                  [x] The levels for the number of received logout responses in an Open-iSCSI session
                  [x] The levels for the number of transmitted text PDUs in an Open-iSCSI session
                  [x] The levels for the number of received text PDUs in an Open-iSCSI session
                  [x] The levels for the number of transmitted data PDUs in an Open-iSCSI session
                  [x] The levels for the number of received data PDUs in an Open-iSCSI session
                  [x] The levels for the number of transmitted single negative ACKs in an Open-iSCSI session
                  [x] The levels for the number of received ready to transfer PDUs in an Open-iSCSI session
                  [x] The levels for the number of received reject PDUs in an Open-iSCSI session
                  [x] The levels for the number of received asynchronous messages in an Open-iSCSI session

The following image shows a status output example from the WATO WebUI with several open-iscsi_sessions (iSCSI Session Status) and open-iscsi_session_stats (iSCSI Session Stats) service checks over two BCM578xx dependent hardware iSCSI initiators:

Status output example for open-iscsi_sessions and open-iscsi_session_stats service checks over BCM578xx dependent hardware iSCSI initiators

This example shows six iSCSI Session Status service check items, which are pairs of iSCSI network interface names and – here anonymized – IQNs of the iSCSI target volumes. For each item the current session_state, connection_state and internal_state – in this example with the respective values LOGGED_IN, LOGGED_IN and NO_CHANGE – are shown. There are also an equivalent number of iSCSI Session Stats service check items in the example, which are also pairs of MAC addresses of the network interfaces and IQNs of the iSCSI target volumes. For each of those items the current throughput rate of the individual iSCSI session is shown. As long as the rate of the digest (CRC) and timeout error counters is zero, the string no protocol errors is displayed. Otherwise the name and throughput rate of any non-zero error counter is shown. The throughput rate of the iSCSI session is also visualized in the Perf-O-Meter, received traffic growing from the middle to the left, transmitted traffic growing from the middle to the right.

The following image shows an example of the three PNP4Nagios graphs for a single open-iscsi_session_stats (iSCSI Session Stats) service check.

Example PNP4Nagios graphs for a single open-iscsi_session_stats service check

The upper graph shows the throughput rate for received and transmitted traffic of the iSCSI session. The middle graph shows the rate for received and transmitted iSCSI PDUs, broken down by the different types of PDUs on the iSCSI protocol layer. The lower graph shows the rate for the digest (CRC) and timeout errors on the iSCSI protocol layer.

The described Check_MK service check to monitor the status of Open-iSCSI sessions, Open-iSCSI session metrics and iSCSI hardware initiator host metrics has been verified to work with version 2.0.874-2~bpo8+1 of the open-iscsi package from the backports repository of Debian stable (Jessie) on the client side and the Check_MK versions 1.2.6 and 1.2.8 on the server side.

I hope you find the provided new check useful and enjoyed reading this blog post. Please don't hesitate to drop me a note if you have any suggestions or run into any issues with the provided checks.

This website uses cookies. By using the website, you agree with storing cookies on your computer. Also you acknowledge that you have read and understand our Privacy Policy. If you do not agree leave the website. More information about cookies