2015-05-11 // IBM SVC and experiences with Real-time Compression - VMware and MS SQL Server
Following up on the previous article (IBM SVC and experiences with Real-time Compression) about our experiences with Real-time compression (RTC) in an IBM SVC environment, we did some additional testing with regard to RTC being used in a VMware environment hosting solely database instances of Microsoft SQL Server.
With the separation of all of our Microsoft SQL Server instances into a dedicated VMware ESX cluster – which had to be done due to licensing constraints – an opportunity presented itself to take a more detailed look into how well the SVC's RTC feature would perform on this particular type of workload.
Usually, the contents of a VMware datastore are by their very nature opaque to the lower levels of the storage hierarchy, like the SVC which is providing the block storage for the datastores. While this can be an advantage, the lack of transparency also prevents the SVC from providing its usual statistics (e.g. the actual amount of storage used with thin-provisioning or RTC enabled, the amount of storage in the individual storage tier classes, etc.) and performance metrics on a fine-grained, per-VM basis. By not only moving the compute part of the VMs over to the new VMware ESX hosts in a separate VMware ESX cluster, but also individually moving the storage part of one VM after the other onto new and empty datastores, which in turn were backed at the SVC by newly created, empty VDisks with RTC enabled, we were able to measure the impact of RTC on a per-VM level. It was a bit of tedious work, and the precision of the numbers has to be taken with a grain of salt, because I/O was concurrently happening on the database and operating system level, probably causing additional extents to be allocated and thus a small degree of inaccuracy.
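For reference, the allocation numbers used throughout this article can be read from the SVC CLI. A minimal sketch, assuming SSH access to the SVC cluster – the cluster and VDisk names are placeholders and the exact output fields may differ between SVC code levels:

$ # overall capacity figures of the VDisk backing a datastore, in bytes
$ ssh admin@svc-cluster "svcinfo lsvdisk -bytes sas-4096G-35C"
$ # per-copy allocation (used vs. real capacity) of thin-provisioned
$ # or compressed VDisk copies
$ ssh admin@svc-cluster "svcinfo lssevdiskcopy -bytes" | grep sas-4096G-35C

Taking a snapshot of the “used_capacity” value before and after each Storage vMotion yields the per-VM “SVC Delta Used” values shown in the tables below.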
The environment consists of the following four datastores and four VDisks, with a total of 43 VMs running various versions of Microsoft SQL Server:
VM Datastore | SVC VDisk | Datastore Size (GB) | Number of VMs |
---|---|---|---|
san-ssb-sql-oben-35C | sas-4096G-35C | 4096 | 16 |
san-ssb-sql-unten-360 | sas-4096G-360 | 4096 | 9 |
san-ssb-sql-unten-361 | sas-2048G-361 | 2048 | 8 |
san-ssb-sql-oben-362 | sas-2048G-362 | 2048 | 10 |
Sum | | 12288 | 43 |
Usually the VMs are configured with one or two disks or drives for the OS and one or more disks or drives dedicated to the database contents. In some legacy cases, specifically VM #7, #10 and #12 on datastore “san-ssb-sql-oben-35C”, VM #6 on datastore “san-ssb-sql-unten-360” and VM #8 on datastore “san-ssb-sql-unten-361”, there is also a local application instance – in our case different versions of SAP – installed on the system. The operating systems used are Windows Server 2003, 2008R2 and 2012R2.
For each of the four datastores or VDisks, the results of the tests are shown in the following four sections. Each section consists of a table representation of the datastore as it was being populated with VMs in chronological order. The columns of the tables are pretty self-explanatory, with the most interesting columns being “Rel. Compressed Size”, “Rel. Compression Saving” and “Compression Ratio”. If JavaScript is enabled in your browser, click on a table's column headers to sort by that particular column. Each section also contains a graph of the absolute allocation values (table columns “VM Provisioned”, “VM Used” and “SVC Used”) per VM and a graph of the relative allocation values (values from the table column “SVC Used” divided by values from the table column “VM Used” in %) per VM. The latter ones also feature the “Compression Ratio” metric, which is graphed against a second y-axis on the right-hand side of the graphs.
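The derived columns can be recalculated from the raw values. A small example using awk with the numbers of VM “Number 1” on datastore “san-ssb-sql-oben-35C” (VM Used: 44.00 GB, SVC Delta Used: 20.10 GB):

$ echo "44.00 20.10" | awk '{ printf "Rel. Compressed Size: %.2f%%\nRel. Compression Saving: %.2f%%\nCompression Ratio: %.2f\n", $2/$1*100, ($1-$2)/$1*100, $1/$2 }'
Rel. Compressed Size: 45.68%
Rel. Compression Saving: 54.32%
Compression Ratio: 2.19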
Datastore: san-ssb-sql-oben-35C
VDisk: sas-4096G-35C

VM | VM Provisioned (GB) | VM Used (GB) | SVC Used (GB) | SVC Delta Used (GB) | Rel. Compressed Size (%) | Rel. Compression Saving (%) | Compression Ratio | SQL Server Version | Comment |
---|---|---|---|---|---|---|---|---|---|
Empty Datastore | – | – | 6.81 MB | 6.81 MB | – | – | – | – | |
Number 1 | 134.11 | 44.00 | 20.78 | 20.10 | 45.68 | 54.32 | 2.19 | 10.50.4000.0 | |
Number 2 | 144.11 | 55.04 | 48.45 | 28.35 | 51.51 | 48.49 | 1.94 | 11.0.5058.0 | |
Number 3 | 94.18 | 49.79 | 75.88 | 27.43 | 55.09 | 44.91 | 1.82 | 11.0.5058.0 | |
– | – | – | 76.30 | 0.42 | – | – | – | – | |
Number 4 | 454.11 | 52.22 | 92.47 | 16.17 | 30.97 | 69.03 | 3.23 | 10.50.4000.0 | |
Number 5 | 456.11 | 391.14 | 337.74 | 245.27 | 62.71 | 37.29 | 1.59 | 10.50.4000.0 | SharePoint with file uploads |
Number 6 | 238.14 | 156.29 | 450.89 | 113.15 | 72.40 | 27.60 | 1.38 | 10.50.6000.34 | HR Applicant Management |
Number 7 | 488.11 | 304.32 | 660.05 | 209.16 | 68.73 | 31.27 | 1.45 | 10.50.4000.0 | SQL Page Compression |
Number 8 | 348.09 | 259.32 | 703.47 | 43.42 | 16.74 | 83.26 | 5.97 | 9.00.4035.00 | |
Number 9 | 185.24 | 149.33 | 744.68 | 41.21 | 27.60 | 72.40 | 3.62 | 10.50.4000.0 | |
Number 10 | 292.12 | 104.31 | 804.14 | 59.46 | 57.00 | 43.00 | 1.75 | 11.0.3349.0 | SQL Page Compression |
Number 11 | 214.11 | 148.73 | 882.44 | 78.30 | 52.65 | 47.35 | 1.90 | 10.50.2500.0 | |
Number 12 | 496.09 | 463.15 | 993.53 | 111.09 | 23.99 | 76.01 | 4.17 | 9.00.3228.00 | |
Number 13 | 202.12 | 176.96 | 1058.76 | 65.23 | 36.86 | 63.14 | 2.71 | 9.00.4035.00 | |
Number 14 | 442.41 | 363.83 | 1200.11 | 141.35 | 38.85 | 61.15 | 2.57 | 11.0.3412.0 | SQL Page Compression |
Number 15 | 395.38 | 374.78 | 1300.00 | 97.96 | 26.14 | 73.86 | 3.83 | 11.0.3412.0 | SQL Page Compression |
Number 16 | 739.31 | 464.30 | 1520.00 | 220.00 | 47.38 | 52.62 | 2.11 | 11.0.3412.0 | SQL Page Compression |
Sum or Average | 5189.63 | 3513.51 | 1497.55 | | 42.10 | 57.90 | 2.38 | | |

Datastore: san-ssb-sql-unten-360
VDisk: sas-4096G-360

VM | VM Provisioned (GB) | VM Used (GB) | SVC Used (GB) | SVC Delta Used (GB) | Rel. Compressed Size (%) | Rel. Compression Saving (%) | Compression Ratio | SQL Server Version | Comment |
---|---|---|---|---|---|---|---|---|---|
Empty Datastore | – | – | 5.47 MB | 5.47 MB | – | – | – | – | |
Number 1 | 377.68 | 241.77 | 98.13 | 97.58 | 40.36 | 59.64 | 2.48 | 11.0.3412.0 | SQL Page Compression |
Number 2 | 439.29 | 256.54 | 198.76 | 100.63 | 39.23 | 60.77 | 2.55 | 11.0.3412.0 | SQL Page Compression |
Number 3 | 776.76 | 507.24 | 448.04 | 249.28 | 49.14 | 50.86 | 2.03 | 11.0.3412.0 | SQL Page Compression |
Number 4 | 195.21 | 154.10 | 533.52 | 85.48 | 55.47 | 44.53 | 1.80 | 11.0.5058.0 | |
Number 5 | 34.11 | 23.91 | 546.66 | 13.14 | 54.96 | 45.04 | 1.82 | 9.00.4035.00 | |
Number 6 | 497.11 | 394.46 | 718.35 | 171.69 | 43.53 | 56.47 | 2.30 | 10.50.1702.0 | SQL Page Compression |
Number 7 | 219.91 | 53.87 | 740.20 | 21.85 | 40.56 | 59.44 | 2.47 | 11.0.3000.0 | |
Number 8 | 276.11 | 238.67 | 850.77 | 110.57 | 46.33 | 53.67 | 2.16 | 10.50.2500.0 | |
Number 9 | 228.11 | 188.26 | 986.25 | 135.48 | 71.96 | 28.04 | 1.39 | 10.50.2500.0 | HR Applicant Management |
Sum or Average | 3044.29 | 2058.82 | 986.25 | | 47.90 | 52.10 | 2.09 | | |

Datastore: san-ssb-sql-unten-361
VDisk: sas-2048G-361

VM | VM Provisioned (GB) | VM Used (GB) | SVC Used (GB) | SVC Delta Used (GB) | Rel. Compressed Size (%) | Rel. Compression Saving (%) | Compression Ratio | SQL Server Version | Comment |
---|---|---|---|---|---|---|---|---|---|
Empty Datastore | – | – | 6.66 MB | 6.66 MB | – | – | – | – | |
Number 1 | 188.11 | 162.98 | 51.46 | 51.45 | 31.57 | 68.43 | 3.17 | 10.50.4000.0 | |
Number 2 | 180.31 | 130.82 | 90.79 | 39.33 | 30.06 | 69.94 | 3.33 | 10.50.4000.0 | |
Number 3 | 178.11 | 142.63 | 131.17 | 40.38 | 28.31 | 71.69 | 3.53 | 10.50.4000.0 | |
Number 4 | 156.09 | 127.26 | 177.30 | 46.13 | 36.25 | 63.75 | 2.76 | 9.00.4035.00 | |
Number 5 | 100.86 | 90.40 | 202.76 | 25.46 | 28.16 | 71.84 | 3.55 | 9.00.4035.00 | |
– | – | – | 205.91 | 3.15 | – | – | – | – | |
Number 6 | 139.09 | 79.17 | 233.85 | 27.94 | 35.29 | 64.71 | 2.83 | 11.0.5058.0 | |
Number 7 | 34.11 | 24.16 | 246.37 | 12.52 | 51.82 | 48.18 | 1.93 | 9.00.4035.00 | |
Number 8 | 696.32 | 289.09 | 402.07 | 155.07 | 53.86 | 46.14 | 1.86 | 11.0.5058.0 | SQL Page Compression |
Sum or Average | 1673.00 | 1046.51 | 402.07 | | 38.42 | 61.58 | 2.60 | | |

Datastore: san-ssb-sql-oben-362
VDisk: sas-2048G-362

VM | VM Provisioned (GB) | VM Used (GB) | SVC Used (GB) | SVC Delta Used (GB) | Rel. Compressed Size (%) | Rel. Compression Saving (%) | Compression Ratio | SQL Server Version | Comment |
---|---|---|---|---|---|---|---|---|---|
Empty Datastore | – | – | – | – | – | – | – | – | |
Number 1 | 218.09 | 37.19 | 19.93 | 19.93 | 53.59 | 46.41 | 1.87 | 9.00.4035.00 | |
Number 2 | 122.12 | 112.53 | 62.32 | 42.39 | 37.67 | 62.33 | 2.65 | 10.50.1600.1 | |
Number 3 | 273.62 | 77.70 | 97.04 | 34.72 | 44.68 | 55.32 | 2.24 | 10.50.4000.0 | |
Number 4 | 143.09 | 37.98 | 116.47 | 19.43 | 51.16 | 48.84 | 1.95 | 11.0.5058.0 | |
Number 5 | 137.43 | 115.29 | 157.45 | 40.98 | 35.55 | 64.45 | 2.81 | 10.50.4000.0 | |
Number 6 | 80.09 | 76.04 | 178.76 | 21.31 | 28.02 | 71.98 | 3.57 | 9.00.4035.00 | |
– | – | – | 182.30 | 3.54 | – | – | – | – | |
Number 7 | 136.09 | 117.59 | 229.95 | 47.65 | 40.52 | 59.48 | 2.47 | 9.00.4035.00 | |
Number 8 | 194.11 | 53.55 | 253.40 | 23.45 | 43.79 | 56.21 | 2.28 | 10.50.2500.0 | |
Number 9 | 78.11 | 52.60 | 273.07 | 19.67 | 37.40 | 62.60 | 2.67 | – | |
Number 10 | 53.11 | 31.42 | 287.33 | 14.26 | 45.39 | 54.61 | 2.20 | 10.50.4000.0 | |
Sum or Average | 1435.86 | 711.89 | 283.82 | | 40.36 | 59.64 | 2.48 | | |
From the detailed tabular representation of each datastore on a per VM basis shown above, an aggregated and summarized view, shown in the table below, was created. The columns of the table are again pretty self-explanatory. Some of the less interesting columns from the detailed tables above have been dropped in the aggregated result table. The most interesting columns are again “Rel. Compressed Size”, “Rel. Compression Saving” and “Compression Ratio”. Again, if JavaScript is enabled in your browser, click on a table's column headers to sort by that particular column.
VM Datastore | SVC VDisk | VM Provisioned (GB) | VM Used (GB) | SVC Used (GB) | Rel. Compressed Size (%) | Rel. Compression Saving (%) | Compression Ratio |
---|---|---|---|---|---|---|---|
san-ssb-sql-oben-35C | sas-4096G-35C | 5189.63 | 3513.51 | 1497.55 | 42.10 | 57.90 | 2.38 |
san-ssb-sql-unten-360 | sas-4096G-360 | 3044.29 | 2058.82 | 986.25 | 47.90 | 52.10 | 2.09 |
san-ssb-sql-unten-361 | sas-2048G-361 | 976.68 | 757.42 | 243.21 | 32.53 | 67.47 | 3.07 |
san-ssb-sql-oben-362 | sas-2048G-362 | 1435.86 | 711.89 | 283.82 | 40.36 | 59.64 | 2.48 |
Sum or Average | | 10646.46 | 7041.64 | 3010.83 | 42.76 | 57.24 | 2.34 |
On the whole, the compression results achieved with RTC at the SVC level are quite good. Considering the fact that VMware already achieved a sizeable amount of storage reduction through its own thin-provisioning algorithms, the average RTC compression ratio of 2.34 – with a minimum of 1.38 and a maximum of 5.97 – at the SVC is even more impressive. This means that on average only 42.76% of the storage space allocated by VMware is actually used on the SVC level to store the data. Taking VMware thin-provisioning into account, the relative amount of actual storage space needed sinks even further down to 28.28%, or in other words a reduction by a factor of 3.54.
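These aggregate figures can quickly be double checked with bc (which truncates at the given scale, so the last digit is cut off rather than rounded):

$ echo "scale=4; 3010.83 / 7041.64" | bc     # SVC Used / VM Used
.4275
$ echo "scale=4; 3010.83 / 10646.46" | bc    # SVC Used / VM Provisioned
.2828
$ echo "scale=4; 10646.46 / 3010.83" | bc    # overall reduction factor
3.5360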
Looking at the individual graphs offers several other interesting insights, which can be indicators for further investigations on the database layer. In the graphs showing the relative allocation values and titled “SVC Real-Time Compression Relative Storage Allocation and Compression Ratio on VMware with MS SQL Server”, those data samples with a high “Rel. Compressed Size” value (purple bars) or a low “Compression Ratio” value (red line) are of particular interest in this case. Selecting those systems from the above results with an – arbitrarily chosen – value of well below 2.00 for the “Compression Ratio” metric gives us the following list of systems to take a closer look at:
Sample | Datastore / VDisk | VM # | VM Provisioned (GB) | VM Used (GB) | SVC Delta Used (GB) | Compression Ratio | Comment |
---|---|---|---|---|---|---|---|
1 | san-ssb-sql-oben-35C sas-4096G-35C | 3 | 94.18 | 49.79 | 27.43 | 1.82 | SQLIO benchmark file |
2 | | 5 | 456.11 | 391.14 | 245.27 | 1.59 | SharePoint with file uploads |
3 | | 6 | 238.14 | 156.29 | 113.15 | 1.38 | HR Applicant Management |
4 | | 7 | 488.11 | 304.32 | 209.16 | 1.45 | SAP and MSSQL (with SQL Page Compression) on the same VM |
5 | | 10 | 292.12 | 104.31 | 59.46 | 1.75 | SAP and MSSQL (with SQL Page Compression) on the same VM |
6 | san-ssb-sql-unten-360 sas-4096G-360 | 4 | 195.21 | 154.10 | 85.48 | 1.80 | SQLIO benchmark file |
7 | | 5 | 34.11 | 23.91 | 13.14 | 1.82 | |
8 | | 9 | 228.11 | 188.26 | 135.48 | 1.39 | HR Applicant Management |
9 | san-ssb-sql-unten-361 sas-2048G-361 | 7 | 34.11 | 24.16 | 12.52 | 1.93 | |
10 | | 8 | 696.32 | 289.09 | 155.07 | 1.86 | SAP and MSSQL (with SQL Page Compression) on the same VM |
11 | san-ssb-sql-oben-362 sas-2048G-362 | 1 | 218.09 | 37.19 | 19.93 | 1.87 | |
12 | | 4 | 143.09 | 37.98 | 19.43 | 1.95 | |
The reasons why those particular systems exhibit a subaverage compression ratio could be manifold. Together with our DBAs we went over the list and came up with the following neither exhaustive nor exclusive list of explanations, which are already hinted at in the “Comment” column of the above table:
Samples 1 and 6: Those systems were recently used to test the influence of 4k vs. 64k NTFS cluster size on database I/O performance. Over the course of these tests, a number of dummy files were written with the SQLIO benchmark tool. Although the dummy files were deleted after the tests were concluded, the storage blocks remained allocated with apparently less compressible contents.
Samples 2, 3 and 8: Those systems host the databases for SharePoint (sample #2) and a HR/HCM applicant management and employee training management system (samples #3 and #8). Both applications seem to be designed or configured to store files uploaded to the application as binary blobs in the database. Both use cases suggest that the uploaded file data is to some extent already in some kind of compressed format (e.g. xlsx, pptx, jpeg, png, mpeg, etc.), thus limiting the efficiency of any further compression attempt like RTC. Storing large amounts of unstructured data together with structured data in a database is the subject of frequent, ongoing and probably never ending controversial discussions. There are several points of view on this question and good arguments are being made on either side. Personally, and from a purely operational point of view, i'd favour not storing the unstructured binary data in a structured database.
Sample 4: This system hosts the legacy design described above, where a local application instance of SAP is installed alongside Microsoft SQL Server on the same system. The database tables of this SAP release already use the SQL Server page compression feature. Although this could very well have been the cause for the subaverage compression ratio, a look into the OS offered a different plausible explanation. Inside the OS the sum of space allocated to the various filesystems is only approximately 202 GB, while at the same time the amount of space allocated by VMware is a little more than 304 GB. It seems that at one point in time there was a lot more data of unknown compressibility present on the system. The data has since been deleted, but the previously allocated storage blocks have not been properly reclaimed.
In order to put this theory to the test, the procedure described in VMware KB 2004155 to reclaim unused, but not de-allocated storage blocks was applied. The following table shows the storage allocation for the system in sample #4 before and after the reclaim procedure:
Sample | Datastore / VDisk | VM # | VM Provisioned (GB) | VM Used (GB) | SVC Delta Used (GB) | Compression Ratio | Comment |
---|---|---|---|---|---|---|---|
4 | san-rtc-ssb-sql-unten-366 sas-1024G-366 | 7 | 488.11 | 304.32 | 209.16 | 1.45 | SAP and MSSQL (with SQL Page Compression) on the same VM - before reclaiming unused storage space |
4 | | 7 | 488.11 | 172.53 | 81.29 | 2.12 | SAP and MSSQL (with SQL Page Compression) on the same VM - after reclaiming unused storage space |

The numbers show that the above assumption was correct. After zeroing unused disk space within the VM and reclaiming unused storage blocks by re-thin-provisioning the virtual disks in VMware, the RTC compression ratio rose from a meager 1.45 to a near average 2.12. Now, only 47.11% – instead of the previous 68.73% – of the storage space allocated by VMware is actually used on the SVC level. Again, taking VMware thin-provisioning into account, the relative amount of actual storage space needed sinks even further down to 16.65%, or in other words a reduction by a factor of 6.00.
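For the record, the procedure from VMware KB 2004155 essentially boils down to two steps. A rough sketch – the drive letter, paths and names are placeholders; in our case the re-thin-provisioning was done via a Storage vMotion to another datastore with the “Thin Provision” disk format, while the vmkfstools variant shown below requires the VM to be powered off:

C:\> rem inside the Windows guest: zero out the free filesystem space
C:\> sdelete.exe -z D:

# afterwards, on an ESXi host, punch out the zeroed blocks of the
# (powered off) VM's virtual disk:
$ vmkfstools --punchzero /vmfs/volumes/<datastore>/<vm>/<vm>.vmdk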
Samples 5 and 10: These systems basically share the same circumstances as the previously examined system in sample #4. The only difference is that there is still some old, probably unused data sitting in the filesystems. In the case of sample #5 it's about 20 GB of compressed SAP installation media and in the case of sample #10 it's about 84.3 GB of partially compressed SAP installation media and SAP SUM update files. Of course we'd like to reclaim this storage space as well and – hopefully – in the course of this also some deleted, but up to now not de-allocated storage blocks. By this we hope to see equally good end results as in the case of sample #4. We're currently waiting on clearance from our SAP team to dispose of some of the old and probably unused data.
Samples 7, 9, 11 and 12: Systems in this sample category are probably best described as just being “small”. To better illustrate what is meant by this, the following table shows a compiled view of these four samples. The data was taken from the detailed tabular representation of each datastore on a per VM basis that is shown above:
Sample | Datastore / VDisk | VM # | VM Provisioned (GB) | VM Used (GB) | SVC Delta Used (GB) | Rel. Compressed Size (%) | Rel. Compression Saving (%) | Compression Ratio | Windows Version |
---|---|---|---|---|---|---|---|---|---|
7 | san-ssb-sql-unten-360 sas-4096G-360 | 5 | 34.11 | 23.91 | 13.14 | 54.96 | 45.04 | 1.82 | 2003 |
9 | san-ssb-sql-unten-361 sas-2048G-361 | 7 | 34.11 | 24.16 | 12.52 | 51.82 | 48.18 | 1.93 | 2003 |
11 | san-ssb-sql-oben-362 sas-2048G-362 | 1 | 218.09 | 37.19 | 19.93 | 53.59 | 46.41 | 1.87 | 2012 R2 |
12 | | 4 | 143.09 | 37.98 | 19.43 | 51.16 | 48.84 | 1.95 | 2012 R2 |

Inside the systems there is very little user or installation data besides the base Windows operating system, the SQL Server binaries and some standard management tools which are used in our environment. VMware thin-provisioning already does a significant amount of space reduction. It reduces the storage space that would be used on the SVC level to approximately 24 GB for Windows 2003 and approximately 37.5 GB for Windows 2012 R2 systems. Although the number of samples is rather low, the almost consistent values within each Windows release category suggest that these values represent – in our environment – the minimal amount of storage space needed for each Windows release. The compressed size of approximately 12.8 GB for Windows 2003 and approximately 19.7 GB for Windows 2012 R2 systems, as well as the average compression ratio of approximately 1.89, seem to support this theory, in that they show very similar values within each Windows release category. It appears that – in our environment – these values are the bottom line with regard to minimal allocated storage capacity. There simply does not seem to be enough other data with good compressibility in those systems to achieve better results with regard to the overall compression ratio.
By and large, we're quite pleased with the results we're seeing from the SVC's real-time compression feature in our VMware and MS SQL Server environment. The amount of storage space saved is – especially with VMware thin-provisioning taken into account – significant. Even in those cases where SQL Server's page compression feature is heavily used, we're still seeing quite good compression results with the use of additional real-time compression at the SVC level. In part, this is very likely due to the fact that real-time compression at the SVC level also covers the VMs' data that is outside the scope of SQL Server's page compression. On the other hand, this does not entirely suffice to explain the amount of saved storage space – between 50.86% and 73.86%, if we disregard some special cases for which the reasons for a low compression ratio were discussed above – we're seeing in cases where SQL Server page compression is used. From the data collected and shown above, it would appear that the algorithms used in SQL Server page compression still leave enough redundant data of low entropy for the RACE (Random Access Compression Engine) algorithms at the SVC level to perform rather well.
In general, and with regard to performance in particular, we have – up to now – not noticed any negative side effects, like e.g. noticeable increases in I/O latency, from using RTC for the various SQL Server and SAP systems shown above.
The use of RTC not only promises, but quite matter-of-factly delivers a significant reduction of the storage space needed. It even deals very well with seemingly already compressed workloads, like e.g. SQL Server page compressed databases. It thus enables a delayed procurement of additional storage resources, lower operational costs (administration, rack space, power, cooling, etc.), a reduced amount of I/O to the backend storage systems and a more efficient use of tier-1 storage like e.g. flash based storage systems.
On the other hand, when used sensibly, it is no out-of-the-box, fire-and-forget solution either. The selected examples of “interesting” systems, which were shown and discussed above, illustrate very well that there are always use cases which require special attention. They also point out the increased necessity of good overall, interdisciplinary technical knowledge, or of a very good and open communication between the organisational units responsible for application, database, operating system, virtualization and storage administration.
Due to the rather old release level of our VMware environment we unfortunately weren't able to cover the interaction of TRIM, UNMAP, VAAI and RTC. It'd be very interesting to see if and how well those storage block reclamation technologies work together with the SVC's real-time compression feature.
Comments and own experiences are – as always – very welcome!
2015-02-03 // IBM Storwize V3700 out of Memory
Under certain conditions it is possible to inadvertently run into an out-of-memory situation on IBM Storwize V3700 systems, simply by running a “Download Support Package” procedure or the respective CLI command. This will – of course – bring all I/O on the affected system to a grinding halt.
A few days ago, the Nagios monitoring plugin introduced in “Nagios Monitoring - IBM SVC and Storwize” reported a failed PSU on one of our IBM Storwize V3700 systems. After raising a PMR with IBM in order to get the seemingly defective PSU replaced, i was told to simply reseat the PSU. According to IBM support this would usually fix this – apparently known – issue.
This was the first WTF moment and it turned out not to be the last one. So either IBM produces and sells subpar components – in this case the PSU – which need to be given a boot – yes, PSUs nowadays have their own firmware too – in order to be persuaded to cooperate again. Or it means IBM produces and sells subpar software which is not at all able to properly detect a component failure and distinguish between a faulty and a good PSU. Or perhaps it's an unfortunate combination of both.
In any case, the procedure to reseat the PSU was carried out, which fixed the PSU issue. During the course of the fix procedure the system would become unreachable via TCP/IP for a rather long time, though. Definitely over two minutes, but i haven't had a chance for an exact measurement. After the system was reachable again i followed up on this strange behaviour and had a look at the system's event log. There were quite a lot of “Error Code: 1370, Error Code Text: SCSI ERP occurred” messages, so i decided to bother the IBM support again and send them a support collection in order to get an analysis with regard to the reachability issue as well as the 1370 errors.
From previous occasions i knew that the IBM support would most likely request a support collection which was run with the “svc_livedump” CLI command or with the “Standard logs plus new statesaves” option from the WebUI. The latter one is marked red in the following screenshot example:
So i decided to pull a support collection with this option. Some time into the support collection process, the SVC sitting in front of the V3700 and other storage systems started to show very high latencies (~60 sec.) on the primary VDisks backed by the V3700. On other VDisks, which “only” had their secondary VDisk mirror located on MDiskGroups of the V3700, the latency peak was less dramatic, but still very noticeable. Eventually, degraded paths to the MDisks located on the V3700 started showing up on the SVC. After the support collection process finished, the situation went back to normal. The latency on both primary and secondary VDisks instantly dropped down to the usual values and, after running the fix procedures on the SVC, the degraded paths came back online.
The performance issues were in magnitude and duration severe enough to affect several applications pretty badly. Although the immediate issue was resolved, i still needed an analysis and written statement from IBM support for an action plan on how to prevent this kind of situation in the future and for compliance reasons as well. Here is the digest of what – according to IBM – happened:
During the procedure to reseat the reportedly defective PSU, one or more power surges occurred.
These power surges apparently caused issues on the internal disk buses, which led to the 1370 errors being logged.
The power surges or the resulting 1370 errors are probably the cause for a failover of the config node too. Hence the connectivity issues with the CLI and the WebUI via TCP/IP.
More 1370 errors were logged during the runtime of the subsequent “Standard logs plus new statesaves” support collection process.
The issue of very high latency and MDisk paths becoming degraded was caused by the “Standard logs plus new statesaves” support collection process using up all of the memory – yes, including the data cache – on the V3700 system. This behaviour is specific and limited to the V3700 systems with “only” 4GB of memory.
As one can imagine, the last item was my second WTF moment. Apparently there are no programmatical safeguards to prevent the support collection process at the “Standard logs plus new statesaves” level from using up all of the system's memory. This would normally not be that bad at all, if “Standard logs plus new statesaves” wasn't the particular level which IBM support would usually request on support cases concerning SVC and Storwize systems. On the phone the IBM support technician admitted that this common practice is in general probably a bad idea. But he also mentioned that up to now he hadn't heard of the known side effects actually occurring, like in this case.
The suggestion on how to prevent this kind of situation in the future was to either upgrade the V3700 systems from 4GB to 8GB memory – a solution i would gladly take provided it came free of charge – or to run the support collection process only with the “svc_snap” CLI command or the “Standard logs” option from the WebUI. Since the memory upgrade for free isn't likely to happen, i'll stick with the second suggestion for now.
Incidentally, a third option came up over the last weekend. Looking at the IBM System Storage SAN Volume Controller V7.3.0.9 Release Note, it could be construed that someone at IBM SVC and Storwize development came to the realization that the issue could also be addressed in software, by altering the resource utilization of the support collection process:
HU00636 Livedump prepare fails on V3500 & V3700 systems with 4GB memory when cache partition fullness is less than 35%
Fingers crossed, this fix really addresses and resolves the issue described above.
2014-12-24 // Nagios Monitoring - IBM SVC and Storwize (Update)
After upgrading our test IBM SAN Volume Controller (SVC) systems from version 7.3.0.7 to 7.3.0.8 or later, the previously described Nagios monitoring plugin (Nagios Monitoring - IBM SVC and Storwize) ceased to work. A quick check revealed that the “wbemcli” command line tool from the Standards Based Linux Instrumentation project, which is used in the Nagios plugin to query the CIMOM server on the SVC or Storwize systems, would fail with the following error message:

$ /opt/sblim-wbemcli/bin/wbemcli -dx -noverify ecn https://user:pass@svc-test:5989/root/ibm
*
* /opt/sblim-wbemcli/bin/wbemcli: Http Exception: SSL connect error
*
Re-checking the release notes, this sudden change in behaviour seemed to be explained by the fix:
SSL vulnerability CVE-2014-3566
Not really being that verbose a description, a quick look at CVE-2014-3566 showed that this is a fix for the “POODLE” issue. So IBM probably switched off the support for the SSLv3 protocol in the SVC and Storwize code. But why would this cause the “wbemcli” command line tool to fail? Here are the steps taken in an analysis of the issue:
First, i was trying to get the “wbemcli” command line tool to be a tad more verbose about what it is actually doing:

$ /opt/sblim-wbemcli/bin/wbemcli -dx -noverify ecn https://user:pass@svc-test:5989/root/ibm
To server: <?xml version="1.0" encoding="utf-8" ?>
<CIM CIMVERSION="2.0" DTDVERSION="2.0">
<MESSAGE ID="4711" PROTOCOLVERSION="1.0"><SIMPLEREQ><IMETHODCALL NAME="EnumerateClassNames"><LOCALNAMESPACEPATH><NAMESPACE NAME="root"></NAMESPACE><NAMESPACE NAME="ibm"></NAMESPACE></LOCALNAMESPACEPATH>
<IPARAMVALUE NAME="DeepInheritance"><VALUE>TRUE</VALUE></IPARAMVALUE>
</IMETHODCALL></SIMPLEREQ>
</MESSAGE></CIM>
*
* /opt/sblim-wbemcli/bin/wbemcli: Http Exception: SSL connect error
*
Not really an abundance of information in here either.
Getting the source code for and, while we're at it, updating the “wbemcli” command line tool from version 1.6.0 to 1.6.3. Spent some time looking through the source code and with the “gdb” debugger to get a feeling for the general program flow and the functions/methods being called. While looking through the source code i noticed that the cURL debugging options are being set if an environment variable named “CURLDEBUG” is set to “true”. Later also found this mentioned in the ChangeLog file:

$ CURLDEBUG=true /opt/sblim-wbemcli/bin/wbemcli -dx -noverify ecn https://user:pass@svc-test:5989/root/ibm
To server: <?xml version="1.0" encoding="utf-8" ?>
<CIM CIMVERSION="2.0" DTDVERSION="2.0">
<MESSAGE ID="4711" PROTOCOLVERSION="1.0"><SIMPLEREQ><IMETHODCALL NAME="EnumerateClassNames"><LOCALNAMESPACEPATH><NAMESPACE NAME="root"></NAMESPACE><NAMESPACE NAME="ibm"></NAMESPACE></LOCALNAMESPACEPATH>
<IPARAMVALUE NAME="DeepInheritance"><VALUE>TRUE</VALUE></IPARAMVALUE>
</IMETHODCALL></SIMPLEREQ>
</MESSAGE></CIM>
* About to connect() to svc-test port 5989 (#0)
*   Trying 192.168.x.x...
* connected
* Connected to svc-test (192.168.x.x) port 5989 (#0)
* successfully set certificate verify locations:
*   CAfile: none
  CApath: /etc/ssl/certs
* Unknown SSL protocol error in connection to svc-test:5989
* Closing connection #0
* SSL connect error
*
* /opt/sblim-wbemcli/bin/wbemcli: Http Exception: SSL connect error
*
Now we know that we're encountering an “Unknown SSL protocol error” – not that helpful either.

Searched the web for further information on how to debug the cURL library and found the debug.c example source code, which was very helpful. Incorporated it into the file “CimCurl.cpp”:

- sblim-wbemcli-1.6.3_debug.patch
--- sblim-wbemcli-1.6.3_orig/CimCurl.cpp	2013-09-21 01:26:32.000000000 +0200
+++ sblim-wbemcli-1.6.3_new/CimCurl.cpp	2014-11-26 16:30:19.000000000 +0100
@@ -37,6 +37,100 @@
 extern int waitTime;
 extern int expect100;
 
+// Trace Begin
+struct data {
+  char trace_ascii; /* 1 or 0 */
+};
+
+static
+void dump(const char *text,
+          FILE *stream, unsigned char *ptr, size_t size,
+          char nohex)
+{
+  size_t i;
+  size_t c;
+
+  unsigned int width=0x10;
+
+  if(nohex)
+    /* without the hex output, we can fit more on screen */
+    width = 0x40;
+
+  fprintf(stream, "%s, %010.10ld bytes (0x%08.8lx)\n",
+          text, (long)size, (long)size);
+
+  for(i=0; i<size; i+= width) {
+
+    fprintf(stream, "%04.4lx: ", (long)i);
+
+    if(!nohex) {
+      /* hex not disabled, show it */
+      for(c = 0; c < width; c++)
+        if(i+c < size)
+          fprintf(stream, "%02x ", ptr[i+c]);
+        else
+          fputs("   ", stream);
+    }
+
+    for(c = 0; (c < width) && (i+c < size); c++) {
+      /* check for 0D0A; if found, skip past and start a new line of output */
+      if (nohex && (i+c+1 < size) && ptr[i+c]==0x0D && ptr[i+c+1]==0x0A) {
+        i+=(c+2-width);
+        break;
+      }
+      fprintf(stream, "%c",
+              (ptr[i+c]>=0x20) && (ptr[i+c]<0x80)?ptr[i+c]:'.');
+      /* check again for 0D0A, to avoid an extra \n if it's at width */
+      if (nohex && (i+c+2 < size) && ptr[i+c+1]==0x0D && ptr[i+c+2]==0x0A) {
+        i+=(c+3-width);
+        break;
+      }
+    }
+    fputc('\n', stream); /* newline */
+  }
+  fflush(stream);
+}
+
+static
+int my_trace(CURL *handle, curl_infotype type,
+             char *data, size_t size,
+             void *userp)
+{
+  struct data *config = (struct data *)userp;
+  const char *text;
+  (void)handle; /* prevent compiler warning */
+
+  switch (type) {
+  case CURLINFO_TEXT:
+    fprintf(stderr, "== Info: %s", data);
+  default: /* in case a new one is introduced to shock us */
+    return 0;
+
+  case CURLINFO_HEADER_OUT:
+    text = "=> Send header";
+    break;
+  case CURLINFO_DATA_OUT:
+    text = "=> Send data";
+    break;
+  case CURLINFO_SSL_DATA_OUT:
+    text = "=> Send SSL data";
+    break;
+  case CURLINFO_HEADER_IN:
+    text = "<= Recv header";
+    break;
+  case CURLINFO_DATA_IN:
+    text = "<= Recv data";
+    break;
+  case CURLINFO_SSL_DATA_IN:
+    text = "<= Recv SSL data";
+    break;
+  }
+
+  dump(text, stderr, (unsigned char *)data, size, config->trace_ascii);
+  return 0;
+}
+// Trace End
+
 // These are the constant headers added to all requests
 static const char *headers[] = {
     "Content-Type: application/xml; charset=\"utf-8\"",
@@ -152,6 +246,11 @@
     CURLcode rv;
     string sb;
 
+// Trace Begin
+struct data config;
+config.trace_ascii = 1;
+// Trace End
+
     mUri = url.scheme + "://" + url.host + ":" + url.port + "/cimom";
 
     url.ns.toStringBuffer(sb,"%2F");
@@ -248,6 +347,11 @@
 
     rv = curl_easy_setopt(mHandle, CURLOPT_WRITEHEADER, &mErrorData);
     rv = curl_easy_setopt(mHandle, CURLOPT_HEADERFUNCTION, headerCb);
+
+// Trace Begin
+    rv = curl_easy_setopt(mHandle, CURLOPT_DEBUGFUNCTION, my_trace);
+    rv = curl_easy_setopt(mHandle, CURLOPT_DEBUGDATA, &config);
+// Trace End
 }
 
 static string getErrorMessage(CURLcode err)
Rebuilt the “wbemcli” command line tool and tried again:

$ CURLDEBUG=true /opt/sblim-wbemcli/bin/wbemcli -dx -noverify ecn https://user:pass@svc-test:5989/root/ibm
To server: <?xml version="1.0" encoding="utf-8" ?>
<CIM CIMVERSION="2.0" DTDVERSION="2.0">
<MESSAGE ID="4711" PROTOCOLVERSION="1.0"><SIMPLEREQ><IMETHODCALL NAME="EnumerateClassNames"><LOCALNAMESPACEPATH><NAMESPACE NAME="root"></NAMESPACE><NAMESPACE NAME="ibm"></NAMESPACE></LOCALNAMESPACEPATH>
<IPARAMVALUE NAME="DeepInheritance"><VALUE>TRUE</VALUE></IPARAMVALUE>
</IMETHODCALL></SIMPLEREQ>
</MESSAGE></CIM>
== Info: About to connect() to svc-test port 5989 (#0)
== Info:   Trying 192.168.x.x...
== Info: connected
== Info: Connected to svc-test (192.168.x.x) port 5989 (#0)
== Info: found 172 certificates in /etc/ssl/certs/ca-certificates.crt
== Info: gnutls_handshake() failed: A TLS packet with unexpected length was received.
== Info: Closing connection #0
== Info: SSL connect error
*
* /opt/sblim-wbemcli/bin/wbemcli: Http Exception: SSL connect error
*
Now we know there is an issue in the way the GnuTLS library used by libcURL interacts with the CIMOM server on the SVC or Storwize systems.
Again, searching the web for similar issues with the error message “gnutls_handshake() failed: A TLS packet with unexpected length was received.”, we find that a rebuild against the OpenSSL library instead of the GnuTLS library could solve this issue:

$ dpkg -l | grep curl
ii  curl                     7.26.0-1+wheezy11  powerpc  command line tool for transferring data with URL syntax
ii  libcurl3:powerpc         7.26.0-1+wheezy11  powerpc  easy-to-use client-side URL transfer library (OpenSSL flavour)
ii  libcurl3-gnutls:powerpc  7.26.0-1+wheezy11  powerpc  easy-to-use client-side URL transfer library (GnuTLS flavour)
ii  libcurl4-gnutls-dev      7.26.0-1+wheezy11  powerpc  development files and documentation for libcurl (GnuTLS flavour)
$ apt-get install libcurl4-openssl-dev
Reading package lists... Done
Building dependency tree
Reading state information... Done
Suggested packages:
  libcurl3-dbg
The following packages will be REMOVED:
  libcurl4-gnutls-dev
The following NEW packages will be installed:
  libcurl4-openssl-dev
0 upgraded, 1 newly installed, 1 to remove and 0 not upgraded.
Need to get 0 B/1,259 kB of archives.
After this operation, 28.7 kB of additional disk space will be used.
Do you want to continue [Y/n]? y
(Reading database ... 65717 files and directories currently installed.)
Removing libcurl4-gnutls-dev ...
Processing triggers for man-db ...
Selecting previously unselected package libcurl4-openssl-dev.
(Reading database ... 65463 files and directories currently installed.)
Unpacking libcurl4-openssl-dev (from .../libcurl4-openssl-dev_7.26.0-1+wheezy11_powerpc.deb) ...
Processing triggers for man-db ...
Setting up libcurl4-openssl-dev (7.26.0-1+wheezy11) ...
$ dpkg -l | grep curl
ii  curl                     7.26.0-1+wheezy11  powerpc  command line tool for transferring data with URL syntax
ii  libcurl3:powerpc         7.26.0-1+wheezy11  powerpc  easy-to-use client-side URL transfer library (OpenSSL flavour)
ii  libcurl3-gnutls:powerpc  7.26.0-1+wheezy11  powerpc  easy-to-use client-side URL transfer library (GnuTLS flavour)
ii  libcurl4-openssl-dev     7.26.0-1+wheezy11  powerpc  development files and documentation for libcurl (OpenSSL flavour)
Rebuilt the “wbemcli” command line tool and tried again:

$ CURLDEBUG=true /opt/sblim-wbemcli/bin/wbemcli -dx -noverify ecn https://user:pass@svc-test:5989/root/ibm
To server: <?xml version="1.0" encoding="utf-8" ?>
<CIM CIMVERSION="2.0" DTDVERSION="2.0">
<MESSAGE ID="4711" PROTOCOLVERSION="1.0"><SIMPLEREQ><IMETHODCALL NAME="EnumerateClassNames"><LOCALNAMESPACEPATH><NAMESPACE NAME="root"></NAMESPACE><NAMESPACE NAME="ibm"></NAMESPACE></LOCALNAMESPACEPATH>
<IPARAMVALUE NAME="DeepInheritance"><VALUE>TRUE</VALUE></IPARAMVALUE>
</IMETHODCALL></SIMPLEREQ>
</MESSAGE></CIM>
== Info: About to connect() to svc-test port 5989 (#0)
== Info:   Trying 192.168.x.x...
== Info: connected
== Info: Connected to svc-test (192.168.x.x) port 5989 (#0)
== Info: successfully set certificate verify locations:
== Info:   CAfile: none
  CApath: /etc/ssl/certs
== Info: SSLv3, TLS handshake, Client hello (1):
=> Send SSL data, 0000000134 bytes (0x00000086)
0000: ......T.."\.~....`...K4.......p..7"&4...Z.....9.8.........5.....
0040: ................3.2.....E.D...../...A...........................
0080: ......
== Info: Unknown SSL protocol error in connection to svc-test:5989
== Info: Closing connection #0
== Info: SSL connect error
*
* /opt/sblim-wbemcli/bin/wbemcli: Http Exception: SSL connect error
*
Now we know that the “wbemcli” command line tool is actually – as already suspected – trying to initiate an SSLv3 connection.

In order to confirm we're on the right track, first verify manually that we're unable to connect with an SSLv3 secured connection:

$ openssl s_client -host svc-test -port 5989 -ssl3
CONNECTED(00000003)
write:errno=104
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 0 bytes and written 0 bytes
---
New, (NONE), Cipher is (NONE)
Secure Renegotiation IS NOT supported
Compression: NONE
Expansion: NONE
SSL-Session:
    Protocol  : SSLv3
    Cipher    : 0000
    Session-ID:
    Session-ID-ctx:
    Master-Key:
    Key-Arg   : None
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    Start Time: 1418397594
    Timeout   : 7200 (sec)
    Verify return code: 0 (ok)
---
quit
And that we're instead able to connect with a TLS secured connection:
$ openssl s_client -host svc-test -port 5989
CONNECTED(00000003)
depth=0 C = GB, L = Hursley, O = IBM, OU = SSG, CN = 2145, emailAddress = support@ibm.com
verify error:num=18:self signed certificate
verify return:1
depth=0 C = GB, L = Hursley, O = IBM, OU = SSG, CN = 2145, emailAddress = support@ibm.com
verify return:1
---
Certificate chain
 0 s:/C=GB/L=Hursley/O=IBM/OU=SSG/CN=2145/emailAddress=support@ibm.com
   i:/C=GB/L=Hursley/O=IBM/OU=SSG/CN=2145/emailAddress=support@ibm.com
---
Server certificate
-----BEGIN CERTIFICATE-----
MIICyDCCAjGgAwIBAgIEUAPzmTANBgkqhkiG9w0BAQUFADBqMQswCQYDVQQGEwJH
QjEQMA4GA1UEBxMHSHVyc2xleTEMMAoGA1UEChMDSUJNMQwwCgYDVQQLEwNTU0cx
DTALBgNVBAMTBDIxNDUxHjAcBgkqhkiG9w0BCQEWD3N1cHBvcnRAaWJtLmNvbTAe
Fw0xMjA3MTYxMDU3MjlaFw0yNzA3MTMxMDU3MjlaMGoxCzAJBgNVBAYTAkdCMRAw
DgYDVQQHEwdIdXJzbGV5MQwwCgYDVQQKEwNJQk0xDDAKBgNVBAsTA1NTRzENMAsG
A1UEAxMEMjE0NTEeMBwGCSqGSIb3DQEJARYPc3VwcG9ydEBpYm0uY29tMIGfMA0G
CSqGSIb3DQEBAQUAA4GNADCBiQKBgQC3E7+7mE2GAID/35o5/s7cnzoqu9PQdOGB
ryGMa8adD4Wd9hpmTkrsgyNvkUB6sPIifbFstGooOkQtK9ZNgP5OHOorZmqINSxM
9goCkSCQG9xRKAvNt2tA8gujaV+p42oVEhIH6naJUul96qZI31y3GffUu2CRrJL7
4wG/8cv0BQIDAQABo3sweTAJBgNVHRMEAjAAMCwGCWCGSAGG+EIBDQQfFh1PcGVu
U1NMIEdlbmVyYXRlZCBDZXJ0aWZpY2F0ZTAdBgNVHQ4EFgQUkPMkXUjn0YHlfQW8
TJiRC5jWQO4wHwYDVR0jBBgwFoAUkPMkXUjn0YHlfQW8TJiRC5jWQO4wDQYJKoZI
hvcNAQEFBQADgYEAKqu7KpVxnOXonQE3unC1O7qUHKoyQUEWqcKsM/4tPI+lsBMZ
jvoPwn8yQRWiLehFmVc8VSZfdFPLzshNabXp5qbZo/EFberXrgI2CbtPiULYyyyH
DUhWF+vhwb6uqwfBbGncvTvI2ewU8+0oTXsuTkSjumJ7+chpaHFWWyj2cJA=
-----END CERTIFICATE-----
subject=/C=GB/L=Hursley/O=IBM/OU=SSG/CN=2145/emailAddress=support@ibm.com
issuer=/C=GB/L=Hursley/O=IBM/OU=SSG/CN=2145/emailAddress=support@ibm.com
---
No client certificate CA names sent
---
SSL handshake has read 1029 bytes and written 498 bytes
---
New, TLSv1/SSLv3, Cipher is AES256-GCM-SHA384
Server public key is 1024 bit
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
SSL-Session:
    Protocol  : TLSv1.2
    Cipher    : AES256-GCM-SHA384
    Session-ID: 48291D368E0A8584A8DFA00A9881B8979BDE370FC6C9439294C670695D031239
    Session-ID-ctx:
    Master-Key: 71BD12A161FC595CD056DA8E6D6E27420F37468E47498B7591A403A86844C55F61FF02B2FEC7739FAAEDCE3DFEA0F217
    Key-Arg   : None
    PSK identity: None
    PSK identity hint: None
    SRP username: None
    TLS session ticket lifetime hint: 300 (seconds)
    TLS session ticket:
    0000 - a0 e0 b1 9b c2 37 9a ca-49 1c 54 f5 26 4b d6 24   .....7..I.T.&K.$
    0010 - af 6a 7d cc 5e 4a 97 a8-b3 6d b7 66 0b b7 0a 65   .j}.^J...m.f...e
    0020 - 47 af ef 47 76 fc c7 e9-38 ff 84 28 ca 8e 73 25   G..Gv...8..(..s%
    0030 - 47 25 f6 0d 36 01 04 f1-f9 f7 0c b6 42 ef cf 09   G%..6.......B...
    0040 - 64 8f df ff 89 38 ed 7c-ae 1d 0e 25 d1 c1 77 86   d....8.|...%..w.
    0050 - b6 61 88 15 cf fe 9f 20-86 0d 17 74 18 da ea c0   .a..... ...t....
    0060 - 33 3a 47 f5 f9 51 24 ae-48 37 8a 3f 19 dd c6 04   3:G..Q$.H7.?....
    0070 - 7e d1 20 78 35 99 0b 9f-3b 1f ce 7c bc 11 93 e4   ~. x5...;..|....
    0080 - 0f 94 de 94 f1 0d 0c da-64 ca 0d f6 10 2a c8 fa   ........d....*..
    0090 - dc 3e e4 1a 97 d1 34 7a-9c f5 c3 00 e8 1b 10 d7   .>....4z........

    Start Time: 1418397614
    Timeout   : 300 (sec)
    Verify return code: 18 (self signed certificate)
---
quit
The secured connection negotiated to TLS succeeded, so we're on the right track!
Now we need to find the spot in the source code where the “wbemcli” command line tool is forced to initiate an SSLv3 connection. We know this is probably done in a cURL related function call, since libcURL is used for the network connection. So let's first look through the code for all the lines showing any sign of SSL related operations:

$ grep -n curl CimCurl.cpp | grep -i ssl
185:    // Assume we support SSL if we don't have the curl_version_info API
276:    rv = curl_easy_setopt(mHandle, CURLOPT_SSL_VERIFYHOST, 0);
277: // rv = curl_easy_setopt(mHandle, CURLOPT_SSL_VERIFYPEER, 0);
280:    rv = curl_easy_setopt(mHandle, CURLOPT_SSLVERSION, 3);
441:    if ((rv=curl_easy_setopt(mHandle,CURLOPT_SSL_VERIFYPEER,0))) {
448:    if ((rv=curl_easy_setopt(mHandle,CURLOPT_SSL_VERIFYPEER,1))) {
466:    if ((rv=curl_easy_setopt(mHandle,CURLOPT_SSLCERT,certificate))) {
470:    if ((rv=curl_easy_setopt(mHandle,CURLOPT_SSLKEY,key))) {
Looks like a perfect match in line 280 of the file “CimCurl.cpp”. The line numbers are a bit off from the original source code, since the file “CimCurl.cpp” was patched with our above debugging code. The code in the original, unpatched source file “CimCurl.cpp”, within the function “CimomCurl::genRequest”, looks like this:

- CimCurl.cpp

175 [...]
176     /* Disable SSL host verification */
177     rv = curl_easy_setopt(mHandle, CURLOPT_SSL_VERIFYHOST, 0);
178 //  rv = curl_easy_setopt(mHandle, CURLOPT_SSL_VERIFYPEER, 0);
179
180     /* Force using SSL V3 */
181     rv = curl_easy_setopt(mHandle, CURLOPT_SSLVERSION, 3);
182
183     /* Set username and password */
184     if (url.user.length() > 0 && url.password.length() > 0) {
185         mUserPass = url.user + ":" + url.password;
186         rv = curl_easy_setopt(mHandle, CURLOPT_USERPWD, mUserPass.c_str());
187     }
188 [...]
In line 181 the cURL option “CURLOPT_SSLVERSION” is indiscriminately set to use SSLv3 and nothing else, which we know from the above deduction is bound to fail on systems addressing the “POODLE” issue.

With the knowledge of where the issue is actually caused, an easy quick'n'dirty fix can be implemented:
- sblim-wbemcli-1.6.3_debug.patch
--- sblim-wbemcli-1.6.3_orig/CimCurl.cpp	2013-09-21 01:26:32.000000000 +0200
+++ sblim-wbemcli-1.6.3/CimCurl.cpp	2014-11-26 16:46:09.000000000 +0100
@@ -178,7 +178,7 @@
 // rv = curl_easy_setopt(mHandle, CURLOPT_SSL_VERIFYPEER, 0);
 
     /* Force using SSL V3 */
-    rv = curl_easy_setopt(mHandle, CURLOPT_SSLVERSION, 3);
+    //rv = curl_easy_setopt(mHandle, CURLOPT_SSLVERSION, 3);
 
     /* Set username and password */
     if (url.user.length() > 0 && url.password.length() > 0) {
Inserting the comment at the line where the cURL option “CURLOPT_SSLVERSION” is forced to SSLv3 causes libcURL to fall back to its default value, which is now TLS.

Rebuilt the “wbemcli” command line tool and tried again:
$ CURLDEBUG=true /opt/sblim-wbemcli/bin/wbemcli -dx -noverify ecn https://user:pass@svc-test:5989/root/ibm
To server: <?xml version="1.0" encoding="utf-8" ?>
<CIM CIMVERSION="2.0" DTDVERSION="2.0">
<MESSAGE ID="4711" PROTOCOLVERSION="1.0"><SIMPLEREQ><IMETHODCALL NAME="EnumerateClassNames"><LOCALNAMESPACEPATH><NAMESPACE NAME="root"></NAMESPACE><NAMESPACE NAME="ibm"></NAMESPACE></LOCALNAMESPACEPATH>
<IPARAMVALUE NAME="DeepInheritance"><VALUE>TRUE</VALUE></IPARAMVALUE>
</IMETHODCALL></SIMPLEREQ>
</MESSAGE></CIM>
* About to connect() to svc-test port 5989 (#0)
*   Trying 192.168.x.x...
* connected
* Connected to svc-test (192.168.x.x) port 5989 (#0)
* found 172 certificates in /etc/ssl/certs/ca-certificates.crt
* server certificate verification SKIPPED
* common name: 2145 (does not match 'svc-test')
* server certificate expiration date OK
* server certificate activation date OK
* certificate public key: RSA
* certificate version: #3
* subject: C=GB,L=Hursley,O=IBM,OU=SSG,CN=2145,EMAIL=support@ibm.com
* start date: Mon, 16 Jul 2012 10:57:29 GMT
* expire date: Tue, 13 Jul 2027 10:57:29 GMT
* issuer: C=GB,L=Hursley,O=IBM,OU=SSG,CN=2145,EMAIL=support@ibm.com
* compression: NULL
* cipher: AES-128-CBC
* MAC: SHA1
* Server auth using Basic with user 'user'
> POST /cimom HTTP/1.1
Authorization: Basic enp6bmFnaW9zOm5hZ2lvcw==
Host: svc-test:5989
Content-Type: application/xml; charset="utf-8"
Connection: Keep-Alive, TE
CIMProtocolVersion: 1.0
CIMOperation: MethodCall
CIMMethod: EnumerateClassNames
CIMObject: root%2Fibm
Content-Length: 396

* upload completely sent off: 396 out of 396 bytes
* additional stuff not fine transfer.c:1037: 0 0
* HTTP 1.1 or later with persistent connection, pipelining supported
< HTTP/1.1 200 OK
< Content-Type: application/xml; charset="utf-8"
From server: Content-Type: application/xml; charset="utf-8"
< content-length: 0000072284
From server: content-length: 0000072284
< CIMOperation: MethodResponse
From server: CIMOperation: MethodResponse
<
* Connection #0 to host svc-test left intact
From server: <?xml version="1.0" encoding="utf-8" ?>
<CIM CIMVERSION="2.0" DTDVERSION="2.0">
<MESSAGE ID="4711" PROTOCOLVERSION="1.0">
<SIMPLERSP>
<IMETHODRESPONSE NAME="EnumerateClassNames">
<IRETURNVALUE>
<CLASSNAME NAME="CIM_ConcreteIdentity"/>
<CLASSNAME NAME="CIM_NetworkPacketAction"/>
<CLASSNAME NAME="CIM_CollectionInSystem"/>
[...]

$ /opt/sblim-wbemcli/bin/wbemcli -noverify ecn https://user:pass@svc-test:5989/root/ibm
svc-test:5989/root/ibm:CIM_ConcreteIdentity
svc-test:5989/root/ibm:CIM_NetworkPacketAction
svc-test:5989/root/ibm:CIM_CollectionInSystem
svc-test:5989/root/ibm:CIM_DeviceSAPImplementation
svc-test:5989/root/ibm:CIM_ProtocolControllerAccessesUnit
svc-test:5989/root/ibm:CIM_ControlledBy
[...]
Great, now we're finally able to query and monitor the IBM SVC or Storwize systems again with the Nagios monitoring plugin (Nagios Monitoring - IBM SVC and Storwize)!
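As an aside, instead of commenting the line out entirely, the call could also be pinned explicitly to TLS. A sketch using the symbolic libcURL constant instead of a magic number:

    /* Use TLS 1.x instead of the hardcoded SSLv3 */
    rv = curl_easy_setopt(mHandle, CURLOPT_SSLVERSION, CURL_SSLVERSION_TLSv1);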
Between first noticing and researching the issue and creating the quick'n'dirty fix shown above, an official bug report was filed on the issue and a patch has already been submitted to the source code repository in order to address the issue more thoroughly. Hopefully an updated official source code package will be released soon.
Nonetheless, the process of researching and debugging this issue was an excellent hands-on exercise for me, which i enjoyed very much. Hopefully the steps taken and described here will turn out to be of use for others as well.
2014-11-23 // IBM SVC and experiences with Real-time Compression
We've been using the IBM SAN Volume Controller (SVC) for several years now (since about 2006). While the technological progress was a tad slow in the early years, development – probably along with popularity and more widespread use – of the SVC fortunately picked up considerable pace in recent years. For us the most interesting new features that emerged over time are the:
stretched cluster configuration, which seems to be far more popular in Europe with its traditionally shorter distances between datacenters. It allows us to provide active-active access to host LUNs, thus considerably simplifying the setup of a resilient VMware environment.
synchronous VDisk mirroring, which allows us to provide mirrored LUNs that are independent of the storage vendor in the backend.
EasyTier, which allows us to utilize our TMS RamSan (now IBM Flash System) flash storage much more efficiently.
Real-time compression (RTC), which as well allows us to utilize our overall backend storage much more efficiently.
We also went through several cycles of node hardware, starting with our initial 2145-4F2 via the 2145-8F2 and the 2145-8G4 to our current 2145-CG8. Currently our overall environment looks like this:
Six 2145-CG8 nodes in three I/O groups.
A stretched cluster configuration covering two datacenters which are within a distance of about 300 meters of single-mode fibre length. A quorum storage on a third location on the same campus.
Dual fabric 8GB FC SAN consisting of four Brocade DCX-4S directors.
A total of six disk based storage systems from different vendors (HDS AMS2300, Fujitsu DX90, IBM V3700), one of each vendor in each datacenter, with a total capacity of about 260TB. The systems are typically sized and configured for multiple 4+1 or 8+1 RAID-5 arrays. The arrays are subsequently mapped entirely as LUNs to the SVC. All the LUNs from one disk storage system are pooled together into one MDiskGroup, forming a single, striped failure group.
A total of six flash based storage systems (TMS RamSan-630, TMS RamSan-810, IBM Flashsystem 820), one of each model in each datacenter, with a total capacity of about 34TB. The systems' LUNs are also mapped to the SVC and all the LUNs from one flash storage are added to one disk based MDiskGroup for use as an SSD tier with EasyTier.
65 host systems (IBM xSeries, IBM BladeCenter and IBM Power), running either non-virtualized or virtualized (VMware, Xen, PowerVM) workloads.
291 VDisks, ranging from sizes of 1GB up to 4TB. Almost all VDisks are mirrored, exceptions are made where the application on the host prefers to do the mirroring or the replication itself (e.g. MS Exchange).
For about seven months – since the end of April 2014 – we've now been using Real-time compression (RTC). We started by implementing RPQ 8S1296 on all of our nodes. This added an additional 6-core CPU and 24GB memory to each node. As a result, the “normal” SVC operation still uses four of the CPU-cores on the initially installed CPU, while eight CPU-cores – two on the initially and six on the additionally installed CPU – are dedicated to the RTC algorithm. The SVC code version was v7.1.0.3 when we initially started testing and implementing RTC (see the screenshots below). After gradually converting more and more VDisks to compressed volumes, the CPU load of the RTC algorithm became quite noticeable (30-60%). Along with that, a substantially increased latency (>10ms) could be observed. After updating the SVC code to v7.2.0.7, the CPU utilization dropped noticeably to 10% and below. The latency also went back to the values (<2ms) usually observed before implementing RTC. There must have been quite an improvement in the RTC code; unfortunately IBM does not publish any such interesting details.
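For reference, converting an existing VDisk is done copy by copy on the SVC CLI. A minimal sketch – the pool, VDisk name and copy id are placeholders and the syntax may differ slightly between SVC code levels:

$ # add a new, compressed and thin-provisioned copy in the target MDiskGroup
$ ssh admin@svc-cluster "svctask addvdiskcopy -mdiskgrp SAS-POOL-1 -rsize 2% -autoexpand -compressed MYVDISK"
$ # wait until the new copy is fully synchronized
$ ssh admin@svc-cluster "svcinfo lsvdisksyncprogress MYVDISK"
$ # finally remove the original, uncompressed copy (here: copy id 0)
$ ssh admin@svc-cluster "svctask rmvdiskcopy -copy 0 MYVDISK"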
Over the course of implementing RTC, i took four series of screenshots from the SVC WebUI showing the compression allocation view at that particular point in time. They have been combined into the four following images showing all information at once, with the last one being a representation of the current situation:
One thing about the visual representation in the compression allocation view and the volume of RTC licensed storage – which confused me too at first – has to be noted. It seems that, with regard to RTC licensing, only the storage space allocated to primary VDisk volumes, i.e. the unmirrored storage capacity, is considered. From the information available on the matter at the time, both the corresponding sections from the “IBM EMEA Software Announcement ZP12-0235”:
The license entitles users for the quantity of terabytes of SVC volumes created with real-time compression enabled. The volume size (and not necessarily the amount of data you are able to store on that volume compressed) is the measure used to determine how many terabytes of 5641-CP1 to license.
and from the “Real-time Compression in SAN Volume Controller and Storwize V7000” redpaper:
In SAN Volume Controller, real-time compression is licensed by the total number of terabytes of virtual capacity, which is the amount of storage that the host sees.
are a bit ambiguous. At first the emphasis seems to be on the matter of uncompressed vs. compressed storage size. But on second thought, the key words “[…] the amount of storage that the host sees.” from the second quote come into play. Those basically say that enabling RTC on secondary VDisk copies is free, since a host only sees the unmirrored amount of storage.
Example: In our setup each VDisk compressed with RTC also has a secondary copy, likewise compressed with RTC. There is currently only one exception in the form of a flash copy volume of 2750GB which is compressed, but not mirrored. The currently compressed virtual capacity is 62.67TB. The currently used RTC license capacity is reported as 32.68TB. The calculation is: ( 62.67TB - 2750GB ) / 2 + 2750GB = 32.68TB. Makes sense, doesn't it?
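The arithmetic can be verified with bc – the 2750GB are converted to TB on a binary base, and bc truncates at the given scale:

$ echo "scale=4; ( 62.67 - 2750/1024 ) / 2 + 2750/1024" | bc
32.6777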
For the four cases shown in the images above, the average RTC+TP and RTC-only compression ratio, relative storage size after compression and relative storage saving have been calculated and are shown in the following table:
Sample | RTC+TP Ratio | RTC+TP Relative Size | RTC+TP Relative Saving | RTC-only Ratio | RTC-only Relative Size | RTC-only Relative Saving |
---|---|---|---|---|---|---|
1 | 1:3.75 | 26.69% | 73.31% | 1:3.06 | 32.65% | 67.35% |
2 | 1:3.63 | 27.51% | 72.49% | 1:3.02 | 33.07% | 66.92% |
3 | 1:3.5 | 28.58% | 71.42% | 1:2.80 | 35.65% | 64.35% |
4 | 1:3.2 | 31.25% | 68.75% | 1:2.60 | 38.53% | 61.47% |
Although thin provisioning (TP) alone would amount to 16-20% storage saving on the overall average, the larger and quite substantial effect of 64-67% storage saving on the overall average can only be achieved with RTC. Needless to say, we're currently only using RTC on VDisks that promise to yield rather good compression results. This is done in order to save valuable licensed compression volume and to minimize the number of CPU cycles wasted on uncompressible data. For guidelines please see the recommendations in the redbook referenced below in the “Links & Resources” section. The VDisks have either been chosen by estimating the compressibility with the comprestimator utility or by knowing the application data very well. VDisks which contain data with an inherently high entropy, like e.g. encrypted and compressed MS Exchange DAGs, repositories of already compressed installation media, compressed Windows system images or compressed AIX mksysb images, page compressed MS SQL data files or fileserver volumes with user data (docx, xlsx, pptx, zip, jpeg, mpeg, etc.), have been excluded from RTC for now. We're still investigating which way to ultimately go on the topic of MS SQL page compression vs. IBM SVC RTC. We're also still sorting out the “mess” of compressible and uncompressible data being mixed together within the same VMware datastores before converting some of them into RTC enabled VDisks. We'll probably end up with a mix of RTC enabled VDisks backing VMware datastores containing compression-friendly VMs and regular VDisks backing datastores containing high-entropy, less compressible VMs. It'll be interesting to see what the ratio between the two will be and which VM will fall into which category.
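The comprestimator estimates mentioned above were gathered with IBM's host-based command line utility, which samples a block device read-only and extrapolates the expected compression savings. A sketch of an invocation on a Linux host – the binary location, device name and options are placeholders and vary between utility versions and platforms; consult the README shipped with the utility:

$ ./comprestimator -d /dev/mapper/mpatha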
A detailed compilation of the storage saving with RTC on a per VDisk basis is shown in the section Detailed per VDisk RTC data. The columns are pretty self-explanatory, with the most interesting columns being “Application Type”, “Size (%)” and “Saving (%)”. To get a better overview, a condensed view aggregated by “Application Type” was compiled. The following table shows the values aggregated by “Application Type”. The columns “Avg. Size (GB)” and “Avg. Compressed Size (GB)” show the arithmetic average for the VDisk size and compressed size. To be able to compare the different application types against each other, relative arithmetic averages were calculated (columns “Avg. Size (%)” and “Avg. Saving (%)”). The columns “Min./Max. Size/Saving (%)” and “Samples” are provided to give an impression of the variability and the significance of the arithmetic averages.
Application Type | Avg. Size (GB) | Avg. Compressed Size (GB) | Min. Size (%) | Avg. Size (%) | Max. Size (%) | Min. Saving (%) | Avg. Saving (%) | Max. Saving (%) | Samples |
---|---|---|---|---|---|---|---|---|---|
AIX rootvg | 16.61 | 5.53 | 9.38 | 33.29 | 73.44 | 9.38 | 66.71 | 90.62 | 198 |
Apache & MySQL | 19.56 | 10.58 | 17.19 | 54.12 | 84.38 | 15.62 | 45.88 | 82.81 | 18 |
Code Server | 200.00 | 64.50 | 32.25 | 32.25 | 32.25 | 67.75 | 67.75 | 67.75 | 2 |
Fileserver | 350.00 | 142.63 | 40.71 | 40.75 | 40.79 | 59.21 | 59.25 | 59.29 | 2 |
Linux Package | 250.00 | 152.75 | 61.10 | 61.10 | 61.10 | 38.90 | 38.90 | 38.90 | 2 |
MS SQL | 120.00 | 18.50 | 15.42 | 15.42 | 15.42 | 84.58 | 84.58 | 84.58 | 2 |
Database | 346.40 | 86.56 | 5.38 | 24.99 | 59.17 | 40.83 | 75.01 | 94.62 | 80 |
ERP | 426.95 | 141.02 | 21.25 | 33.03 | 48.05 | 51.95 | 66.97 | 78.75 | 59 |
TSM OpCenter | 26.00 | 9.25 | 35.58 | 35.58 | 35.58 | 64.42 | 64.42 | 64.42 | 2 |
VMware ESX | 1756.00 | 798.84 | 28.34 | 45.49 | 53.73 | 46.27 | 54.51 | 71.66 | 8 |
Windows Package | 300.00 | 185.13 | 61.67 | 61.71 | 61.75 | 38.25 | 38.29 | 38.33 | 2 |
Xen Server | 512.00 | 184.64 | 11.87 | 36.06 | 47.75 | 52.25 | 63.94 | 88.13 | 18 |
A graphical representation of the above aggregated data is shown in the following graph:
The values “Avg. Size (%)” and “Avg. Saving (%)” are shown as colored bars, the error bars are their respective relative minimum and maximum values, the numbers at the bottom of the colored bars are the number of samples per application type.
The “AIX rootvg” and “Database” application types show rather good compression results, but also a high range of variability. In the case of “AIX rootvg” the compression results depend mainly on the amount of paging space being used, since on all systems the paging space is part of the rootvg. Systems with a very low compressed VDisk size also show no or almost no paging space usage. In the case of “Database” the compression results obviously depend on the data stored within the database tablespaces; unfortunately I have no further information on what kind of data is stored in the less compressible cases. In the most compressible cases (5.38% and 7.32% of the original VDisk size) though, the databases contain large amounts of text data and fulltext search indexes, which explains the extraordinary compression results. The “ERP” application type shows equally good compression results, but with a much lower variability, which seems to be largely determined by the current amount of free, unused database space.

The sample from the “MS SQL” application type is unfortunately the only VDisk directly allocated to a MS SQL Server instance, so it's not really representative. All other MS SQL Server instances in our environment run on VMware, and the VDisks there are currently not RTC enabled due to the application and use case mixture described above.

The “Code Server” and “Fileserver” application types show good compression results, considering the data usually associated with those. The low sample number indicates that they have specifically been chosen as candidates for RTC enabled VDisks. Especially the other available fileserver VDisks showed much worse results in the preliminary comprestimator runs; only the fileserver VDisk containing the Windows servers' saved user profiles showed good enough results to justify enabling RTC.

The “Linux Package” and “Windows Package” application types show the expected worst compression results, since the data on those VDisks is already stored in compressed formats. Even so, a compression saving of approximately 40% on this kind of data is still pretty good. The “Apache & MySQL” application type is, by its compression behaviour, kind of a mixture between the “AIX rootvg” and “Fileserver” application types, since it has the same paging space configuration and has large amounts of already compressed data (bz2, gz, docx, xlsx, pptx, zip, jpeg, mpeg, etc.) stored on the VDisks. The “VMware ESX” and “Xen Server” application types show only average compression results, considering that those are run by third parties, where currently no distinction between and separation of compression-friendly and compression-unfriendly VM data is being made.
Overall, we're quite happy with the SVC's RTC feature so far, since it gives us some breathing room and, on the whole, takes off the pressure to evaluate, purchase, implement and maintain additional storage units for the foreseeable future. On the downside, there is a pretty hefty price tag on the RTC volume licenses, especially compared to other vendors (e.g. Pure Storage) where the compression functionality is part of the system's base package. One is advised to do a thorough TCO calculation to determine whether the investment in SVC RTC licenses is actually outweighed by the savings over operating storage systems with the equivalent amount of real, allocatable space. In the end this boils down to a bet on the current and future compressibility of the data actually stored. Besides the licensing terms, another issue for us was that we unfortunately pretty quickly reached the limit for the allowed number of compressed VDisks per I/O group on our current 2145-CG8 hardware (max. 200 RTC enabled VDisks). This limit is put in place by SVC development as a safeguard, supposedly in order to keep the nodes from being overloaded with RTC workload. While understandable from a technical point of view, this limit can be reached pretty quickly if you're, for example, using a lot of RTC enabled rootvgs or if you're following the SAP or Oracle disk layout guidelines pretty closely. For the rootvgs, and possibly for test and development SAP or Oracle systems, a consolidation of the system and data disks into fewer VDisks using AIX/VIOS shared storage pools could be an option. But then again, the same downsides as for e.g. VMware datastores apply to shared storage pools as well.
Links & Resources
Real-time Compression in SAN Volume Controller and Storwize V7000
Implementing IBM Real-time Compression in SAN Volume Controller and IBM Storwize V7000
IBM System Storage SAN Volume Controller software V6.4.0 improves storage efficiency - IBM Europe, Middle East, and Africa Software Announcement ZP12-0235 - June 4, 2012
Detailed per VDisk RTC data
Detailed compilation of the storage saving with RTC on a per VDisk basis. If JavaScript is enabled in your browser, click on a column's header to sort by that particular column.
Application Type | VDisk Name | Type | Size (GB) | Compressed Size (GB) | Size (%) | Saving (%) | Ext. HDD | Ext. SSD |
---|---|---|---|---|---|---|---|---|
AIX rootvg | sas-8G-197 | primary | 8.00 | 4.00 | 50.00 | 50.00 | 16 | 0 |
AIX rootvg | sas-8G-197 | secondary | 8.00 | 4.00 | 50.00 | 50.00 | 16 | 0 |
AIX rootvg | sas-24G-155 | primary | 24.00 | 8.50 | 35.42 | 64.58 | 34 | 0 |
AIX rootvg | sas-24G-155 | secondary | 24.00 | 8.50 | 35.42 | 64.58 | 34 | 0 |
AIX rootvg | sas-16G-209 | primary | 16.00 | 4.00 | 25.00 | 75.00 | 16 | 0 |
AIX rootvg | sas-16G-209 | secondary | 16.00 | 4.00 | 25.00 | 75.00 | 16 | 0 |
AIX rootvg | sas-16G-2CA | primary | 16.00 | 2.50 | 15.62 | 84.38 | 10 | 0 |
AIX rootvg | sas-16G-2CA | secondary | 16.00 | 2.50 | 15.62 | 84.38 | 10 | 0 |
AIX rootvg | sas-16G-2C8 | primary | 16.00 | 2.75 | 17.19 | 82.81 | 11 | 0 |
AIX rootvg | sas-16G-2C8 | secondary | 16.00 | 2.75 | 17.19 | 82.81 | 11 | 0 |
AIX rootvg | sas-16G-2C9 | primary | 16.00 | 2.50 | 15.62 | 84.38 | 10 | 0 |
AIX rootvg | sas-16G-2C9 | secondary | 16.00 | 2.50 | 15.62 | 84.38 | 10 | 0 |
AIX rootvg | sas-16G-23E | primary | 16.00 | 2.50 | 15.62 | 84.38 | 10 | 0 |
AIX rootvg | sas-16G-23E | secondary | 16.00 | 2.50 | 15.62 | 84.38 | 10 | 0 |
AIX rootvg | sas-16G-2B5 | primary | 16.00 | 1.50 | 9.38 | 90.62 | 6 | 0 |
AIX rootvg | sas-16G-2B5 | secondary | 16.00 | 1.50 | 9.38 | 90.62 | 6 | 0 |
AIX rootvg | sas-16G-307 | primary | 16.00 | 11.75 | 73.44 | 26.56 | 47 | 0 |
AIX rootvg | sas-16G-307 | secondary | 16.00 | 11.75 | 73.44 | 26.56 | 47 | 0 |
AIX rootvg | sas-16G-2AA | primary | 16.00 | 4.50 | 28.12 | 71.88 | 18 | 0 |
AIX rootvg | sas-16G-2AA | secondary | 16.00 | 4.50 | 28.12 | 71.88 | 18 | 0 |
AIX rootvg | sas-8G-19A | primary | 8.00 | 1.50 | 18.75 | 81.25 | 6 | 0 |
AIX rootvg | sas-8G-19A | secondary | 8.00 | 1.50 | 18.75 | 81.25 | 6 | 0 |
AIX rootvg | sas-8G-241 | primary | 8.00 | 1.75 | 21.88 | 78.12 | 7 | 0 |
AIX rootvg | sas-8G-241 | secondary | 8.00 | 1.75 | 21.88 | 78.12 | 7 | 0 |
AIX rootvg | sas-8G-2BD | primary | 8.00 | 1.50 | 18.75 | 81.25 | 6 | 0 |
AIX rootvg | sas-8G-2BD | secondary | 8.00 | 1.50 | 18.75 | 81.25 | 6 | 0 |
AIX rootvg | sas-8G-2BE | primary | 8.00 | 1.50 | 18.75 | 81.25 | 6 | 0 |
AIX rootvg | sas-8G-2BE | secondary | 8.00 | 1.50 | 18.75 | 81.25 | 6 | 0 |
AIX rootvg | sas-8G-194 | primary | 8.00 | 1.25 | 15.62 | 84.38 | 5 | 0 |
AIX rootvg | sas-8G-194 | secondary | 8.00 | 1.25 | 15.62 | 84.38 | 5 | 0 |
AIX rootvg | sas-8G-195 | primary | 8.00 | 1.50 | 18.75 | 81.25 | 6 | 0 |
AIX rootvg | sas-8G-195 | secondary | 8.00 | 1.50 | 18.75 | 81.25 | 6 | 0 |
AIX rootvg | sas-8G-22F | primary | 8.00 | 3.75 | 46.88 | 53.12 | 15 | 0 |
AIX rootvg | sas-8G-22F | secondary | 8.00 | 3.75 | 46.88 | 53.12 | 15 | 0 |
AIX rootvg | sas-16G-264 | primary | 16.00 | 6.00 | 37.50 | 62.50 | 24 | 0 |
AIX rootvg | sas-16G-264 | secondary | 16.00 | 6.00 | 37.50 | 62.50 | 24 | 0 |
AIX rootvg | sas-16G-263 | primary | 16.00 | 7.00 | 43.75 | 56.25 | 28 | 0 |
AIX rootvg | sas-16G-263 | secondary | 16.00 | 7.00 | 43.75 | 56.25 | 28 | 0 |
AIX rootvg | sas-16G-1D2 | primary | 16.00 | 7.00 | 43.75 | 56.25 | 0 | 28 |
AIX rootvg | sas-16G-1D2 | secondary | 16.00 | 7.00 | 43.75 | 56.25 | 28 | 0 |
AIX rootvg | sas-16G-1B1 | primary | 16.00 | 5.50 | 34.38 | 65.62 | 0 | 22 |
AIX rootvg | sas-16G-1B1 | secondary | 16.00 | 5.50 | 34.38 | 65.62 | 22 | 0 |
AIX rootvg | sas-16G-14D | primary | 16.00 | 5.75 | 35.94 | 64.06 | 0 | 23 |
AIX rootvg | sas-16G-14D | secondary | 16.00 | 5.75 | 35.94 | 64.06 | 23 | 0 |
AIX rootvg | sas-32G-322 | primary | 32.00 | 8.50 | 26.56 | 73.44 | 34 | 0 |
AIX rootvg | sas-32G-322 | secondary | 32.00 | 8.50 | 26.56 | 73.44 | 34 | 0 |
AIX rootvg | flash-32G-1F3 | primary | 32.00 | 11.75 | 36.72 | 63.28 | 0 | 47 |
AIX rootvg | flash-32G-1F3 | secondary | 32.00 | 11.75 | 36.72 | 63.28 | 47 | 0 |
AIX rootvg | sas-16G-1F1 | primary | 16.00 | 7.50 | 46.88 | 53.12 | 0 | 30 |
AIX rootvg | sas-16G-1F1 | secondary | 16.00 | 7.50 | 46.88 | 53.12 | 30 | 0 |
AIX rootvg | sas-16G-15C | primary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-15C | secondary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-32G-1F4 | primary | 32.00 | 8.25 | 25.78 | 74.22 | 33 | 0 |
AIX rootvg | sas-32G-1F4 | secondary | 32.00 | 8.25 | 25.78 | 74.22 | 33 | 0 |
AIX rootvg | sas-16G-267 | primary | 16.00 | 5.25 | 32.81 | 67.19 | 21 | 0 |
AIX rootvg | sas-16G-267 | secondary | 16.00 | 5.25 | 32.81 | 67.19 | 21 | 0 |
AIX rootvg | sas-16G-2F4 | primary | 16.00 | 5.50 | 34.38 | 65.62 | 22 | 0 |
AIX rootvg | sas-16G-2F4 | secondary | 16.00 | 5.50 | 34.38 | 65.62 | 22 | 0 |
AIX rootvg | sas-16G-2FB | primary | 16.00 | 8.75 | 54.69 | 45.31 | 35 | 0 |
AIX rootvg | sas-16G-2FB | secondary | 16.00 | 8.75 | 54.69 | 45.31 | 35 | 0 |
AIX rootvg | sas-16G-11F | primary | 16.00 | 7.00 | 43.75 | 56.25 | 28 | 0 |
AIX rootvg | sas-16G-11F | secondary | 16.00 | 7.00 | 43.75 | 56.25 | 28 | 0 |
AIX rootvg | sas-16G-265 | primary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-265 | secondary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-146 | primary | 16.00 | 6.25 | 39.06 | 60.94 | 25 | 0 |
AIX rootvg | sas-16G-146 | secondary | 16.00 | 6.25 | 39.06 | 60.94 | 25 | 0 |
AIX rootvg | sas-16G-28C | primary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-28C | secondary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-28E | primary | 16.00 | 9.75 | 60.94 | 39.06 | 39 | 0 |
AIX rootvg | sas-16G-28E | secondary | 16.00 | 9.75 | 60.94 | 39.06 | 39 | 0 |
AIX rootvg | sas-16G-205 | primary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-205 | secondary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-1BD | primary | 16.00 | 10.00 | 62.50 | 37.50 | 40 | 0 |
AIX rootvg | sas-16G-1BD | secondary | 16.00 | 10.00 | 62.50 | 37.50 | 40 | 0 |
AIX rootvg | sas-16G-28F | primary | 16.00 | 10.25 | 64.06 | 35.94 | 41 | 0 |
AIX rootvg | sas-16G-28F | secondary | 16.00 | 10.25 | 64.06 | 35.94 | 41 | 0 |
AIX rootvg | sas-16G-162 | primary | 16.00 | 6.75 | 42.19 | 57.81 | 27 | 0 |
AIX rootvg | sas-16G-162 | secondary | 16.00 | 6.75 | 42.19 | 57.81 | 27 | 0 |
AIX rootvg | sas-16G-2A9 | primary | 16.00 | 4.50 | 28.12 | 71.88 | 18 | 0 |
AIX rootvg | sas-16G-2A9 | secondary | 16.00 | 4.50 | 28.12 | 71.88 | 18 | 0 |
AIX rootvg | sas-16G-1B9 | primary | 16.00 | 6.75 | 42.19 | 57.81 | 27 | 0 |
AIX rootvg | sas-16G-1B9 | secondary | 16.00 | 6.75 | 42.19 | 57.81 | 27 | 0 |
AIX rootvg | sas-16G-1BA | primary | 16.00 | 6.75 | 42.19 | 57.81 | 27 | 0 |
AIX rootvg | sas-16G-1BA | secondary | 16.00 | 6.75 | 42.19 | 57.81 | 27 | 0 |
AIX rootvg | sas-16G-1BC | primary | 16.00 | 9.75 | 60.94 | 39.06 | 39 | 0 |
AIX rootvg | sas-16G-1BC | secondary | 16.00 | 9.75 | 60.94 | 39.06 | 39 | 0 |
AIX rootvg | sas-16G-1BB | primary | 16.00 | 11.50 | 71.88 | 28.12 | 46 | 0 |
AIX rootvg | sas-16G-1BB | secondary | 16.00 | 11.50 | 71.88 | 28.12 | 46 | 0 |
AIX rootvg | sas-16G-1A9 | primary | 16.00 | 3.50 | 21.88 | 78.12 | 14 | 0 |
AIX rootvg | sas-16G-1A9 | secondary | 16.00 | 3.50 | 21.88 | 78.12 | 14 | 0 |
AIX rootvg | sas-16G-2CE | primary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
AIX rootvg | sas-16G-2CE | secondary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
AIX rootvg | sas-16G-29B | primary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-29B | secondary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-1BF | primary | 16.00 | 8.25 | 51.56 | 48.44 | 33 | 0 |
AIX rootvg | sas-16G-1BF | secondary | 16.00 | 8.25 | 51.56 | 48.44 | 33 | 0 |
AIX rootvg | sas-16G-243 | primary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-243 | secondary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-2CF | primary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-2CF | secondary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-29C | primary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-29C | secondary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-242 | primary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
AIX rootvg | sas-16G-242 | secondary | 16.00 | 4.50 | 28.12 | 71.88 | 18 | 0 |
AIX rootvg | sas-16G-29E | primary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
AIX rootvg | sas-16G-29E | secondary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
AIX rootvg | sas-16G-2D0 | primary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-2D0 | secondary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-29D | primary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-29D | secondary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-1BE | primary | 16.00 | 8.25 | 51.56 | 48.44 | 33 | 0 |
AIX rootvg | sas-16G-1BE | secondary | 16.00 | 8.25 | 51.56 | 48.44 | 33 | 0 |
AIX rootvg | sas-16G-29F | primary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
AIX rootvg | sas-16G-29F | secondary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
AIX rootvg | sas-16G-261 | primary | 16.00 | 5.50 | 34.38 | 65.62 | 22 | 0 |
AIX rootvg | sas-16G-261 | secondary | 16.00 | 5.50 | 34.38 | 65.62 | 22 | 0 |
AIX rootvg | sas-8G-199 | primary | 8.00 | 1.25 | 15.62 | 84.38 | 5 | 0 |
AIX rootvg | sas-8G-199 | secondary | 8.00 | 1.25 | 15.62 | 84.38 | 5 | 0 |
AIX rootvg | sas-8G-240 | primary | 8.00 | 3.00 | 37.50 | 62.50 | 12 | 0 |
AIX rootvg | sas-8G-240 | secondary | 8.00 | 3.00 | 37.50 | 62.50 | 12 | 0 |
AIX rootvg | sas-16G-249 | primary | 16.00 | 4.50 | 28.12 | 71.88 | 18 | 0 |
AIX rootvg | sas-16G-249 | secondary | 16.00 | 4.50 | 28.12 | 71.88 | 18 | 0 |
AIX rootvg | sas-24G-15D | primary | 24.00 | 5.00 | 20.83 | 79.17 | 20 | 0 |
AIX rootvg | sas-24G-15D | secondary | 24.00 | 5.00 | 20.83 | 79.17 | 20 | 0 |
AIX rootvg | sas-16G-24B | primary | 16.00 | 4.25 | 26.56 | 73.44 | 17 | 0 |
AIX rootvg | sas-16G-24B | secondary | 16.00 | 4.25 | 26.56 | 73.44 | 17 | 0 |
AIX rootvg | sas-16G-24D | primary | 16.00 | 6.00 | 37.50 | 62.50 | 24 | 0 |
AIX rootvg | sas-16G-24D | secondary | 16.00 | 6.00 | 37.50 | 62.50 | 24 | 0 |
AIX rootvg | sas-16G-252 | primary | 16.00 | 5.25 | 32.81 | 67.19 | 21 | 0 |
AIX rootvg | sas-16G-252 | secondary | 16.00 | 5.25 | 32.81 | 67.19 | 21 | 0 |
AIX rootvg | sas-16G-278 | primary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-278 | secondary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-279 | primary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
AIX rootvg | sas-16G-279 | secondary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
AIX rootvg | sas-16G-270 | primary | 16.00 | 5.25 | 32.81 | 67.19 | 21 | 0 |
AIX rootvg | sas-16G-270 | secondary | 16.00 | 5.25 | 32.81 | 67.19 | 21 | 0 |
AIX rootvg | sas-16G-284 | primary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
AIX rootvg | sas-16G-284 | secondary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
AIX rootvg | sas-16G-285 | primary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
AIX rootvg | sas-16G-285 | secondary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
AIX rootvg | sas-16G-27A | primary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
AIX rootvg | sas-16G-27A | secondary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
AIX rootvg | sas-16G-2DC | primary | 16.00 | 6.25 | 39.06 | 60.94 | 25 | 0 |
AIX rootvg | sas-16G-2DC | secondary | 16.00 | 6.25 | 39.06 | 60.94 | 25 | 0 |
AIX rootvg | sas-16G-27B | primary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
AIX rootvg | sas-16G-27B | secondary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
AIX rootvg | sas-16G-310 | primary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
AIX rootvg | sas-16G-310 | secondary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
AIX rootvg | sas-16G-312 | primary | 16.00 | 10.00 | 62.50 | 37.50 | 40 | 0 |
AIX rootvg | sas-16G-312 | secondary | 16.00 | 10.00 | 62.50 | 37.50 | 40 | 0 |
AIX rootvg | sas-36G-22B | primary | 36.00 | 9.00 | 25.00 | 75.00 | 36 | 0 |
AIX rootvg | sas-36G-22B | secondary | 36.00 | 9.00 | 25.00 | 75.00 | 36 | 0 |
AIX rootvg | sas-16G-22D | primary | 16.00 | 4.50 | 28.12 | 71.88 | 18 | 0 |
AIX rootvg | sas-16G-22D | secondary | 16.00 | 4.50 | 28.12 | 71.88 | 18 | 0 |
AIX rootvg | sas-16G-308 | primary | 16.00 | 6.75 | 42.19 | 57.81 | 23 | 4 |
AIX rootvg | sas-16G-308 | secondary | 16.00 | 6.75 | 42.19 | 57.81 | 27 | 0 |
AIX rootvg | sas-16G-233 | primary | 16.00 | 8.75 | 54.69 | 45.31 | 35 | 0 |
AIX rootvg | sas-16G-233 | secondary | 16.00 | 8.75 | 54.69 | 45.31 | 35 | 0 |
AIX rootvg | sas-16G-30A | primary | 16.00 | 6.75 | 42.19 | 57.81 | 27 | 0 |
AIX rootvg | sas-16G-30A | secondary | 16.00 | 6.75 | 42.19 | 57.81 | 27 | 0 |
AIX rootvg | sas-16G-226 | primary | 16.00 | 6.00 | 37.50 | 62.50 | 24 | 0 |
AIX rootvg | sas-16G-226 | secondary | 16.00 | 6.00 | 37.50 | 62.50 | 24 | 0 |
AIX rootvg | sas-16G-0F1 | primary | 16.00 | 5.50 | 34.38 | 65.62 | 22 | 0 |
AIX rootvg | sas-16G-0F1 | secondary | 16.00 | 5.50 | 34.38 | 65.62 | 22 | 0 |
AIX rootvg | sas-24G-1A4 | primary | 24.00 | 7.50 | 31.25 | 68.75 | 30 | 0 |
AIX rootvg | sas-24G-1A4 | secondary | 24.00 | 7.50 | 31.25 | 68.75 | 30 | 0 |
AIX rootvg | sas-16G-1AE | primary | 16.00 | 4.50 | 28.12 | 71.88 | 17 | 1 |
AIX rootvg | sas-16G-1AE | secondary | 16.00 | 4.50 | 28.12 | 71.88 | 18 | 0 |
AIX rootvg | sas-24G-1A5 | primary | 24.00 | 9.75 | 40.62 | 59.38 | 39 | 0 |
AIX rootvg | sas-24G-1A5 | secondary | 24.00 | 9.75 | 40.62 | 59.38 | 39 | 0 |
AIX rootvg | sas-24G-1A6 | primary | 24.00 | 8.75 | 36.46 | 63.54 | 35 | 0 |
AIX rootvg | sas-24G-1A6 | secondary | 24.00 | 8.75 | 36.46 | 63.54 | 35 | 0 |
AIX rootvg | sas-16G-259 | primary | 16.00 | 3.25 | 20.31 | 79.69 | 13 | 0 |
AIX rootvg | sas-16G-259 | secondary | 16.00 | 3.25 | 20.31 | 79.69 | 13 | 0 |
AIX rootvg | sas-16G-19E | primary | 16.00 | 2.75 | 17.19 | 82.81 | 11 | 0 |
AIX rootvg | sas-16G-19E | secondary | 16.00 | 2.75 | 17.19 | 82.81 | 11 | 0 |
AIX rootvg | sas-24G-257 | primary | 24.00 | 4.50 | 18.75 | 81.25 | 18 | 0 |
AIX rootvg | sas-24G-257 | secondary | 24.00 | 4.50 | 18.75 | 81.25 | 18 | 0 |
AIX rootvg | sas-16G-343 | primary | 16.00 | 2.00 | 12.50 | 87.50 | 8 | 0 |
AIX rootvg | sas-16G-343 | secondary | 16.00 | 2.00 | 12.50 | 87.50 | 8 | 0 |
AIX rootvg | sas-16G-212 | primary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-212 | secondary | 16.00 | 5.00 | 31.25 | 68.75 | 20 | 0 |
AIX rootvg | sas-16G-214 | primary | 16.00 | 4.00 | 25.00 | 75.00 | 16 | 0 |
AIX rootvg | sas-16G-214 | secondary | 16.00 | 4.00 | 25.00 | 75.00 | 16 | 0 |
AIX rootvg | sas-16G-213 | primary | 16.00 | 4.50 | 28.12 | 71.88 | 18 | 0 |
AIX rootvg | sas-16G-213 | secondary | 16.00 | 4.50 | 28.12 | 71.88 | 18 | 0 |
AIX rootvg | sas-16G-1E7 | primary | 16.00 | 5.50 | 34.38 | 65.62 | 22 | 0 |
AIX rootvg | sas-16G-1E7 | secondary | 16.00 | 5.50 | 34.38 | 65.62 | 22 | 0 |
AIX rootvg | sas-24G-1E0 | primary | 24.00 | 3.75 | 15.62 | 84.38 | 15 | 0 |
AIX rootvg | sas-24G-1E0 | secondary | 24.00 | 3.50 | 14.58 | 85.42 | 14 | 0 |
AIX rootvg | sas-16G-1E2 | primary | 16.00 | 4.25 | 26.56 | 73.44 | 17 | 0 |
AIX rootvg | sas-16G-1E2 | secondary | 16.00 | 4.25 | 26.56 | 73.44 | 17 | 0 |
AIX rootvg | sas-24G-305 | primary | 24.00 | 7.00 | 29.17 | 70.83 | 21 | 7 |
AIX rootvg | sas-24G-305 | secondary | 24.00 | 7.00 | 29.17 | 70.83 | 28 | 0 |
AIX rootvg | sas-24G-306 | primary | 24.00 | 6.75 | 28.12 | 71.88 | 22 | 5 |
AIX rootvg | sas-24G-306 | secondary | 24.00 | 6.75 | 28.12 | 71.88 | 27 | 0 |
AIX rootvg | sas-16G-323 | primary | 16.00 | 4.50 | 28.12 | 71.88 | 18 | 0 |
AIX rootvg | sas-16G-323 | secondary | 16.00 | 4.50 | 28.12 | 71.88 | 18 | 0 |
Apache & MySQL | sas-16G-13A | primary | 16.00 | 7.75 | 48.44 | 51.56 | 24 | 7 |
Apache & MySQL | sas-16G-13A | secondary | 16.00 | 7.75 | 48.44 | 51.56 | 31 | 0 |
Apache & MySQL | sas-16G-18D | primary | 16.00 | 9.75 | 60.94 | 39.06 | 38 | 1 |
Apache & MySQL | sas-16G-18D | secondary | 16.00 | 9.75 | 60.94 | 39.06 | 39 | 0 |
Apache & MySQL | sas-16G-303 | primary | 16.00 | 11.00 | 68.75 | 31.25 | 44 | 0 |
Apache & MySQL | sas-16G-303 | secondary | 16.00 | 11.00 | 68.75 | 31.25 | 44 | 0 |
Apache & MySQL | sas-16G-2C3 | primary | 16.00 | 13.50 | 84.38 | 15.62 | 45 | 9 |
Apache & MySQL | sas-16G-2C3 | secondary | 16.00 | 13.50 | 84.38 | 15.62 | 54 | 0 |
Apache & MySQL | sas-32G-2C1 | primary | 32.00 | 17.50 | 54.69 | 45.31 | 33 | 37 |
Apache & MySQL | sas-32G-2C1 | secondary | 32.00 | 17.50 | 54.69 | 45.31 | 70 | 0 |
Apache & MySQL | sas-32G-2C2 | primary | 32.00 | 18.00 | 56.25 | 43.75 | 36 | 36 |
Apache & MySQL | sas-32G-2C2 | secondary | 32.00 | 18.00 | 56.25 | 43.75 | 72 | 0 |
Apache & MySQL | sas-16G-2D8 | primary | 16.00 | 2.75 | 17.19 | 82.81 | 8 | 3 |
Apache & MySQL | sas-16G-2D8 | secondary | 16.00 | 2.75 | 17.19 | 82.81 | 11 | 0 |
Apache & MySQL | sas-16G-2D9 | primary | 16.00 | 4.75 | 29.69 | 70.31 | 16 | 3 |
Apache & MySQL | sas-16G-2D9 | secondary | 16.00 | 4.75 | 29.69 | 70.31 | 19 | 0 |
Apache & MySQL | sas-16G-304 | primary | 16.00 | 10.25 | 64.06 | 35.94 | 33 | 8 |
Apache & MySQL | sas-16G-304 | secondary | 16.00 | 10.25 | 64.06 | 35.94 | 41 | 0 |
Code Server | sas-200G-17E | primary | 200.00 | 64.50 | 32.25 | 67.75 | 202 | 56 |
Code Server | sas-200G-17E | secondary | 200.00 | 64.50 | 32.25 | 67.75 | 258 | 0 |
Fileserver | sas-350G-1D5 | primary | 350.00 | 142.50 | 40.71 | 59.29 | 351 | 219 |
Fileserver | sas-350G-1D5 | secondary | 350.00 | 142.75 | 40.79 | 59.21 | 571 | 0 |
Linux Package | sas-250G-23C | primary | 250.00 | 152.75 | 61.10 | 38.90 | 468 | 143 |
Linux Package | sas-250G-23C | secondary | 250.00 | 152.75 | 61.10 | 38.90 | 611 | 0 |
MS SQL | flash-120G-10A | primary | 120.00 | 18.50 | 15.42 | 84.58 | 0 | 74 |
MS SQL | flash-120G-10A | secondary | 120.00 | 18.50 | 15.42 | 84.58 | 74 | 0 |
Database | sas-100G-1F8 | primary | 100.00 | 23.75 | 23.75 | 76.25 | 82 | 13 |
Database | sas-100G-1F8 | secondary | 100.00 | 23.75 | 23.75 | 76.25 | 95 | 0 |
Database | sas-100G-2CD | primary | 100.00 | 12.00 | 12.00 | 88.00 | 31 | 17 |
Database | sas-100G-2CD | secondary | 100.00 | 12.00 | 12.00 | 88.00 | 48 | 0 |
Database | sas-100G-2CB | primary | 100.00 | 14.50 | 14.50 | 85.50 | 45 | 13 |
Database | sas-100G-2CB | secondary | 100.00 | 14.50 | 14.50 | 85.50 | 58 | 0 |
Database | sas-100G-2CC | primary | 100.00 | 11.25 | 11.25 | 88.75 | 31 | 14 |
Database | sas-100G-2CC | secondary | 100.00 | 11.25 | 11.25 | 88.75 | 45 | 0 |
Database | flash-150G-23F | primary | 150.00 | 30.25 | 20.17 | 79.83 | 0 | 121 |
Database | flash-150G-23F | secondary | 150.00 | 30.25 | 20.17 | 79.83 | 121 | 0 |
Database | sas-150G-30E | primary | 150.00 | 88.75 | 59.17 | 40.83 | 297 | 58 |
Database | sas-150G-30E | secondary | 150.00 | 88.75 | 59.17 | 40.83 | 355 | 0 |
Database | sas-600G-30D | primary | 600.00 | 104.75 | 17.46 | 82.54 | 108 | 311 |
Database | sas-600G-30D | secondary | 600.00 | 105.00 | 17.50 | 82.50 | 420 | 0 |
Database | sas-150G-32D | primary | 150.00 | 19.00 | 12.67 | 87.33 | 45 | 31 |
Database | sas-150G-32D | secondary | 150.00 | 19.00 | 12.67 | 87.33 | 76 | 0 |
Database | sas-600G-32C | primary | 600.00 | 107.75 | 17.96 | 82.04 | 54 | 377 |
Database | sas-600G-32C | secondary | 600.00 | 107.75 | 17.96 | 82.04 | 431 | 0 |
Database | sas-120G-147 | primary | 120.00 | 49.25 | 41.04 | 58.96 | 167 | 30 |
Database | sas-120G-147 | secondary | 120.00 | 49.25 | 41.04 | 58.96 | 197 | 0 |
Database | sas-170G-28D | primary | 170.00 | 52.75 | 31.03 | 68.97 | 88 | 123 |
Database | sas-170G-28D | secondary | 170.00 | 52.75 | 31.03 | 68.97 | 211 | 0 |
Database | sas-300G-290 | primary | 300.00 | 97.00 | 32.33 | 67.67 | 315 | 73 |
Database | sas-300G-290 | secondary | 300.00 | 97.00 | 32.33 | 67.67 | 388 | 0 |
Database | sas-1550G-1C6 | primary | 1550.00 | 355.25 | 22.92 | 77.08 | 119 | 1302 |
Database | sas-1550G-1C6 | secondary | 1550.00 | 355.25 | 22.92 | 77.08 | 1421 | 0 |
Database | sas-150G-1C8 | primary | 150.00 | 31.25 | 20.83 | 79.17 | 45 | 80 |
Database | sas-150G-1C8 | secondary | 150.00 | 31.25 | 20.83 | 79.17 | 125 | 0 |
Database | sas-100G-291 | primary | 100.00 | 26.75 | 26.75 | 73.25 | 100 | 7 |
Database | sas-100G-291 | secondary | 100.00 | 26.75 | 26.75 | 73.25 | 107 | 0 |
Database | sas-900G-296 | primary | 900.00 | 309.75 | 34.42 | 65.58 | 123 | 1116 |
Database | sas-900G-296 | secondary | 900.00 | 309.50 | 34.39 | 65.61 | 1238 | 0 |
Database | sas-1000G-2AD | primary | 1000.00 | 73.25 | 7.32 | 92.67 | 283 | 10 |
Database | sas-1000G-2AD | secondary | 1000.00 | 73.25 | 7.32 | 92.67 | 293 | 0 |
Database | sas-1300G-1C2 | primary | 1300.00 | 299.50 | 23.04 | 76.96 | 185 | 1013 |
Database | sas-1300G-1C2 | secondary | 1300.00 | 299.50 | 23.04 | 76.96 | 1198 | 0 |
Database | sas-2150G-294 | primary | 2150.00 | 660.50 | 30.72 | 69.28 | 206 | 2436 |
Database | sas-2150G-294 | secondary | 2150.00 | 660.75 | 30.73 | 69.27 | 2643 | 0 |
Database | sas-150G-297 | primary | 150.00 | 44.50 | 29.67 | 70.33 | 73 | 105 |
Database | sas-150G-297 | secondary | 150.00 | 44.50 | 29.67 | 70.33 | 178 | 0 |
Database | sas-600G-1C5 | primary | 600.00 | 167.00 | 27.83 | 72.17 | 373 | 295 |
Database | sas-600G-1C5 | secondary | 600.00 | 167.25 | 27.88 | 72.12 | 669 | 0 |
Database | sas-200G-24A | primary | 200.00 | 52.00 | 26.00 | 74.00 | 120 | 88 |
Database | sas-200G-24A | secondary | 200.00 | 52.00 | 26.00 | 74.00 | 208 | 0 |
Database | flash-160G-24C | primary | 160.00 | 74.75 | 46.72 | 53.28 | 0 | 299 |
Database | flash-160G-24C | secondary | 160.00 | 74.75 | 46.72 | 53.28 | 299 | 0 |
Database | flash-260G-24E | primary | 260.00 | 74.25 | 28.56 | 71.44 | 0 | 297 |
Database | flash-260G-24E | secondary | 260.00 | 74.25 | 28.56 | 71.44 | 297 | 0 |
Database | sas-120G-253 | primary | 120.00 | 31.50 | 26.25 | 73.75 | 28 | 98 |
Database | sas-120G-253 | secondary | 120.00 | 31.25 | 26.04 | 73.96 | 125 | 0 |
Database | sas-250G-27C | primary | 250.00 | 53.00 | 21.20 | 78.80 | 157 | 55 |
Database | sas-250G-27C | secondary | 250.00 | 53.00 | 21.20 | 78.80 | 212 | 0 |
Database | flash-350G-27D | primary | 350.00 | 79.75 | 22.79 | 77.21 | 0 | 319 |
Database | flash-350G-27D | secondary | 350.00 | 79.75 | 22.79 | 77.21 | 319 | 0 |
Database | sas-180G-271 | primary | 180.00 | 51.75 | 28.75 | 71.25 | 98 | 109 |
Database | sas-180G-271 | secondary | 180.00 | 52.00 | 28.89 | 71.11 | 208 | 0 |
Database | sas-260G-286 | primary | 260.00 | 56.75 | 21.83 | 78.17 | 51 | 176 |
Database | sas-260G-286 | secondary | 260.00 | 56.75 | 21.83 | 78.17 | 227 | 0 |
Database | flash-150G-287 | primary | 150.00 | 34.25 | 22.83 | 77.17 | 0 | 137 |
Database | flash-150G-287 | secondary | 150.00 | 34.25 | 22.83 | 77.17 | 137 | 0 |
Database | sas-180G-27E | primary | 180.00 | 39.75 | 22.08 | 77.92 | 144 | 15 |
Database | sas-180G-27E | secondary | 180.00 | 39.75 | 22.08 | 77.92 | 159 | 0 |
Database | sas-180G-2DD | primary | 180.00 | 50.25 | 27.92 | 72.08 | 143 | 58 |
Database | sas-180G-2DD | secondary | 180.00 | 50.25 | 27.92 | 72.08 | 201 | 0 |
Database | flash-180G-27F | primary | 180.00 | 43.50 | 24.17 | 75.83 | 0 | 174 |
Database | flash-180G-27F | secondary | 180.00 | 43.75 | 24.31 | 75.69 | 175 | 0 |
Database | sas-120G-311 | primary | 120.00 | 36.50 | 30.42 | 69.58 | 85 | 61 |
Database | sas-120G-311 | secondary | 120.00 | 36.50 | 30.42 | 69.58 | 146 | 0 |
Database | flash-120G-313 | primary | 120.00 | 47.00 | 39.17 | 60.83 | 0 | 188 |
Database | flash-120G-313 | secondary | 120.00 | 47.00 | 39.17 | 60.83 | 188 | 0 |
Database | sas-140G-22A | primary | 140.00 | 49.50 | 35.36 | 64.64 | 80 | 118 |
Database | sas-140G-22A | secondary | 140.00 | 49.75 | 35.54 | 64.46 | 199 | 0 |
Database | sas-150G-22C | primary | 150.00 | 62.00 | 41.33 | 58.67 | 246 | 2 |
Database | sas-150G-22C | secondary | 150.00 | 62.00 | 41.33 | 58.67 | 248 | 0 |
Database | sas-80G-309 | primary | 80.00 | 17.75 | 22.19 | 77.81 | 47 | 24 |
Database | sas-80G-309 | secondary | 80.00 | 17.75 | 22.19 | 77.81 | 71 | 0 |
Database | sas-36G-30B | primary | 36.00 | 18.00 | 50.00 | 50.00 | 72 | 0 |
Database | sas-36G-30B | secondary | 36.00 | 18.00 | 50.00 | 50.00 | 72 | 0 |
Database | sas-200G-344 | primary | 200.00 | 10.75 | 5.38 | 94.62 | 43 | 0 |
Database | sas-200G-344 | secondary | 200.00 | 10.75 | 5.38 | 94.62 | 43 | 0 |
ERP | sas-250G-2C6 | primary | 250.00 | 76.50 | 30.60 | 69.40 | 110 | 196 |
ERP | sas-250G-2C6 | secondary | 250.00 | 76.50 | 30.60 | 69.40 | 306 | 0 |
ERP | sas-250G-10E | primary | 250.00 | 77.50 | 31.00 | 69.00 | 110 | 200 |
ERP | sas-250G-10E | secondary | 250.00 | 77.50 | 31.00 | 69.00 | 310 | 0 |
ERP | sas-2750G-345 | primary | 2750.00 | 803.50 | 29.22 | 70.78 | 3214 | 0 |
ERP | flash-2750G-1B5 | primary | 2750.00 | 803.75 | 29.23 | 70.77 | 0 | 3215 |
ERP | flash-2750G-1B5 | secondary | 2750.00 | 803.50 | 29.22 | 70.78 | 3214 | 0 |
ERP | sas-150G-2D1 | primary | 150.00 | 41.25 | 27.50 | 72.50 | 126 | 39 |
ERP | sas-150G-2D1 | secondary | 150.00 | 41.25 | 27.50 | 72.50 | 165 | 0 |
ERP | sas-100G-2A0 | primary | 100.00 | 27.75 | 27.75 | 72.25 | 92 | 19 |
ERP | sas-100G-2A0 | secondary | 100.00 | 27.75 | 27.75 | 72.25 | 111 | 0 |
ERP | sas-130G-2A3 | primary | 130.00 | 53.00 | 40.77 | 59.23 | 77 | 135 |
ERP | sas-130G-2A3 | secondary | 130.00 | 53.00 | 40.77 | 59.23 | 212 | 0 |
ERP | sas-300G-2A6 | primary | 300.00 | 63.75 | 21.25 | 78.75 | 183 | 72 |
ERP | sas-300G-2A6 | secondary | 300.00 | 63.75 | 21.25 | 78.75 | 255 | 0 |
ERP | sas-150G-2D2 | primary | 150.00 | 38.50 | 25.67 | 74.33 | 98 | 56 |
ERP | sas-150G-2D2 | secondary | 150.00 | 38.50 | 25.67 | 74.33 | 154 | 0 |
ERP | sas-200G-2A1 | primary | 200.00 | 61.50 | 30.75 | 69.25 | 228 | 18 |
ERP | sas-200G-2A1 | secondary | 200.00 | 61.25 | 30.63 | 69.38 | 245 | 0 |
ERP | sas-250G-2A4 | primary | 250.00 | 108.00 | 43.20 | 56.80 | 60 | 372 |
ERP | sas-250G-2A4 | secondary | 250.00 | 108.00 | 43.20 | 56.80 | 432 | 0 |
ERP | sas-450G-2A7 | primary | 450.00 | 151.75 | 33.72 | 66.28 | 325 | 282 |
ERP | sas-450G-2A7 | secondary | 450.00 | 151.75 | 33.72 | 66.28 | 607 | 0 |
ERP | sas-700G-2D3 | primary | 700.00 | 228.00 | 32.57 | 67.43 | 61 | 851 |
ERP | sas-700G-2D3 | secondary | 700.00 | 228.00 | 32.57 | 67.43 | 912 | 0 |
ERP | sas-650G-2A2 | primary | 650.00 | 278.50 | 42.85 | 57.15 | 94 | 1020 |
ERP | sas-650G-2A2 | secondary | 650.00 | 278.50 | 42.85 | 57.15 | 1114 | 0 |
ERP | sas-320G-2A5 | primary | 320.00 | 153.75 | 48.05 | 51.95 | 69 | 546 |
ERP | sas-320G-2A5 | secondary | 320.00 | 153.75 | 48.05 | 51.95 | 615 | 0 |
ERP | sas-550G-2A8 | primary | 550.00 | 207.25 | 37.68 | 62.32 | 634 | 195 |
ERP | sas-550G-2A8 | secondary | 550.00 | 207.50 | 37.73 | 62.27 | 830 | 0 |
ERP | sas-340G-248 | primary | 340.00 | 87.00 | 25.59 | 74.41 | 340 | 8 |
ERP | sas-340G-248 | secondary | 340.00 | 87.00 | 25.59 | 74.41 | 348 | 0 |
ERP | sas-270G-2D4 | primary | 270.00 | 87.00 | 32.22 | 67.78 | 72 | 276 |
ERP | sas-270G-2D4 | secondary | 270.00 | 86.75 | 32.13 | 67.87 | 347 | 0 |
ERP | sas-340G-1AF | primary | 340.00 | 79.50 | 23.38 | 76.62 | 318 | 0 |
ERP | sas-340G-1AF | secondary | 340.00 | 79.50 | 23.38 | 76.62 | 318 | 0 |
ERP | sas-300G-2D5 | primary | 300.00 | 80.25 | 26.75 | 73.25 | 66 | 255 |
ERP | sas-300G-2D5 | secondary | 300.00 | 80.25 | 26.75 | 73.25 | 321 | 0 |
ERP | flash-370G-2D6 | primary | 370.00 | 118.50 | 32.03 | 67.97 | 0 | 474 |
ERP | flash-370G-2D6 | secondary | 370.00 | 118.50 | 32.03 | 67.97 | 474 | 0 |
ERP | sas-80G-101 | primary | 80.00 | 34.25 | 42.81 | 57.19 | 114 | 23 |
ERP | sas-80G-101 | secondary | 80.00 | 34.25 | 42.81 | 57.19 | 137 | 0 |
ERP | sas-260G-2E8 | primary | 260.00 | 90.00 | 34.62 | 65.38 | 310 | 50 |
ERP | sas-260G-2E8 | secondary | 260.00 | 90.00 | 34.62 | 65.38 | 360 | 0 |
ERP | sas-430G-11B | primary | 430.00 | 184.75 | 42.97 | 57.03 | 510 | 229 |
ERP | sas-430G-11B | secondary | 430.00 | 184.75 | 42.97 | 57.03 | 739 | 0 |
ERP | sas-460G-1D4 | primary | 460.00 | 179.75 | 39.08 | 60.92 | 669 | 50 |
ERP | sas-460G-1D4 | secondary | 460.00 | 179.75 | 39.08 | 60.92 | 719 | 0 |
ERP | sas-520G-236 | primary | 520.00 | 199.50 | 38.37 | 61.63 | 331 | 467 |
ERP | sas-520G-236 | secondary | 520.00 | 199.50 | 38.37 | 61.63 | 798 | 0 |
ERP | sas-170G-180 | primary | 170.00 | 62.00 | 36.47 | 63.53 | 204 | 44 |
ERP | sas-170G-180 | secondary | 170.00 | 62.00 | 36.47 | 63.53 | 248 | 0 |
ERP | sas-100G-14A | primary | 100.00 | 35.50 | 35.50 | 64.50 | 139 | 3 |
ERP | sas-100G-14A | secondary | 100.00 | 35.50 | 35.50 | 64.50 | 142 | 0 |
ERP | sas-210G-25B | primary | 210.00 | 84.75 | 40.36 | 59.64 | 157 | 182 |
ERP | sas-210G-25B | secondary | 210.00 | 84.75 | 40.36 | 59.64 | 339 | 0 |
ERP | sas-170G-22E | primary | 170.00 | 65.00 | 38.24 | 61.76 | 133 | 127 |
ERP | sas-170G-22E | secondary | 170.00 | 65.00 | 38.24 | 61.76 | 260 | 0 |
TSM OpCenter | sas-26G-301 | primary | 26.00 | 9.25 | 35.58 | 64.42 | 37 | 0 |
TSM OpCenter | sas-26G-301 | secondary | 26.00 | 9.25 | 35.58 | 64.42 | 37 | 0 |
VMware ESX | sas-1024G-1A7 | primary | 1024.00 | 290.25 | 28.34 | 71.66 | 1161 | 0 |
VMware ESX | sas-1024G-1A7 | secondary | 1024.00 | 290.25 | 28.34 | 71.66 | 1161 | 0 |
VMware ESX | sas-2000G-172 | primary | 2000.00 | 1030.75 | 51.54 | 48.46 | 2268 | 1855 |
VMware ESX | sas-2000G-172 | secondary | 2000.00 | 1030.75 | 51.54 | 48.46 | 4123 | 0 |
VMware ESX | sas-2000G-222 | primary | 2000.00 | 1074.50 | 53.73 | 46.27 | 2940 | 1358 |
VMware ESX | sas-2000G-222 | secondary | 2000.00 | 1074.50 | 53.73 | 46.27 | 4298 | 0 |
VMware ESX | sas-2000G-2B1 | primary | 2000.00 | 800.00 | 40.00 | 60.00 | 2632 | 568 |
VMware ESX | sas-2000G-2B1 | secondary | 2000.00 | 799.75 | 39.99 | 60.01 | 3199 | 0 |
Windows Package | sas-300G-1B8 | primary | 300.00 | 185.25 | 61.75 | 38.25 | 552 | 189 |
Windows Package | sas-300G-1B8 | secondary | 300.00 | 185.00 | 61.67 | 38.33 | 740 | 0 |
Xen Server | sas-512G-1AC | primary | 512.00 | 244.50 | 47.75 | 52.25 | 975 | 3 |
Xen Server | sas-512G-1AC | secondary | 512.00 | 238.50 | 46.58 | 53.42 | 954 | 0 |
Xen Server | sas-512G-1AD | primary | 512.00 | 212.25 | 41.46 | 58.54 | 750 | 99 |
Xen Server | sas-512G-1AD | secondary | 512.00 | 206.75 | 40.38 | 59.62 | 827 | 0 |
Xen Server | sas-512G-26D | primary | 512.00 | 175.50 | 34.28 | 65.72 | 661 | 41 |
Xen Server | sas-512G-26D | secondary | 512.00 | 170.50 | 33.30 | 66.70 | 682 | 0 |
Xen Server | sas-512G-283 | primary | 512.00 | 242.25 | 47.31 | 52.69 | 848 | 121 |
Xen Server | sas-512G-283 | secondary | 512.00 | 238.75 | 46.63 | 53.37 | 955 | 0 |
Xen Server | sas-512G-2B2 | primary | 512.00 | 147.75 | 28.86 | 71.14 | 12 | 579 |
Xen Server | sas-512G-2B2 | secondary | 512.00 | 141.50 | 27.64 | 72.36 | 566 | 0 |
Xen Server | sas-512G-2D7 | primary | 512.00 | 221.00 | 43.16 | 56.84 | 35 | 849 |
Xen Server | sas-512G-2D7 | secondary | 512.00 | 219.50 | 42.87 | 57.13 | 878 | 0 |
Xen Server | sas-512G-2E9 | primary | 512.00 | 151.25 | 29.54 | 70.46 | 600 | 5 |
Xen Server | sas-512G-2E9 | secondary | 512.00 | 146.50 | 28.61 | 71.39 | 586 | 0 |
Xen Server | sas-512G-332 | primary | 512.00 | 61.75 | 12.06 | 87.94 | 245 | 2 |
Xen Server | sas-512G-332 | secondary | 512.00 | 60.75 | 11.87 | 88.13 | 243 | 0 |
Xen Server | sas-512G-341 | primary | 512.00 | 225.50 | 44.04 | 55.96 | 854 | 48 |
Xen Server | sas-512G-341 | secondary | 512.00 | 219.00 | 42.77 | 57.23 | 876 | 0 |
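As an aside, the condensed per-application-type view shown further above can be reproduced from this detailed list with a short aggregation. A minimal sketch, assuming the table has been exported to a hypothetical semicolon-separated file rtc-vdisks.csv with the columns in the order shown:

# per application type: average size, average compressed size,
# relative average size (avg. compressed / avg. size) and sample count
$ awk -F';' 'NR > 1 {
      n[$1]++; size[$1] += $4; comp[$1] += $5
  }
  END {
      for (t in n)
          printf "%-16s %9.2f GB %9.2f GB %6.2f %% %4d\n",
                 t, size[t] / n[t], comp[t] / n[t],
                 100 * comp[t] / size[t], n[t]
  }' rtc-vdisks.csv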
2013-12-28 // Nagios Monitoring - IBM SVC and Storwize
Some time ago I wrote a – rather crude – Nagios plugin to monitor IBM SAN Volume Controller (SVC) systems. The plugin was initially targeted at version 4.3.x of the SVC software on the 2145-8F2 nodes we used back then. Since the initial implementation of the plugin we have upgraded the hard- and software of our SVC systems several times and are now at version 7.1.x of the SVC software on 2145-CG8 nodes. Recently we also got some IBM Storwize V3700 storage arrays, which share the same code base as the SVC, but lack some of its features and provide other, additional ones. A code and functional review of the original plugin for the SVC as well as an adaption for the Storwize arrays seemed to be in order. The result was the two plugins check_ibm_svc.pl and check_ibm_storwize.pl. They share a lot of common code with the original plugin, but are still maintained separately for the simple reason that IBM might develop the SVC and the Storwize code in slightly different, incompatible directions.
In order to run the plugins, you need to have the command line tool wbemcli from the Standards Based Linux Instrumentation (SBLIM) project installed on the Nagios system. In my case the wbemcli command line tool is placed in /opt/sblim-wbemcli/bin/wbemcli. If you use a different path, adapt the configuration hash entry “%conf{'wbemcli'}” according to your environment. The plugins use wbemcli to query the CIMOM service on the SVC or Storwize system for the necessary information. Therefore, a network connection from the Nagios system to the SVC or Storwize systems on port TCP/5989 must be allowed and a user with the “Monitor” authorization must be created on the SVC or Storwize systems:
IBM_2145:svc:admin$ mkuser -name nagios -usergrp Monitor -password <password>
Or in the WebUI:
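With the Monitor user in place, the CIMOM connection can be smoke-tested directly with wbemcli before the plugins are set up. This is only a hedged sketch: the root/ibm namespace and the IBM_Cluster class name are assumptions about the SVC CIM agent and may differ between code levels; svc1 and nagios are the example host and user:

# enumerate cluster instances via the CIMOM on port TCP/5989
# (namespace root/ibm and class IBM_Cluster are assumptions, adjust to your code level)
$ /opt/sblim-wbemcli/bin/wbemcli -nl -noverify ei 'https://nagios:<password>@svc1:5989/root/ibm:IBM_Cluster'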
Generic
Optional: Enable SNMP traps to be sent to the Nagios system from each of the SVC or Storwize devices. This requires SNMPD and SNMPTT to already be set up on the Nagios system. Log in to the SVC or Storwize CLI and issue the command:
IBM_2145:svc:admin$ mksnmpserver -ip <IP address> -community public -error on -warning on -info on -port 162
Where <IP address> is the IP address of your Nagios system. Or in the SVC or Storwize WebUI navigate to: -> Settings -> Event Notifications -> SNMP -> <Enter IP of the Nagios system and the SNMPD's community string>
Verify that port UDP/162 on the Nagios system can be reached from the SVC or Storwize devices.
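A simple way to do this verification is a packet capture on the Nagios system while an event is generated on one of the devices; a minimal sketch (the capture interface is left generic):

# on the Nagios system: watch for incoming SNMP traps on UDP/162
$ tcpdump -n -i any udp port 162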
SAN Volume Controller (SVC)
For SAN Volume Controller (SVC) devices the whole setup looks like this:
Download the Nagios plugin check_ibm_svc.pl and place it in the plugins directory of your Nagios system, in this example /usr/lib/nagios/plugins/:
$ mv -i check_ibm_svc.pl /usr/lib/nagios/plugins/
$ chmod 755 /usr/lib/nagios/plugins/check_ibm_svc.pl
Adjust the plugin settings according to your environment. Edit the following variable assignments:
my %conf = ( wbemcli => '/opt/sblim-wbemcli/bin/wbemcli',
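Before wiring the plugin into Nagios, it can be run once by hand to verify that the CIMOM access works. The host name svc1 is a placeholder; the -H, -u, -p and -C options are the same ones used in the command definitions that follow:

# one-off manual check of the cluster status with the CIMOM user created above
$ /usr/lib/nagios/plugins/check_ibm_svc.pl -H svc1 -u nagios -p <password> -C Cluster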
Define the following Nagios commands. In this example this is done in the file /etc/nagios-plugins/config/check_svc.cfg:
# check SVC Backend Controller status
define command {
    command_name check_svc_bc
    command_line $USER1$/check_ibm_svc.pl -H $HOSTNAME$ -u <user> -p <password> -C BackendController
}
# check SVC Backend SCSI Status
define command {
    command_name check_svc_btspe
    command_line $USER1$/check_ibm_svc.pl -H $HOSTNAME$ -u <user> -p <password> -C BackendTargetSCSIPE
}
# check SVC MDisk status
define command {
    command_name check_svc_bv
    command_line $USER1$/check_ibm_svc.pl -H $HOSTNAME$ -u <user> -p <password> -C BackendVolume
}
# check SVC Cluster status
define command {
    command_name check_svc_cl
    command_line $USER1$/check_ibm_svc.pl -H $HOSTNAME$ -u <user> -p <password> -C Cluster
}
# check SVC MDiskGroup status
define command {
    command_name check_svc_csp
    command_line $USER1$/check_ibm_svc.pl -H $HOSTNAME$ -u <user> -p <password> -C ConcreteStoragePool
}
# check SVC Ethernet Port status
define command {
    command_name check_svc_eth
    command_line $USER1$/check_ibm_svc.pl -H $HOSTNAME$ -u <user> -p <password> -C EthernetPort
}
# check SVC FC Port status
define command {
    command_name check_svc_fcp
    command_line $USER1$/check_ibm_svc.pl -H $HOSTNAME$ -u <user> -p <password> -C FCPort
}
# check SVC FC Port statistics
define command {
    command_name check_svc_fcp_stats
    command_line $USER1$/check_ibm_svc.pl -H $HOSTNAME$ -u <user> -p <password> -C FCPortStatistics
}
# check SVC I/O Group status and memory allocation
define command {
    command_name check_svc_iogrp
    command_line $USER1$/check_ibm_svc.pl -H $HOSTNAME$ -u <user> -p <password> -C IOGroup -w $ARG1$ -c $ARG2$
}
# check SVC WebUI status
define command {
    command_name check_svc_mc
    command_line $USER1$/check_ibm_svc.pl -H $HOSTNAME$ -u <user> -p <password> -C MasterConsole
}
# check SVC VDisk Mirror status
define command {
    command_name check_svc_mirror
    command_line $USER1$/check_ibm_svc.pl -H $HOSTNAME$ -u <user> -p <password> -C MirrorExtent
}
# check SVC Node status
define command {
    command_name check_svc_node
    command_line $USER1$/check_ibm_svc.pl -H $HOSTNAME$ -u <user> -p <password> -C Node
}
# check SVC Quorum Disk status
define command {
    command_name check_svc_quorum
    command_line $USER1$/check_ibm_svc.pl -H $HOSTNAME$ -u <user> -p <password> -C QuorumDisk
}
# check SVC Storage Volume status
define command {
    command_name check_svc_sv
    command_line $USER1$/check_ibm_svc.pl -H $HOSTNAME$ -u <user> -p <password> -C StorageVolume
}
Replace <user> and <password> with name and password of the CIMOM user created above.
Define a group of services in your Nagios configuration to be checked for each SVC system:
# check sshd
define service {
    use                 generic-service
    hostgroup_name      svc
    service_description Check_SSH
    check_command       check_ssh
}
# check_tcp CIMOM
define service {
    use                 generic-service-pnp
    hostgroup_name      svc
    service_description Check_CIMOM
    check_command       check_tcp!5989
}
# check_svc_bc
define service {
    use                 generic-service-pnp
    hostgroup_name      svc
    service_description Check_Backend_Controller
    check_command       check_svc_bc
}
# check_svc_btspe
define service {
    use                 generic-service-pnp
    hostgroup_name      svc
    service_description Check_Backend_Target
    check_command       check_svc_btspe
}
# check_svc_bv
define service {
    use                 generic-service-pnp
    hostgroup_name      svc
    service_description Check_Backend_Volume
    check_command       check_svc_bv
}
# check_svc_cl
define service {
    use                 generic-service-pnp
    hostgroup_name      svc
    service_description Check_Cluster
    check_command       check_svc_cl
}
# check_svc_csp
define service {
    use                 generic-service-pnp
    hostgroup_name      svc
    service_description Check_Storage_Pool
    check_command       check_svc_csp
}
# check_svc_eth
define service {
    use                 generic-service-pnp
    hostgroup_name      svc
    service_description Check_Ethernet_Port
    check_command       check_svc_eth
}
# check_svc_fcp
define service {
    use                 generic-service-pnp
    hostgroup_name      svc
    service_description Check_FC_Port
    check_command       check_svc_fcp
}
# check_svc_fcp_stats
define service {
    use                 generic-service-pnp
    hostgroup_name      svc
    service_description Check_FC_Port_Statistics
    check_command       check_svc_fcp_stats
}
# check_svc_iogrp
define service {
    use                 generic-service-pnp
    hostgroup_name      svc
    service_description Check_IO_Group
    check_command       check_svc_iogrp!102400!204800
}
# check_svc_mc
define service {
    use                 generic-service-pnp
    hostgroup_name      svc
    service_description Check_Master_Console
    check_command       check_svc_mc
}
# check_svc_mirror
define service {
    use                 generic-service-pnp
    hostgroup_name      svc
    service_description Check_Mirror_Extents
    check_command       check_svc_mirror
}
# check_svc_node
define service {
    use                 generic-service-pnp
    hostgroup_name      svc
    service_description Check_Node
    check_command       check_svc_node
}
# check_svc_quorum
define service {
    use                 generic-service-pnp
    hostgroup_name      svc
    service_description Check_Quorum
    check_command       check_svc_quorum
}
# check_svc_sv
define service {
    use                 generic-service-pnp
    hostgroup_name      svc
    service_description Check_Storage_Volume
    check_command       check_svc_sv
}
Replace generic-service with your Nagios service template. Replace generic-service-pnp with your Nagios service template that has performance data processing enabled.
Define hosts in your Nagios configuration for each SVC device. In this example it's named svc1:
define host {
    use       svc
    host_name svc1
    alias     SAN Volume Controller 1
    address   10.0.0.1
    parents   parent_lan
}
Replace svc with your Nagios host template for SVC devices. Adjust the address and parents parameters according to your environment.
Define a hostgroup in your Nagios configuration for all SVC systems. In this example it is named svc. The above checks are run against each member of the hostgroup:
define hostgroup {
    hostgroup_name svc
    alias          IBM SVC Clusters
    members        svc1
}
Run a configuration check and, if successful, reload the Nagios process:
$ /usr/sbin/nagios3 -v /etc/nagios3/nagios.cfg
$ /etc/init.d/nagios3 reload
The new hosts and services should soon show up in the Nagios web interface.
Storwize
For Storwize devices the whole setup looks like this:
Download the Nagios plugin check_ibm_storwize.pl and place it in the plugins directory of your Nagios system, in this example /usr/lib/nagios/plugins/:
$ mv -i check_ibm_storwize.pl /usr/lib/nagios/plugins/
$ chmod 755 /usr/lib/nagios/plugins/check_ibm_storwize.pl
Adjust the plugin settings according to your environment. Edit the following variable assignments:
my %conf = ( wbemcli => '/opt/sblim-wbemcli/bin/wbemcli',
Define the following Nagios commands. In this example this is done in the file /etc/nagios-plugins/config/check_storwize.cfg:
# check Storwize RAID Array status
define command {
    command_name check_storwize_array
    command_line $USER1$/check_ibm_storwize.pl -H $HOSTNAME$ -u <user> -p <password> -C Array
}
# check Storwize Hot Spare coverage
define command {
    command_name check_storwize_asc
    command_line $USER1$/check_ibm_storwize.pl -H $HOSTNAME$ -u <user> -p <password> -C ArrayBasedOnDiskDrive
}
# check Storwize MDisk status
define command {
    command_name check_storwize_bv
    command_line $USER1$/check_ibm_storwize.pl -H $HOSTNAME$ -u <user> -p <password> -C BackendVolume
}
# check Storwize Cluster status
define command {
    command_name check_storwize_cl
    command_line $USER1$/check_ibm_storwize.pl -H $HOSTNAME$ -u <user> -p <password> -C Cluster
}
# check Storwize MDiskGroup status
define command {
    command_name check_storwize_csp
    command_line $USER1$/check_ibm_storwize.pl -H $HOSTNAME$ -u <user> -p <password> -C ConcreteStoragePool
}
# check Storwize Disk status
define command {
    command_name check_storwize_disk
    command_line $USER1$/check_ibm_storwize.pl -H $HOSTNAME$ -u <user> -p <password> -C DiskDrive
}
# check Storwize Enclosure status
define command {
    command_name check_storwize_enc
    command_line $USER1$/check_ibm_storwize.pl -H $HOSTNAME$ -u <user> -p <password> -C Enclosure
}
# check Storwize Ethernet Port status
define command {
    command_name check_storwize_eth
    command_line $USER1$/check_ibm_storwize.pl -H $HOSTNAME$ -u <user> -p <password> -C EthernetPort
}
# check Storwize FC Port status
define command {
    command_name check_storwize_fcp
    command_line $USER1$/check_ibm_storwize.pl -H $HOSTNAME$ -u <user> -p <password> -C FCPort
}
# check Storwize I/O Group status and memory allocation
define command {
    command_name check_storwize_iogrp
    command_line $USER1$/check_ibm_storwize.pl -H $HOSTNAME$ -u <user> -p <password> -C IOGroup -w $ARG1$ -c $ARG2$
}
# check Storwize Hot Spare status
define command {
    command_name check_storwize_is
    command_line $USER1$/check_ibm_storwize.pl -H $HOSTNAME$ -u <user> -p <password> -C IsSpare
}
# check Storwize WebUI status
define command {
    command_name check_storwize_mc
    command_line $USER1$/check_ibm_storwize.pl -H $HOSTNAME$ -u <user> -p <password> -C MasterConsole
}
# check Storwize VDisk Mirror status
define command {
    command_name check_storwize_mirror
    command_line $USER1$/check_ibm_storwize.pl -H $HOSTNAME$ -u <user> -p <password> -C MirrorExtent
}
# check Storwize Node status
define command {
    command_name check_storwize_node
    command_line $USER1$/check_ibm_storwize.pl -H $HOSTNAME$ -u <user> -p <password> -C Node
}
# check Storwize Quorum Disk status
define command {
    command_name check_storwize_quorum
    command_line $USER1$/check_ibm_storwize.pl -H $HOSTNAME$ -u <user> -p <password> -C QuorumDisk
}
# check Storwize Storage Volume status
define command {
    command_name check_storwize_sv
    command_line $USER1$/check_ibm_storwize.pl -H $HOSTNAME$ -u <user> -p <password> -C StorageVolume
}
Replace <user> and <password> with name and password of the CIMOM user created above.
Define a group of services in your Nagios configuration to be checked for each Storwize system:
# check sshd
define service {
    use                 generic-service
    hostgroup_name      storwize
    service_description Check_SSH
    check_command       check_ssh
}
# check_tcp CIMOM
define service {
    use                 generic-service-pnp
    hostgroup_name      storwize
    service_description Check_CIMOM
    check_command       check_tcp!5989
}
# check_storwize_array
define service {
    use                 generic-service-pnp
    hostgroup_name      storwize
    service_description Check_Array
    check_command       check_storwize_array
}
# check_storwize_asc
define service {
    use                 generic-service-pnp
    hostgroup_name      storwize
    service_description Check_Array_Spare_Coverage
    check_command       check_storwize_asc
}
# check_storwize_bv
define service {
    use                 generic-service-pnp
    hostgroup_name      storwize
    service_description Check_Backend_Volume
    check_command       check_storwize_bv
}
# check_storwize_cl
define service {
    use                 generic-service-pnp
    hostgroup_name      storwize
    service_description Check_Cluster
    check_command       check_storwize_cl
}
# check_storwize_csp
define service {
    use                 generic-service-pnp
    hostgroup_name      storwize
    service_description Check_Storage_Pool
    check_command       check_storwize_csp
}
# check_storwize_disk
define service {
    use                 generic-service-pnp
    hostgroup_name      storwize
    service_description Check_Disk_Drive
    check_command       check_storwize_disk
}
# check_storwize_enc
define service {
    use                 generic-service-pnp
    hostgroup_name      storwize
    service_description Check_Enclosure
    check_command       check_storwize_enc
}
# check_storwize_eth
define service {
    use                 generic-service-pnp
    hostgroup_name      storwize
    service_description Check_Ethernet_Port
    check_command       check_storwize_eth
}
# check_storwize_fcp
define service {
    use                 generic-service-pnp
    hostgroup_name      storwize
    service_description Check_FC_Port
    check_command       check_storwize_fcp
}
# check_storwize_iogrp
define service {
    use                 generic-service-pnp
    hostgroup_name      storwize
    service_description Check_IO_Group
    check_command       check_storwize_iogrp!102400!204800
}
# check_storwize_is
define service {
    use                 generic-service-pnp
    hostgroup_name      storwize
    service_description Check_Hot_Spare
    check_command       check_storwize_is
}
# check_storwize_mc
define service {
    use                 generic-service-pnp
    hostgroup_name      storwize
    service_description Check_Master_Console
    check_command       check_storwize_mc
}
# check_storwize_mirror
define service {
    use                 generic-service-pnp
    hostgroup_name      storwize
    service_description Check_Mirror_Extents
    check_command       check_storwize_mirror
}
# check_storwize_node
define service {
    use                 generic-service-pnp
    hostgroup_name      storwize
    service_description Check_Node
    check_command       check_storwize_node
}
# check_storwize_quorum
define service {
    use                 generic-service-pnp
    hostgroup_name      storwize
    service_description Check_Quorum
    check_command       check_storwize_quorum
}
# check_storwize_sv
define service {
    use                 generic-service-pnp
    hostgroup_name      storwize
    service_description Check_Storage_Volume
    check_command       check_storwize_sv
}
Replace generic-service with your Nagios service template. Replace generic-service-pnp with your Nagios service template that has performance data processing enabled.
Define hosts in your Nagios configuration for each Storwize device. In this example it's named storwize1:
define host {
    use       disk
    host_name storwize1
    alias     Storwize Disk Storage 1
    address   10.0.0.1
    parents   parent_lan
}
Replace disk with your Nagios host template for storage devices. Adjust the address and parents parameters according to your environment.
Define a hostgroup in your Nagios configuration for all Storwize systems. In this example it is named storwize. The above checks are run against each member of the hostgroup:
define hostgroup {
    hostgroup_name storwize
    alias          IBM Storwize Devices
    members        storwize1
}
Run a configuration check and, if successful, reload the Nagios process:
$ /usr/sbin/nagios3 -v /etc/nagios3/nagios.cfg
$ /etc/init.d/nagios3 reload
The new hosts and services should soon show up in the Nagios web interface.
Generic
If the optional step in the “Generic” section above was done, SNMPTT also needs to be configured so it can understand the incoming SNMP traps from the SVC and Storwize systems. This can be achieved by the following steps:
Download the IBM SVC/Storwize SNMP MIB matching your software version from ftp://ftp.software.ibm.com/storage/san/sanvc/.
Convert the IBM SVC/Storwize SNMP MIB definitions in SVC_MIB_<version>.MIB into a format that SNMPTT can understand:
$ /opt/snmptt/snmpttconvertmib --in=MIB/SVC_MIB_7.1.0.MIB --out=/opt/snmptt/conf/snmptt.conf.ibm-svc-710
...
Done

Total translations:      3
Successful translations: 3
Failed translations:     0
Edit the trap severity according to your requirements, e.g.:
$ vim /opt/snmptt/conf/snmptt.conf.ibm-svc-710
...
EVENT tsveETrap .1.3.6.1.4.1.2.6.190.1 "Status Events" Critical
...
EVENT tsveWTrap .1.3.6.1.4.1.2.6.190.2 "Status Events" Warning
...
Optional: Apply the following patch to the configuration to reduce the number of false positives:
--- /opt/snmptt/conf/snmptt.conf.ibm-svc-710.orig	2013-12-28 21:16:25.000000000 +0100
+++ /opt/snmptt/conf/snmptt.conf.ibm-svc-710	2013-12-28 21:17:55.000000000 +0100
@@ -29,11 +29,21 @@
 16: tsveMPNO
 17: tsveOBJN
 EDESC
+# Filter and ignore the following events that are not really warnings
+# "Error ID = 980440": Failed to transfer file from remote node
+# "Error ID = 981001": Cluster Fabric View updated by fabric discovery
+# "Error ID = 981014": LUN Discovery failed
+# "Error ID = 982009": Migration complete
 #
+EVENT tsveWTrap .1.3.6.1.4.1.2.6.190.2 "Status Events" Normal
+FORMAT tsve information trap $*
+MATCH $3: (Error ID = 980440|981001|981014|982009)
 #
+# All remaining events with this OID are actually warnings
 #
 EVENT tsveWTrap .1.3.6.1.4.1.2.6.190.2 "Status Events" Warning
 FORMAT tsve warning trap $*
+MATCH $3: !(Error ID = 980440|981001|981014|982009)
 SDESC
 tsve warning trap
 Variables:
Add the new configuration file to be included in the global SNMPTT configuration and restart the SNMPTT daemon:
$ vim /opt/snmptt/snmptt.ini
...
[TrapFiles]
snmptt_conf_files = <<END
...
/opt/snmptt/conf/snmptt.conf.ibm-svc-710
...
END

$ /etc/init.d/snmptt reload
Download the Nagios plugin check_snmp_traps.sh and place it in the plugins directory of your Nagios system, in this example /usr/lib/nagios/plugins/:
$ mv -i check_snmp_traps.sh /usr/lib/nagios/plugins/
$ chmod 755 /usr/lib/nagios/plugins/check_snmp_traps.sh
Define the following Nagios command to check for SNMP traps in the SNMPTT database. In this example this is done in the file /etc/nagios-plugins/config/check_snmp_traps.cfg:
# check for snmp traps
define command {
    command_name check_snmp_traps
    command_line $USER1$/check_snmp_traps.sh -H $HOSTNAME$:$HOSTADDRESS$ -u <user> -p <pass> -d <snmptt_db>
}
Replace <user>, <pass> and <snmptt_db> with values suitable for your SNMPTT database environment.
Add another service in your Nagios configuration to be checked for each SVC:
# check snmptraps
define service {
    use                 generic-service
    hostgroup_name      svc
    service_description Check_SNMP_traps
    check_command       check_snmp_traps
}
or Storwize system:
# check snmptraps
define service {
    use                 generic-service
    hostgroup_name      storwize
    service_description Check_SNMP_traps
    check_command       check_snmp_traps
}
Optional: Define a serviceextinfo to display a folder icon next to the Check_SNMP_traps service check for each SVC:
define serviceextinfo {
    hostgroup_name      svc
    service_description Check_SNMP_traps
    notes               SNMP Alerts
    #notes_url          http://<hostname>/nagios3/nagtrap/index.php?hostname=$HOSTNAME$
    #notes_url          http://<hostname>/nagios3/nsti/index.php?perpage=100&hostname=$HOSTNAME$
}
or Storwize system:
define serviceextinfo {
    hostgroup_name      storwize
    service_description Check_SNMP_traps
    notes               SNMP Alerts
    #notes_url          http://<hostname>/nagios3/nagtrap/index.php?hostname=$HOSTNAME$
    #notes_url          http://<hostname>/nagios3/nsti/index.php?perpage=100&hostname=$HOSTNAME$
}
system. This icon provides a direct link to the SNMPTT web interface with a filter for the selected host. Uncomment the notes_url depending on which web interface (nagtrap or nsti) is used. Replace <hostname> with the FQDN or IP address of the server running the web interface.
Run a configuration check and, if successful, reload the Nagios process:
$ /usr/sbin/nagios3 -v /etc/nagios3/nagios.cfg
$ /etc/init.d/nagios3 reload
Optional: If you're running PNP4Nagios v0.6 or later to graph Nagios performance data, you can use the PNP4Nagios templates in pnp4nagios_svc.tar.bz2 and pnp4nagios_storwize.tar.bz2 to beautify the graphs. Download the two template archives and place their contents in the PNP4Nagios template directory, in this example /usr/share/pnp4nagios/html/templates/:
$ tar jxf pnp4nagios_storwize.tar.bz2
$ mv -i check_storwize_*.php /usr/share/pnp4nagios/html/templates/
$ chmod 644 /usr/share/pnp4nagios/html/templates/check_storwize_*.php

$ tar jxf pnp4nagios_svc.tar.bz2
$ mv -i check_svc_*.php /usr/share/pnp4nagios/html/templates/
$ chmod 644 /usr/share/pnp4nagios/html/templates/check_svc_*.php
All done, you should now have a complete Nagios-based monitoring solution for your IBM SVC and Storwize systems.