bityard Blog

// IBM SVC and experiences with Real-time Compression - VMware and MS SQL Server

Following up on the previous article (IBM SVC and experiences with Real-time Compression) about our experiences with Real-time compression (RTC) in a IBM SVC environment, we did some additional and further testing with regard to RTC being used in a VMware environment hosting soley database instances of Microsoft SQL Server.

With the implementation of the seperation of all of our Microsoft SQL Server instances into an own VMware ESX cluster – which had to be done due to licensing constraints – an opportunity presented itself, to take a bit more detailed look into how well the SVCs RTC feature would perform on this particular type of workload.

Usually, the contents of a VMware datastore are by their very nature opaque to the lower levels of the storage hierarchy like the SVC which is providing the block storage for the datastores. On the one hand an advantage, this lack of transparency on the other hand also prohibits the SVC from providing a fine grained, per VM set of its usual statistics (e.g. the actual amount of storage used with thin-provisioning or RTC enabled, the amount of storage in the individual storage tier classes, etc.) and performance metrics. By not only moving the compute part of the VMs over to the new VMware ESX hosts in a seperate VMware ESX cluster, but also individually moving the storage part of one VM after the other onto new and empty datastores, which in turn were at the SVC based on newly created, empty VDisks with RTC enabled, we were able to measure the impact of RTC on a per VM level. It was a bit of tedious work and the precision of the numbers has to be taken with a grain of salt because I/O was concurrently happening on the database and operating system level, probably causing additional extends to be allocated and thus some inaccuracy to a small extent.

The environment consists of the following four datastores, four VDisks and a total number of 43 VMs running various versions of Microsoft SQL Server:

VM Datastore SVC VDisk Datastore Size (GB) Number of VMs
san-ssb-sql-oben-35C sas-4096G-35C 4096 16
san-ssb-sql-unten-360 sas-4096G-360 4096 9
san-ssb-sql-unten-361 sas-2048G-361 2048 8
san-ssb-sql-oben-362 sas-2048G-362 2048 10
Sum 12288 43

Usually the VMs are configured with one or two disks or drives for the OS and one or more disks or drives dedicated to the database contents. In some legacy cases, specifically VM #7, #10 and #12 on datastore “san-ssb-sql-oben-35C”, VM #6 on datastore “san-ssb-sql-unten-360” and VM #8 on datastore “san-ssb-sql-unten-361”, there is also a local application instance – in our case different versions of SAP – installed on the system. The operating systems used are Windows Server 2003, 2008R2 and 2012R2.

For each of the four datastores or VDisks, the results of the tests are shown in the following four sections. Each section consists of a table representation of the datastore as it was being populated with VMs in chronological order. The columns of the tables are pretty self-explainatory, with the most interesting columns being “Rel. Compressed Size”, “Rel. Compression Saving” and “Compression Ratio”. If JavaScript is enabled in your browser, click on a tables column headers to sort by that particular column. Each section also contains a graph of the absolute allocation values (table columns “VM Provisioned”, “VM Used” and “SVC Used”) per VM and a graph of the relative allocation values (values from the table column “SVC Used” divided by values from the table column “VM Used” in %) per VM. The latter ones also feature the “Compression Ratio” metric which is graphed against a second y-axis on the right-hand side of the graphs.

  • Datastore: san-ssb-sql-oben-35C
    VDisk: sas-4096G-35C

    VM VM Provisioned (GB) VM Used (GB) SVC Used (GB) SVC Delta Used (GB) Rel. Compressed Size (%) Rel. Compression Saving (%) Compression Ratio SQL Server Version Comment
    Empty Datastore 6.81 MB 6.81 MB
    Number 1 134.11 44.00 20.78 20.10 45.68 54.32 2.19 10.50.4000.0
    Number 2 144.11 55.04 48.45 28.35 51.51 48.49 1.94 11.0.5058.0
    Number 3 94.18 49.79 75.88 27.43 55.09 44.91 1.82 11.0.5058.0
    76.30 0.42
    Number 4 454.11 52.22 92.47 16.17 30.97 69.03 3.23 10.50.4000.0
    Number 5 456.11 391.14 337.74 245.27 62.71 37.29 1.59 10.50.4000.0 SharePoint with file uploads
    Number 6 238.14 156.29 450.89 113.15 72.40 27.60 1.38 10.50.6000.34 HR Applicant Management
    Number 7 488.11 304.32 660.05 209.16 68.73 31.27 1.45 10.50.4000.0 SQL Page Compression
    Number 8 348.09 259.32 703.47 43.42 16.74 83.26 5.97 9.00.4035.00
    Number 9 185.24 149.33 744.68 41.21 27.60 72.40 3.62 10.50.4000.0
    Number 10 292.12 104.31 804.14 59.46 57.00 43.00 1.75 11.0.3349.0 SQL Page Compression
    Number 11 214.11 148.73 882.44 78.30 52.65 47.35 1.90 10.50.2500.0
    Number 12 496.09 463.15 993.53 111.09 23.99 76.01 4.17 9.00.3228.00
    Number 13 202.12 176.96 1058.76 65.23 36.86 63.14 2.71 9.00.4035.00
    Number 14 442.41 363.83 1200.11 141.35 38.85 61.15 2.57 11.0.3412.0 SQL Page Compression
    Number 15 395.38 374.78 1300.00 97.96 26.14 73.86 3.83 11.0.3412.0 SQL Page Compression
    Number 16 739.31 464.30 1520.00 220.00 47.38 52.62 2.11 11.0.3412.0 SQL Page Compression
    Sum or Average 5189.63 3513.51 1497.55 42.10 57.90 2.38

    SVC RTC absolute allocation - Datastore san-ssb-sql-oben-35C; VDisk sas-4096G-35C

    SVC RTC relative allocation - Datastore san-ssb-sql-oben-35C; VDisk sas-4096G-35C

  • Datastore: san-ssb-sql-unten-360
    VDisk: sas-4096G-360

    VM VM Provisioned (GB) VM Used (GB) SVC Used (GB) SVC Delta Used (GB) Rel. Compressed Size (%) Rel. Compression Saving (%) Compression Ratio SQL Server Version Comment
    Empty Datastore 5.47 MB 5.47 MB
    Number 1 377.68 241.77 98.13 97.58 40.36 59.64 2.48 11.0.3412.0 SQL Page Compression
    Number 2 439.29 256.54 198.76 100.63 39.23 60.77 2.55 11.0.3412.0 SQL Page Compression
    Number 3 776.76 507.24 448.04 249.28 49.14 50.86 2.03 11.0.3412.0 SQL Page Compression
    Number 4 195.21 154.10 533.52 85.48 55.47 44.53 1.80 11.0.5058.0
    Number 5 34.11 23.91 546.66 13.14 54.96 45.04 1.82 9.00.4035.00
    Number 6 497.11 394.46 718.35 171.69 43.53 56.47 2.30 10.50.1702.0 SQL Page Compression
    Number 7 219.91 53.87 740.20 21.85 40.56 59.44 2.47 11.0.3000.0
    Number 8 276.11 238.67 850.77 110.57 46.33 53.67 2.16 10.50.2500.0
    Number 9 228.11 188.26 986.25 135.48 71.96 28.04 1.39 10.50.2500.0 HR Applicant Management
    Sum or Average 3044.29 2058.82 986.25 47.90 52.10 2.09

    SVC RTC absolute allocation - Datastore san-ssb-sql-unten-360; VDisk sas-4096G-360

    SVC RTC relative allocation - Datastore san-ssb-sql-unten-360; VDisk sas-4096G-360

  • Datastore: san-ssb-sql-unten-361
    VDisk: sas-2048G-361

    VM VM Provisioned (GB) VM Used (GB) SVC Used (GB) SVC Delta Used (GB) Rel. Compressed Size (%) Rel. Compression Saving (%) Compression Ratio SQL Server Version Comment
    Empty Datastore 6.66 MB 6.66 MB
    Number 1 188.11 162.98 51.46 51.45 31.57 68.43 3.17 10.50.4000.0
    Number 2 180.31 130.82 90.79 39.33 30.06 69.94 3.33 10.50.4000.0
    Number 3 178.11 142.63 131.17 40.38 28.31 71.69 3.53 10.50.4000.0
    Number 4 156.09 127.26 177.30 46.13 36.25 63.75 2.76 9.00.4035.00
    Number 5 100.86 90.40 202.76 25.46 28.16 71.84 3.55 9.00.4035.00
    205.91 3.15
    Number 6 139.09 79.17 233.85 27.94 35.29 64.71 2.83 11.0.5058.0
    Number 7 34.11 24.16 246.37 12.52 51.82 48.18 1.93 9.00.4035.00
    Number 8 696.32 289.09 402.07 155.07 53.86 46.14 1.86 11.0.5058.0 SQL Page Compression
    Sum or Average 1673.00 1046.51 402.07 38.42 61.58 2.60

    SVC RTC absolute allocation - Datastore san-ssb-sql-unten-361; VDisk sas-2048G-361

    SVC RTC relative allocation - Datastore san-ssb-sql-unten-361; VDisk sas-2048G-361

  • Datastore: san-ssb-sql-oben-362
    VDisk: sas-2048G-362

    VM VM Provisioned (GB) VM Used (GB) SVC Used (GB) SVC Delta Used (GB) Rel. Compressed Size (%) Rel. Compression Saving (%) Compression Ratio SQL Server Version Comment
    Empty Datastore
    Number 1 218.09 37.19 19.93 19.93 53.59 46.41 1.87 9.00.4035.00
    Number 2 122.12 112.53 62.32 42.39 37.67 62.33 2.65 10.50.1600.1
    Number 3 273.62 77.70 97.04 34.72 44.68 55.32 2.24 10.50.4000.0
    Number 4 143.09 37.98 116.47 19.43 51.16 48.84 1.95 11.0.5058.0
    Number 5 137.43 115.29 157.45 40.98 35.55 64.45 2.81 10.50.4000.0
    Number 6 80.09 76.04 178.76 21.31 28.02 71.98 3.57 9.00.4035.00
    182.30 3.54
    Number 7 136.09 117.59 229.95 47.65 40.52 59.48 2.47 9.00.4035.00
    Number 8 194.11 53.55 253.40 23.45 43.79 56.21 2.28 10.50.2500.0
    Number 9 78.11 52.60 273.07 19.67 37.40 62.60 2.67
    Number 10 53.11 31.42 287.33 14.26 45.39 54.61 2.20 10.50.4000.0
    Sum or Average 1435.86 711.89 283.82 40.36 59.64 2.48

    SVC RTC absolute allocation - Datastore san-ssb-sql-oben-362; VDisk sas-2048G-362

    SVC RTC relative allocation - Datastore san-ssb-sql-oben-362; VDisk sas-2048G-362

From the detailed tabular representation of each datastore on a per VM basis that is shown above, an aggregated and summarized view, shown in the table below was created. The columns of the table are again pretty self-explainatory. Some of the less interesting columns from the detailed tables above have been dropped in the aggregated result table. The most interesting columns are again “Rel. Compressed Size”, “Rel. Compression Saving” and “Compression Ratio”. Again, if JavaScript is enabled in your browser, click on a tables column headers to sort by that particular column.

VM Datastore SVC VDisk VM Provisioned (GB) VM Used (GB) SVC Used (GB) Rel. Compressed Size (%) Rel. Compression Saving (%) Compression Ratio
san-ssb-sql-oben-35C sas-4096G-35C 5189.63 3513.51 1497.55 42.10 57.90 2.38
san-ssb-sql-unten-360 sas-4096G-360 3044.29 2058.82 986.25 47.90 52.10 2.09
san-ssb-sql-unten-361 sas-2048G-361 976.68 757.42 243.21 32.53 67.47 3.07
san-ssb-sql-oben-362 sas-2048G-362 1435.86 711.89 283.82 40.36 59.64 2.48
Sum or Average 10646.46 7041.64 3010.83 42.76 57.24 2.34

On the whole, the compression results achived with RTC at SVC level are quite good. Considering the fact that VMware already achieved a sizeable amount of storage reduction through its own thin-provisioning algorithms, the average RTC compression ratio of 2.34 – with a minimum of 1.38 and a maximum of 5.97 – at the SVC is even more impressive. This means that on average only 42.76% of the storage space allocated by VMware is actually used on the SVC level to store the data. Taking VMware thin-provisioning into account, the relative amount of actual storage space needed sinks even further down to 28.28%, or in other words a reduction by a factor of 3.54.

Looking at the individual graphs offers several other interesting insights, which can be indicators for further investigations on the database layer. In the graphs showing the relative allocation values and titled “SVC Real-Time Compression Relative Storage Allocation and Compression Ratio on VMware with MS SQL Server”, those data samples with a high “Rel. Compressed Size” value (purple bars) or a low “Compression Ratio” value (red line) are of particular interest in this case. Selecting those systems from the above results, with an – arbitrarily choosen – value of well below 2.00 for the “Compression Ratio” metric, gives us the following list of systems to take a closer look at:

Sample Datastore / VDisk VM # VM Provisioned (GB) VM Used (GB) SVC Delta Used (GB) Compression Ratio Comment
1 san-ssb-sql-oben-35C sas-4096G-35C 3 94.18 49.79 27.43 1.82 SQLIO benchmark file
2 5 456.11 391.14 245.27 1.59 SharePoint with file uploads
3 6 238.14 156.29 113.15 1.38 HR Applicant Management
4 7 488.11 304.32 209.16 1.45 SAP and MSSQL (with SQL Page Compression) on the same VM
5 10 292.12 104.31 59.46 1.75 SAP and MSSQL (with SQL Page Compression) on the same VM
6 san-ssb-sql-unten-360 sas-4096G-360 4 195.21 154.10 85.48 1.80 SQLIO benchmark file
7 5 34.11 23.91 13.14 1.82
8 9 228.11 188.26 135.48 1.39 HR Applicant Management
9 san-ssb-sql-unten-361 sas-2048G-361 7 34.11 24.16 12.52 1.93
10 8 696.32 289.09 155.07 1.86 SAP and MSSQL (with SQL Page Compression) on the same VM
11 san-ssb-sql-oben-362 sas-2048G-362 1 218.09 37.19 19.93 1.87
12 4 143.09 37.98 19.43 1.95

The reasons for why those particular systems are exhibiting a subaverage compression ratio could be manifold. Together with our DBAs we went over the list and came up with the following, not exhaustive nor exclusive list of explainations which are already hinted in the “Comment” columns of the above table:

  • Samples 1 and 6: Those systems were recently used to test the influence of 4k vs. 64k NTFS cluster size on the database I/O performance. Over the course of these tests, a number of dummy files were written with the SQLIO benchmark tool. Altough the dummy files were deleted after the tests were concluded, the storage blocks remained allocated with apparently less compressible contents.

  • Samples 2, 3 and 8: Those systems host the databases for SharePoint (sample #2) and a HR/HCM applicant management and employee training management system (samples #3 and #8). Both applications seem to be designed or configured to store files uploaded to the application as binary blobs into the database. Both usecases suggest that the uploaded file data is to some extent already in some kind of compressed format (e.g. xlsx, pptx, jpeg, png, mpeg, etc.), thus limiting the efficiency of any further compression attempt like RTC. Storing large amounts of unstructured data together with structured data in a database is subject of frequent, ongoing and probably never ending, controversial discussions. There are several points of view to this question and good arguments are being made from either side. Personally and from a purely operational point of view, i'd favour the unstructured binary data not to be stored in a structured database.

  • Sample 4: This system hosts the legacy design described above, where a local application instance of SAP is installed along Microsoft SQL Server on the same system. The database tables of this SAP release already use the SQL Server page compression feature. Although this could have very well been the cause for the subaverage compression ratio, a look into the OS offered a different plausible explanation. Inside the OS the sum of space allocated to the various filesystems is only approximately 202 GB, while at same time the amount of space allocated by VMware is a little more than 304 GB. It seems that at one point in time there was a lot more data of unknown compressibility present on the system. The data has since been deleted, but the previously allocated storage blocks have not been properly reclaimed.

    In order to put this theory to test, the procedure described in VMware KB 2004155 to reclaim the unused, but not de-allocated storage blocks, was applied. The following table shows the storage allocation for the system in sample #4 before and after the reclaim procedure:

    Sample Datastore / VDisk VM # VM Provisioned (GB) VM Used (GB) SVC Delta Used (GB) Compression Ratio Comment
    4 san-rtc-ssb-sql-unten-366 sas-1024G-366 7 488.11 304.32 209.16 1.45 SAP and MSSQL (with SQL Page Compression) on the same VM - before reclaiming unused storage space
    4 7 488.11 172.53 81.29 2.12 SAP and MSSQL (with SQL Page Compression) on the same VM - after reclaiming unused storage space

    The numbers show that the above assumption was correct. After zeroing unused disk space within the VM and reclaiming unused storage blocks by re-thin-provisioning the virtual disks in VMware, the RTC compression ratio rose from a meager 1.45 to a near average 2.12. Now, only 47.11% – instead of the previous 68.73% – of the storage space allocated by VMware is actually used on the SVC level. Again, taking VMware thin-provisioning into account, the relative amount of actual storage space needed sinks even further down to 16.65%, or in other words a reduction by a factor of 6.00.

  • Samples 5 and 10: These systems basically share the same circumstances as the previously examined system in sample #4. The only difference is, that there is still some old, probably unused data sitting in the filesystems. In case of sample #5 it's about 20 GB of compressed SAP installation media and in case of sample #10 it's about 84.3 GB of partially compressed SAP installation media and SAP SUM update files. Of course we'd like to also reclaim this storage space and – hopefully – in the course of this some deleted, but up to now not de-allocated storage blocks, too. By this we hope to see equally good end results as in the case of sample #4. We're currently waiting on clearance from our SAP team to dispose of some of the old and probably unused data.

  • Samples 7, 9, 11 and 12: Systems in this sample category are probably best described as just being “small”. To better illustrate what is meant by this, the following table shows a compiled view of these four samples. The data was taken from the detailed tabular representation of each datastore on a per VM basis that is shown above:

    Sample Datastore / VDisk VM # VM Provisioned (GB) VM Used (GB) SVC Delta Used (GB) Rel. Compressed Size (%) Rel. Compression Saving (%) Compression Ratio Windows Version
    7 san-ssb-sql-unten-360 sas-4096G-360 5 34.11 23.91 13.14 54.96 45.04 1.82 2003
    9 san-ssb-sql-unten-361 sas-2048G-361 7 34.11 24.16 12.52 51.82 48.18 1.93 2003
    11 san-ssb-sql-oben-362 sas-2048G-362 1 218.09 37.19 19.93 53.59 46.41 1.87 2012 R2
    12 4 143.09 37.98 19.43 51.16 48.84 1.95 2012 R2

    Inside the systems there is very little user or installation data besides the base Windows operating system, the SQL Server binaries and some standard management tools which are used in our environment. VMware thin-provisioning already does a significant amount of space reduction. It reduces the storage space that would be used on the SVC level to approximately 24 GB for Windows 2003 and approximately 37.5 GB for Windows 2012 R2 systems. Although the number of samples is rather low, the almost consistent values within each Windows release category suggest that these values represent – in our environment – the minimal amount of storage space needed for each Windows release. The compressed size of approximately 12.8 GB for Windows 2003 and approximately 19.7 GB for Windows 2012 R2 systems as well as the average compression ratio of approximately 1.89 seem to support this theory in the way that they show very similar values within each Windows release category. It appears that – in our environment – these values are the bottom line with regard to minimal allocated storage capacity. There simply seems not to be enough other data with a good compressability in those systems in order to achieve better results with regard to the overall compression ratio.

By and large, we're quite pleased with the results we're seeing from the SVCs real-time compression feature in our VMware and MS SQL server environment. The amount of storage space saved is – especially with VMware thin-provisioning taken into account – significant. Even in those cases where the SQL servers page compression feature is heavily used, we're still seeing quite good compression results with the use of the additional real-time compression at the SVC level. In part, this is very likely due to the fact that real-time compression at the SVC level also covers the VMs data that is outside of the scope which is covered by the SQL servers page compression. On the other hand, this does not entirely suffice to explain the amount of saved storage space – between 50.86% and 73.86%, if we disregard some special cases for which the reasons of a low compression ratio were discussed above – we're seeing in cases where SQL server page compression is used. From the data collected and shown above, it would appear that the algorithms used in SQL server page compression still leave enough redundant data of low entropy for the algorithms used in the RACE at the SVC level to perform rather well.

In general and with regard to performance in particular we have – up to now – not noticed any negative side effects, like e.g. noticeable increases in I/O latency, by using RTC for the various SQL server and SAP systems shown above.

The use of RTC not only promises, but quite matter of factly offers a significant reduction of the storage space needed. Even with seemingly already compressed workloads, like e.g. SQL server page compressed databases, it deals very well. It thus enables a delayed procurement of additional storage resources, lower operational costs (administration, rack space, power, cooling, etc.), a reduced amount of I/O to the backend storage systems and a more efficient use of tier-1 storage like e.g. flash based storage systems.

It is on the other hand also no out-of-the-box, fire-and-forget solution when used sensibly. The selected examples of “interesting” systems, which were shown and discussed above, illustrates very well that there are always usecases which require special attention. They also point out the increased necessity of a good overall, interdisciplinary technical knowledge or a very good and open communication between the organisational units responsible for application, database, operating system, virtualization and storage administration.

Due to the rather old release level of our VMware environment we unfortunately weren't able to cover the interaction of TRIM, UNMAP, VAAI and RTC. It'd be very interesting to see if and how well those storage block reclaimation technologies work together with the SVCs real-time compression feature.

Comments and own experiences are – as always – very welcome!

Leave a comment…

A G E U D
  • E-Mail address will not be published.
  • Formatting:
    //italic//  __underlined__
    **bold**  ''preformatted''
  • Links:
    [[http://example.com]]
    [[http://example.com|Link Text]]
  • Quotation:
    > This is a quote. Don't forget the space in front of the text: "> "
  • Code:
    <code>This is unspecific source code</code>
    <code [lang]>This is specifc [lang] code</code>
    <code php><?php echo 'example'; ?></code>
    Available: html, css, javascript, bash, cpp, …
  • Lists:
    Indent your text by two spaces and use a * for
    each unordered list item or a - for ordered ones.
This website uses cookies. By using the website, you agree with storing cookies on your computer. Also you acknowledge that you have read and understand our Privacy Policy. If you do not agree leave the website. More information about cookies