How Virtual Machine data is stored and managed within an HPE SimpliVity Cluster – Part 6 – Calculating backup capacity

So far in this series, we have discussed how the HPE SimpliVity platform intelligently places VMs and their associated data containers, and how those data containers are automatically provisioned and distributed across a cluster of nodes.

I have talked about data containers in relation to virtual machines: data containers are constructs that hold all data (VM configuration and flat files) associated with a virtual machine, as illustrated below.

[Figure: data container contents]


But what about backups? How are they stored and managed? First, let's briefly look at the backup mechanism and types.

HPE SimpliVity Backups

HPE SimpliVity backups are independent snapshots of a virtual machine's data container (at the time of backup) and are stored independently in the HPE SimpliVity Object Store.

They are not mounted to the NFS datastore until needed. These backups are stored in HA pairs and are kept on the same nodes as the virtual machine’s primary and secondary data containers to maximize de-duplication of data.

[Figure: backups]

This post will not explain how backups are logically independent; that concept would require a separate post in itself and leads to other capacity-management questions I am often asked. In my next post I will explain how data containers are logically independent from each other, which will lead us naturally into space-reclamation scenarios.

HPE SimpliVity Backup Mechanism

The snapshot process is handled natively by the HPE SimpliVity platform:

  1. A point-in-time data container (snapshot) is created locally.
  2. This is done by creating a new reference to the source data container and incrementing reference counts in the Object Store.
  3. No data writes are incurred, just metadata and index update operations.
  4. The extra reference allows for full de-duplication.
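The steps above can be sketched as a toy reference-counting model. This is a conceptual illustration only; the class and method names are hypothetical and this is not HPE SimpliVity's actual implementation:

```python
# Toy model of a metadata-only, reference-counted snapshot.
# All names here are hypothetical illustrations, not SimpliVity internals.

class ObjectStore:
    """Tracks how many containers reference each deduplicated data object."""
    def __init__(self):
        self.refcounts = {}

    def add_reference(self, obj_id):
        self.refcounts[obj_id] = self.refcounts.get(obj_id, 0) + 1

class DataContainer:
    def __init__(self, store, objects):
        self.store = store
        self.objects = list(objects)   # ids of the data objects it references
        for obj in self.objects:
            store.add_reference(obj)

    def snapshot(self):
        # A backup is just a new container referencing the same objects:
        # only reference counts (metadata) change; no data is rewritten.
        return DataContainer(self.store, self.objects)

store = ObjectStore()
vm = DataContainer(store, ["a", "b", "c"])   # the VM's primary data container
backup = vm.snapshot()                       # point-in-time backup
print(store.refcounts)                       # every object now has 2 references
```

The key point the model captures is that taking the backup touches only reference counts, which is why no data I/O is incurred.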

Backup Types

The HPE SimpliVity platform supports three flavors of backup, and regardless of the backup type the end result is always the same: an HPE SimpliVity snapshot is taken.

Crash consistent – the standard HPE SimpliVity backup. This is a snapshot of the data container at the time the backup is taken. It consists of only metadata and index update operations; no I/O to disk is incurred, making it nearly instantaneous.

Application consistent – this method leverages VMware Tools to first quiesce the VM prior to taking an HPE SimpliVity snapshot (the default for the Application Consistent option). Select the Application Consistent option when creating or editing a policy rule.

  • Firstly, a VMware snapshot is taken and outstanding I/O is flushed to disk via VMware Tools commands. The VM is now running off a snapshot.
  • A point-in-time HPE SimpliVity snapshot is taken of the data container (a standard HPE SimpliVity backup).
vCenter tasks highlighting the VMware snapshot process (note: this process completes before the HPE SimpliVity snapshot is taken)

Note: the restored VM will also contain the VMware snapshot file (as the HPE SimpliVity snapshot is taken after the VMware snapshot). The VM should therefore be reverted prior to powering on to ensure it is fully consistent, as any I/O performed after the VMware snapshot was taken was not flushed.

Remember to revert any VMware Snapshots when restoring a HPE SimpliVity Application Consistent Backup

Application consistent with Microsoft VSS – Standard HPE SimpliVity Backup with Microsoft Volume Shadow Copy Service (VSS)

MS-VSS uses application-specific writers to freeze an application and its databases, ensuring that all data transferring from one file to another is captured and consistent before the HPE SimpliVity backup is taken.

SQL Server 2008 R2, 2012, 2014, and 2016 are supported VSS applications, and Windows Server 2008 R2, 2012 R2, and 2016 are supported VSS operating systems.

Guidelines and full steps on how to enable (and restore) VSS backups are outlined in the HPE SimpliVity administration guides

Select the Microsoft VSS option when opting for application-consistent backups (when creating or editing a policy rule)

Save Windows credentials to the virtual machine(s) that use the policy using the SimpliVity "Save Credentials for VSS" option. The credentials you provide should also be added to the local machine's Administrators group.

These credentials are used to copy and run VSS scripts to freeze the application and databases before the HPE SimpliVity backup is taken.

Backup Consumption – Viewing Cluster Capacity

To view cluster capacity navigate to the Cluster object and under the Monitor tab, choose HPE SimpliVity Capacity.

Within the GUI, physical backup consumption is not split out from overall physical consumption; i.e. physical consumption is a combination of virtual machine data, local backup data, and remote backup data.

Logical consumption is also listed; you can think of logical consumption as the physical space that would be required to store all backups with no de-duplication.

Within the GUI it is possible to view individual unique backup sizes: select any HPE SimpliVity backup and, under backup Actions, choose the Calculate Unique Backup Size option.

Once complete, scroll to the right of the backup screen to the Unique Backup column to view the result.

This method can be useful; however, individual node consumption figures are not listed within the vSphere plugin / GUI. If you wish to view physical consumption information per node, you will need to resort to the CLI.

Backup Consumption – viewing individual node utilization

The command dsv-backup-size-estimate can be a useful tool if you are looking to break out individual node backup consumption figures. It lists total node on-disk consumption, plus local backup and remote backup on-disk consumption figures for all backups contained on the node.

The command has many options: the --node switch lists all backups contained on the node, while --vm along with --datastore allows you to specify an individual VM if required. The command can be run from any node to interrogate itself or any other node.

Listed below is the output when running the command with the --node switch (shortened to show just two virtual machines):

The bottom of the output displays node usage statistics. For OmniStackVC-10-6-7-225 in this example, total node on-disk consumed data is 625.05 GB.

This comprises Data (on-disk VM consumption) of 142.17 GB + local backup consumption of 268.15 GB + remote backup consumption of 214.73 GB.

root@omnicube-ip7-203:/home/administrator@vsphere# dsv-backup-size-estimate --node OmniStackVC-10-6-7-225

.---------------------------------------------------------------------------------------------------------------------.
| Name                                          | Bytes    | Local   | Offnode | Heuristic | Backup                   |
|                                               | Written  | Size    | Size    |           | Time                     |
+-----------------------------------------------+----------+---------+---------+-----------+--------------------------+
| Windows      (TEST-DS)                        |          |         |         |           |                          |
+-----------------------------------------------+----------+---------+---------+-----------+--------------------------+
| Windows-backup-2019-06-19T11:11:46+01:00      |  12.06GB | 12.06GB |         |    2.24GB | Wed Jun 19 10:14:41 2019 |
| Windows-backup-2019-06-19T11:12:10+01:00      | 360.37KB | 12.06GB |         |   66.85KB | Wed Jun 19 10:15:06 2019 |
| Test                                          |   2.79GB | 12.06GB |         |  529.13MB | Mon Jun 24 05:07:14 2019 |
+-----------------------------------------------+----------+---------+---------+-----------+--------------------------+
| VM Total Backups: 3 of 3                      |  14.84GB | 36.18GB |      0B |    2.75GB |                          |
| = local 3 + offnode 0, unknown 0              |          |         |         |           |                          |
'-----------------------------------------------+----------+---------+---------+-----------+--------------------------'


.----------------------------------------------------------------------------------------------------------------------.
| Name                                          | Bytes    | Local    | Offnode | Heuristic | Backup                   |
|                                               | Written  | Size     | Size    |           | Time                     |
+-----------------------------------------------+----------+----------+---------+-----------+--------------------------+
| ELK_Ubutnu (DR_1TB)                           |          |          |         |           |                          |
+-----------------------------------------------+----------+----------+---------+-----------+--------------------------+
| 2019-06-20T00:00:00+01:00                     | 133.88GB | 133.88GB |         |   24.33GB | Wed Jun 19 23:00:00 2019 |
| 2019-06-21T00:00:00+01:00                     |  98.91GB | 134.27GB |         |   17.97GB | Thu Jun 20 23:00:00 2019 |
+-----------------------------------------------+----------+----------+---------+-----------+--------------------------+
| VM Total Backups: 2 of 2                      | 232.79GB | 268.15GB |      0B |   42.30GB |                          |
| = local 2 + offnode 0, unknown 0              |          |          |         |           |                          |
'-----------------------------------------------+----------+----------+---------+-----------+--------------------------'

For node OmniStackVC-10-6-7-225 (32f62c42-6b8f-8529-c6a5-9db972eee181) in cluster 'DR Cluster':
- Total Node Data:  625.05GB = Data: 142.17GB  + Local backup: 268.15GB + Remote backup: 214.73GB
- Compressed:        113.58GB, Uncompressed: 207.95GB
- Comp  ratio:       0.54617
- Dedup ratio:       0.33269
- Total number of backups: 15
- Number of backups shown: 15 = Local 15 + Offnode 0, unknown 5
- Total backup bytes:      482.88GB = local 482.88GB + offnode 0B
- Gross heuristic estimate: 59.60GB
root@omnicube-ip7-203:/home/administrator@vsphere#
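As a quick sanity check, the summary figures at the bottom of the output are internally consistent. The snippet below reproduces the arithmetic; note that the way the two ratios are derived is my inference from the numbers shown, not documented behaviour:

```python
# Summary figures from the dsv-backup-size-estimate output above (all GB).
data          = 142.17   # on-disk VM consumption ("Data")
local_backup  = 268.15   # "Local backup"
remote_backup = 214.73   # "Remote backup"
compressed    = 113.58
uncompressed  = 207.95

# "Total Node Data" is simply the sum of the three consumption categories.
total = data + local_backup + remote_backup
print(round(total, 2))   # 625.05

# The reported ratios appear to be derived from the figures above
# (an inference from the numbers shown, not documented behaviour):
comp_ratio  = compressed / uncompressed    # ~0.546   (reported: 0.54617)
dedup_ratio = uncompressed / total         # ~0.33269 (reported: 0.33269)
print(round(comp_ratio, 3), round(dedup_ratio, 5))
```

In other words, the displayed ratios fall out of compressed/uncompressed and uncompressed/total, with small differences down to rounding in the displayed values.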

Backup Consumption – viewing individual backup consumption

Focusing on the virtual machine backup consumption output above, I have chosen two very different types of virtual machines for comparison purposes.

Virtual machine 'Windows' contains in-guest data that de-duplicates very well, while virtual machine 'ELK_Ubutnu' contains a database workload that generates a high amount of unique data, so each of its individual backups will contain a larger amount of unique data. Let's compare the figures.

For each individual virtual machine, four main columns are listed; however, for on-disk consumption purposes we really only care about the Heuristic values.

  • Bytes Written – a combination of the unique on-disk virtual machine size and the unique backup size at the time of backup. For the very first backup this captures the unique size of the VM; however, no I/O is actually incurred, so think of it as logical bytes written. This can be a useful value for calculating the unique on-disk consumption of a virtual machine.
  • Local Size – logical backup size (as exposed in the GUI).
  • Off-node Size – whether the backup is local or remote.
  • Heuristic – an approximation of the unique on-disk data associated with this backup and the VM since the time of backup. This value is cumulative from when the backup was first taken.

Virtual Machine – 'Windows'

Looking at the values for this virtual machine’s backups we can deduce the following.

  1. The approximate unique size of this VM was originally 12 GB. If we were to delete all backups associated with this virtual machine, approx 12 GB of data would still be consumed on this node.
  2. Since the first backup was taken, approx 2.75GB of unique data is associated with the virtual machine and its backups.
  3. Deleting all backups would reclaim no more than 2.75 GB of data; however, as that data is shared with the virtual machine since the time of backup, less may be reclaimed in reality. To obtain a truer value, locate the backup within the GUI and calculate its unique size as outlined earlier in this post.

Virtual Machine – ELK_Ubuntu

Virtual machine ELK_Ubuntu contains much more unique data. This is reflected in its individual backup figures of 24 GB and 17 GB (totaling 42 GB).

  1. The approximate unique size of this VM was originally 133 GB. If we were to delete all backups associated with this virtual machine, approx 133 GB of data would still be consumed on this node.
  2. Since the first backup was taken, approx 42 GB of unique data is associated with the virtual machine and its backups.
  3. Deleting all backups would reclaim no more than 42 GB of data
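As a cross-check on the per-VM totals, the heuristic column sums as expected. From the numbers, the table appears to use 1024-based units and to truncate displayed values to two decimals (both are my assumptions inferred from the output):

```python
# Heuristic values from the dsv-backup-size-estimate tables above.
# Assumes 1024-based units (inferred from the output, not documented).
GB = 1.0
MB = GB / 1024
KB = MB / 1024

# 'Windows' VM: three backups with heuristics 2.24GB, 66.85KB, 529.13MB
windows = 2.24 * GB + 66.85 * KB + 529.13 * MB
print(round(windows, 2))   # ~2.76; displayed (truncated) as 2.75GB in the table

# ELK VM: two backups with heuristics 24.33GB and 17.97GB
elk = 24.33 * GB + 17.97 * GB
print(round(elk, 2))       # 42.3, matching the "VM Total Backups" row (42.30GB)
```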

The command dsv-backup-size-estimate can be a useful tool; however, bear in mind that it is a point-in-time calculation and should be treated as such. As data is written to and deleted from the Object Store, de-duplication rates may grow or shrink, as may the data associated with the original virtual machine.

I find these values useful for gaining an approximation of virtual machine and associated backup consumption, and from there working within the GUI to identify backups that may be starting to grow too large!

