VCDX #200 Blog of one VMware Infrastructure Designer: Storage DRS Design Considerations

This blog post follows blog post "VMware vSphere SDRS - test plan of SDRS initial placement" and summarizes several facts having an impact on SDRS design decisions. If you want to see results of several SDRS tests I did in my home lab read my previous blog post.

SDRS design considerations:

SDRS Initial Placement algorithm does NOT take VM swap file capacity into account. However, Subsequent Rebalance Calculations are based on free space on particular datastores therefore if Virtual Machines are in PowerOn state and swap files exist then it is considered because VM swap file usage will be decreased from datastore total free space and it has an impact on space load. Space load formula is [space load] = [total consumed space on the datastore] / [datastore capacity]
Storage Space threshold is just a threshold (soft limit) used by SDRS for balancing and defragment. It is not a hard limit. SDRS is trying to keep free space on datastores based on space threshold but SDRS doesn't guarantee you will have always some amount of free space in datastores. See. Test-3 in my SDRS test plan here [6].
SDRS defragmentation works but there can be some cases when initial placement fails even the storage was freed up and there will be free continuous space in some datastore after defragmentation. See. Test-2 in my SDRS test plan here [6]. That's up to the component which does provisioning. It is important to understand how provisioning to datastore cluster really works. A lot of people think that Datastore Cluster behaves like a giant datastore. It is true from high-level view (abstracted view) but in reality Datastore Cluster is nothing else then just a group of single datastores where SDRS is "just" a scheduler on top of Datastore Cluster datastores. You can imagine a scheduler as a placement engine which prepares placement recommendations for initial placement and continuous balancing. That means that other software components (C# Client, Web Client, PowerCLI, vRealize Automation, vCloud Director, etc) are responsible for initial placement provisioning and SDRS give them recommendations where is the best place to put a new storage object (vmdk file or VM config file). Here [8] is the proof of my statement.
When SDRS is configured to consider I/O metrics for load balancing then it is considered also during initial placement. Please, do not mix up SDRS I/O metrics and SIOC. These are two different things even SDRS I/O metrics are leveraging normalized latency calculated by SIOC.
Q: Do I need to use SDRS I/O metrics for load balancing? A: It depends on your physical storage system (disk array). If you have disk array with modern storage architecture then you will have most probably all datastores (aka LUNs, volumes) on single physical disk pool. In that case, it doesn't make sense to load balance (do storage vMotion in case of I/O contention) between datastores because it will always end up in same physical spindles anyway and on top of that it will generate additional storage workload. The same is true for initial placement. If you have your datastores on different physical spindles then it can help. This is typically used on storage systems using RAID groups which is not very common nowadays.
SDRS calculation is done on vmdks, not the whole Virtual Machine. But affinity rules (keep together) tend to keep the vmdks together making it similar behavior as if the was VM. By default, virtual machine files are kept together in the working directory of the virtual machine. If the virtual machine needs to be migrated, all the files inside the virtual machines’ working directory are moved. However, if the default affinity rule is disabled (see. Screenshot 1), Storage DRS will move the working directory and virtual disks separately allowing Storage DRS to distribute the virtual disk files on a more granular level.
Q: When and how often is SDRS rebalancing kicked in? A: Rebalancing happens 1) at regular interval (default 8 hours - it can be changed see Screenshot 2); 2) when threshold violation is detected like above; 3) user requests a configuration change 4) API call like clicking run SDRS via a client.
Multiple VM provisioning can behave differently - less deterministically - because of other SDRS calculation factors (I/O load, capacity usage trend) and also because of particular provisioning workflow and exact timing when SDRS recommendation is called and when datastore space is really consumed. Recall that datastore reported free capacity is one of the main factors for next SDRS recommendations.
From vSphere 6.0 SDRS is integrated with other relevant technologies like SDRS VASA awareness (array-based thin-provisioning, deduplication, auto-tiering, snapshot, replication), Site Recovery Manager (consistency groups and protection groups considerations), vSphere Replication (replica placement recommendations), Storage Policy Based Management (moves just between same compliant VM storage policies). For further details read reference [9].

The general solution to overcome challenges highlighted above (architects call it risk mitigation) is to have "enough" free space on each datastore. Enough free storage space per datastore gives some more flexibility to SDRS algorithm. What is "enough" and how to achieve it is out of the scope of this blog post but think about following hints

Thin Provisioning on physical storage. See. Frank Denneman's note here [2] about thin provisioning alarm which should be considered by SDRS algorithm. I'm writing should because I had no chance to test it.
VVOLs, VSAN - I believe that one big VVOLs datastore (storage container) eliminate the need for datastore free space considerations.

But don't forget that there are always other considerations with potential impacts to your specific design so think holistically and use critical thinking during all design decisions.

Last but not least - I highly encourage you to study carefully the book "VMware vSphere Clustering Deepdive 5.1" to be familiar with basic SDRS algorithm and terminology.

Screenshots: