Sunday, July 13, 2014

Heads Up! VMware virtual disk IOPS limit bad behavior in VMware ESX 5.5

I've been informed about strange behavior of VM virtual disk IOPS limits by one of my customers, for whom I did a vSphere design recently. If you don't know how VM vDisk IOPS limits can be useful in some scenarios, read my other blog post - "Why use VMware VM virtual disk IOPS limit?". And because I designed this technology for several of my customers, they are heavily impacted by the bad vDisk IOPS limit behavior in ESX 5.5.

I've tested VM IOPS limits in my lab to see it for myself. Fortunately I have two labs: an older vSphere 5.0 lab with Fibre Channel Compellent storage and a newer vSphere 5.5 lab with EqualLogic iSCSI storage. First of all, let's look at how it works in ESX 5.0. ESX 5.1 behaves the same way, and this behavior makes perfect sense.

By default, VM vDisks don't have any limits, as seen in the next screenshot.
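If you prefer to verify this from a script rather than from the UI, below is a minimal pyVmomi (VMware's Python SDK) sketch that prints the IOPS limit of every virtual disk of a VM. It is just an illustration, not part of my original test: the vCenter address, credentials and VM name are placeholders, and a limit of -1 (or an unset allocation) means unlimited.

from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

# Placeholders - use your own vCenter/ESXi address and credentials
# (add an sslContext argument if the server uses a self-signed certificate)
si = SmartConnect(host='vcenter.example.com', user='administrator', pwd='***')
try:
    content = si.RetrieveContent()
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.VirtualMachine], True)
    vm = next(v for v in view.view if v.name == 'TestVM')  # placeholder VM name
    for dev in vm.config.hardware.device:
        if isinstance(dev, vim.vm.device.VirtualDisk):
            alloc = dev.storageIOAllocation
            # limit is -1 (or the allocation is unset) when the vDisk is unlimited
            print(dev.deviceInfo.label, 'IOPS limit:',
                  alloc.limit if alloc else 'unlimited')
finally:
    Disconnect(si)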


When I run IOmeter with a single worker (thread) against the unlimited vDisk, I can achieve 4,846 IOPS. That's what the datastore (physical storage) is able to give to a single thread.


When I run IOmeter with two workers (threads) against the unlimited vDisk, I can achieve 7,107 IOPS. That's OK, because every shared storage array implements fairness algorithms that cap how much performance a single thread can get. That's actually protection against one thread consuming all of the storage performance.


Now let's try to set a SIOC limit of 200 IOPS on both vDisks of the VM, as depicted in the picture below.
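For completeness, the same per-disk limits can also be set programmatically. The following is a hedged pyVmomi sketch, reusing the connection and the vm object from the snippet above; the function name is mine, not an official API.

from pyVmomi import vim

def limit_all_disks(vm, iops=200):
    # Build one 'edit' device spec per virtual disk, setting its IOPS limit
    changes = []
    for dev in vm.config.hardware.device:
        if isinstance(dev, vim.vm.device.VirtualDisk):
            dev.storageIOAllocation = \
                vim.StorageResourceManager.IOAllocationInfo(limit=iops)
            changes.append(vim.vm.device.VirtualDeviceSpec(
                operation=vim.vm.device.VirtualDeviceSpec.Operation.edit,
                device=dev))
    # Apply all disk changes to the VM in a single reconfigure task
    return vm.ReconfigVM_Task(vim.vm.ConfigSpec(deviceChange=changes))

Calling limit_all_disks(vm, 200) and waiting for the returned task should have the same effect as the GUI settings shown above.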


Due to the settings above, the workload generated by a single IOmeter worker is limited to 400 IOPS (2 x 200) for the whole VM, because all limit values are consolidated per virtual machine per LUN. For more info look at http://kb.vmware.com/kb/1038241. So it behaves as expected: IOmeter IOPS oscillated between 330 and 400, as you can see in the picture below.


We can observe similar behavior with two workers.


So in the ESX 5.0 lab everything works as expected. Now let's move to the other lab, where I have vSphere 5.5. It has iSCSI storage, so first of all we will run IOmeter without vDisk IOPS limits to see the maximum performance we can get. In the picture below we can see that a single thread is able to get 1,741 IOPS.


... and two workers can get 3,329 IOPS.


So let's set the vDisk IOPS limit to 200 on both vDisks of the VM, as in the ESX 5.0 test. This VM also has two disks, so due to these settings the workload generated by a single IOmeter worker should again be limited to 400 IOPS (2 x 200) for the whole VM. But unfortunately it is not limited and gets around 2,000 IOPS. That is strange and, in my opinion, bad behavior.

 

But even worse behavior can be observed when there are more threads. In the examples below you can see the behavior with two and four workers (threads). The VM gets really slow performance.



The ESX 5.5 VM vDisk limit behavior is really strange, and because practically all OS storage workloads (even OS booting) are multi-threaded, the VM vDisk IOPS limit technology is unusable there. My customer has opened a support request, so I believe it is a bug and VMware Support will help escalate it to VMware engineering.

UPDATE 2014-07-14 (Workaround): 
I tweeted about this issue and Duncan Epping immediately pointed me in the right direction. He revealed the secret to me ... ESXi has two disk schedulers, the old one and a new one (aka mClock). ESXi 5.5 uses the new one (mClock) by default. If you switch back to the old one, the disk scheduler behaves as expected. Below is the setting for switching back to the old one.

Go to ESX Host Advanced Settings and set Disk.SchedulerWithReservation=0

This will switch back to the old disk scheduler.
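
If you prefer to script the change, the same advanced option can be set from the ESXi shell with esxcli system settings advanced set -o /Disk/SchedulerWithReservation -i 0, or remotely via pyVmomi. Below is a hedged sketch of the latter; host is a vim.HostSystem object obtained the same way as the VM in the snippets above, and the function name is mine.

from pyVmomi import vim

def use_old_disk_scheduler(host):
    # 0 selects the old disk scheduler; 1 (the 5.5 default) selects mClock.
    # On Python 2 the value may need to be long(0) for an integer option.
    opt = vim.option.OptionValue(key='Disk.SchedulerWithReservation', value=0)
    host.configManager.advancedOption.UpdateOptions(changedValue=[opt])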

Kudos to Duncan.

Switching back to the old scheduler is a good workaround which will probably appear in a VMware KB article, but there is definitely some reason why VMware introduced the new disk scheduler in 5.5. I hope we will get more information from VMware engineering, so stay tuned for more details ...

UPDATE 2015-12-03:
Here are some references to more information about disk schedulers in ESXi 5.5 and above ...

ESXi 5.5
http://www.yellow-bricks.com/2014/07/14/new-disk-io-scheduler-used-vsphere-5-5/
http://cormachogan.com/2014/09/16/new-mclock-io-scheduler-in-vsphere-5-5-some-details/
http://anthonyspiteri.net/esxi-5-5-iops-limit-mclock-scheduler/

ESXi 6
http://www.cloudfix.nl/2015/02/02/vsphere-6-mclock-scheduler-reservations/
