Wednesday, April 11, 2018

How to disable Spectre and Meltdown mitigations?

Today, I have been asked again "How to disable Spectre and Meltdown mitigations on VMs running on top of ESXi". Recently I wrote about Spectre and Meltdown mitigations on VMware vSphere virtualized workloads here.

So, let's assume you have already applied patches and updates to ...
  • Guest OS (Windows, Linux, etc.)
  • Hypervisor - ESXi host (VMSA-2018-0004.3 and VMSA-2018-0002)
  • BIOS (version having support for IBRS, IBPB, STIBP capabilities)
... therefore, you should be protected against Spectre and Meltdown vulnerabilities known as CVE-2017-5753 (Spectre - Variant 1), CVE-2017-5715 (Spectre - Variant 2), and CVE-2017-5754 (Meltdown - Variant 3).

These security mitigations do not come for free. They have a significant impact on performance. I did some testing in my lab and some of the results scared me. The biggest impact is on workloads that make many system calls (calls from OS userland to the OS kernel), such as memory, network, and storage I/O operations. This performance impact is the reason why some administrators and application owners are willing to disable the mitigations on systems where interprocess communication is trusted and potential data leaks between processes are not a problem.

So, let's answer the question. Spectre and Meltdown mitigations can be disabled at the guest operating system level. This is the preferred method.

Red Hat

You can disable the security mitigations at runtime with the following three commands. The change is active immediately and does not require a reboot.

    # echo 0 > /sys/kernel/debug/x86/pti_enabled
    # echo 0 > /sys/kernel/debug/x86/retp_enabled
    # echo 0 > /sys/kernel/debug/x86/ibrs_enabled

Note that this change is not persistent. It is lost after a reboot.
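Before and after writing to these files, it can be handy to see what they currently contain. The following is a minimal sketch, not an official Red Hat tool: the function name show_mitigation_state and the X86_DEBUG_DIR variable are my own, and it assumes debugfs is mounted at /sys/kernel/debug and that the script runs as root.

```shell
#!/bin/sh
# Directory holding the three tunables used above. It can be overridden,
# which is an assumption of this sketch to allow a dry run on a scratch directory.
X86_DEBUG_DIR="${X86_DEBUG_DIR:-/sys/kernel/debug/x86}"

# Print the current value of each tunable, or "unavailable" if the file
# cannot be read (debugfs not mounted, not root, or kernel without the tunable).
show_mitigation_state() {
    for name in pti_enabled retp_enabled ibrs_enabled; do
        file="$X86_DEBUG_DIR/$name"
        if [ -r "$file" ]; then
            printf '%s=%s\n' "$name" "$(cat "$file")"
        else
            printf '%s=unavailable\n' "$name"
        fi
    done
}

show_mitigation_state
```

A value of 0 means the corresponding mitigation is switched off. A line reporting "unavailable" usually means the file is missing or unreadable rather than that the mitigation is disabled.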

MS Windows

In the Windows operating system, you can control the mitigations via the registry.

To enable the mitigations, you have to change the following registry settings
  • reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverride /t REG_DWORD /d 0 /f
  • reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverrideMask /t REG_DWORD /d 3 /f
  • reg add "HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Virtualization" /v MinVmVersionForCpuBasedMitigations /t REG_SZ /d "1.0" /f
  • Restart the server for changes to take effect.
and to disable the mitigations, you have to change the registry settings as follows
  • reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverride /t REG_DWORD /d 3 /f
  • reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverrideMask /t REG_DWORD /d 3 /f
  • Restart the server for changes to take effect.
Note: After any change, please test whether your system behaves as expected (secure or not secure).

ESXi - not recommended method

Another method is to disable CPU features at the ESXi level. VMware does not recommend this, but it was recently published as a workaround in VMware KB 52345.

Here is the procedure to mask CPU capabilities at the ESXi level.

Step 1/ Log in to each ESXi host via SSH.

Step 2/ Add the following line in the /etc/vmware/config file:
cpuid.7.edx = "----:00--:----:----:----:----:----:----"

Step 3/ Run the command /sbin/auto-backup.sh to back up the config file and keep the configuration change persistent across ESXi reboots.

Step 4/ Power-cycle the VMs running on top of the ESXi host.

This hides the speculative-execution control mechanism from virtual machines that are power-cycled on the ESXi host afterward, so every virtual machine on the host has to be power-cycled. Rebooting the ESXi host is not required. The effect is that the speculative-execution control mechanism is no longer available to virtual machines, even if the server firmware provides the same microcode independently.
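Steps 2 and 3 can be sketched as a small shell function that adds the mask line only if it is not already there, so running it twice does not duplicate the entry. This is my own illustration, not taken from the VMware KB: the function name add_cpuid_mask and the CONFIG_FILE variable are hypothetical, and the sketch assumes the grep on the host supports the -q, -x and -F options and that the config file ends with a newline.

```shell
#!/bin/sh
# Config file from step 2. It can be overridden, which is an assumption of
# this sketch so the function can be tried on a copy first.
CONFIG_FILE="${CONFIG_FILE:-/etc/vmware/config}"
MASK_LINE='cpuid.7.edx = "----:00--:----:----:----:----:----:----"'

# Append the mask line unless an identical line is already present.
add_cpuid_mask() {
    # -F: literal string, -x: whole line must match, -q: no output
    if grep -qxF "$MASK_LINE" "$CONFIG_FILE" 2>/dev/null; then
        echo "mask already present in $CONFIG_FILE"
    else
        printf '%s\n' "$MASK_LINE" >> "$CONFIG_FILE"
        echo "mask added to $CONFIG_FILE"
    fi
}

# On the ESXi host itself, steps 2 and 3 would then be:
#   add_cpuid_mask && /sbin/auto-backup.sh
```

The function is only defined here and not called, so nothing is changed until you run it deliberately. The VMs still have to be power-cycled afterward, as described in step 4.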

Conclusion

It is important to mention that the guest operating system inside a VM may or may not use the CPU capabilities IBRS, IBPB, and STIBP provided by CPU microcode to mitigate security issues. As far as I'm aware, guest OSes use these capabilities only to mitigate Spectre Variant 2 (CVE-2017-5715). In some cases, a guest OS can use other mitigation methods even for Spectre Variant 2. For example, the Linux kernel is currently trying to leverage “Retpoline” code sequences to decrease the performance impact, but “Retpoline” is not applicable to all CPU models. So, there is no single recommendation that would fit all situations.
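On a Linux guest, one way to see which mitigation the kernel actually chose is the report under /sys/devices/system/cpu/vulnerabilities. The sketch below is only an illustration under assumptions: the directory exists only on kernels that include this sysfs report, and the function name report_vulnerabilities and the VULN_DIR variable are my own.

```shell
#!/bin/sh
# Location of the kernel's vulnerability report. It can be overridden,
# which is an assumption of this sketch to allow testing on a scratch directory.
VULN_DIR="${VULN_DIR:-/sys/devices/system/cpu/vulnerabilities}"

# Print one "name: status" line per file in the report directory.
report_vulnerabilities() {
    if [ ! -d "$VULN_DIR" ]; then
        echo "no vulnerability report in $VULN_DIR"
        return 0
    fi
    for file in "$VULN_DIR"/*; do
        [ -f "$file" ] || continue
        printf '%s: %s\n' "$(basename "$file")" "$(cat "$file")"
    done
}

report_vulnerabilities
```

The status text is written by the kernel itself, so the exact wording differs between distributions and kernel versions.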

That's the reason why performance tuning by disabling security enhancements should always be done at the guest operating system level and not at the ESXi level. The ESXi workaround is just a workaround, which can be useful if some new bug in CPU microcode is discovered, but performance is always handled by the guest OS.
