Thursday, December 17, 2015

End-to-End QoS solution for VMware vSphere with NSX on top of Cisco UCS

I'm engaged in a private cloud project where end-to-end network QoS is required to achieve guarantees for particular types of network traffic. These traffic types are:
  • FCoE Storage
  • vSphere Management
  • vSphere vMotion
  • VM production
  • VM guest OS agent-based backup <== this is the most complex requirement in the context of QoS
The compute and network infrastructure is based on:
  • Cisco UCS
  • Cisco Nexus 7k
  • VMware NSX
More specifically, the following hardware components have to be used:
  • Cisco UCS Mini chassis 5108 with Fabric Interconnects 6324
  • Cisco UCS B200 M4 servers with the VIC 1340 virtual interface card (2x 10Gb ports, each port connected to a different fabric interconnect)
  • Cisco Nexus 7k
The customer is also planning to use NSX security (micro-segmentation) and vRealize Automation for automated VM provisioning.

Now that we understand the overall concept, we can consider how to achieve end-to-end network QoS to differentiate the required traffic types.

Generally, we can leverage Layer 2 QoS 802.1p (aka CoS - Class of Service) or Layer 3 QoS (aka DSCP - Differentiated Services Code Point). However, to achieve end-to-end QoS on Cisco UCS we have to use CoS, because it is the only QoS method available inside the Cisco UCS blade system to guarantee bandwidth on the shared 10Gb NIC (better to say CNA) ports.
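To illustrate what such a bandwidth guarantee means in practice, here is a minimal Python sketch of weighted sharing of one 10Gb port - the mechanism UCS implements with DCB ETS, which is discussed later in this post. The traffic classes and weights below are made-up examples, not the actual UCS QoS system class configuration of this design.

```python
# Minimal sketch of weighted bandwidth sharing on one shared 10Gb port.
# The CoS classes and weights are illustrative assumptions, NOT the real
# UCS QoS system class configuration of this design.

PORT_BANDWIDTH_GBPS = 10.0

# CoS class -> relative weight (enforced only when the port is congested)
ets_weights = {
    "CoS 3 - FCoE storage":    4,
    "CoS 2 - Mgmt / vMotion":  2,
    "CoS 4 - In-guest backup": 2,
    "CoS 0 - VM production":   2,
}

total = sum(ets_weights.values())
for traffic_class, weight in ets_weights.items():
    guaranteed_gbps = PORT_BANDWIDTH_GBPS * weight / total
    print(f"{traffic_class:24s} guaranteed ~{guaranteed_gbps:.1f} Gb/s under congestion")
```

Keep in mind these are minimum guarantees enforced only under congestion; bandwidth not used by one class remains available to the others.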

The most important design decision is where we will do the CoS marking to differentiate the required traffic types. The following two options are generally available:
  1. CoS marking only in Cisco UCS (hardware-based marking)
  2. CoS marking on vSphere DVS portgroups (software-based marking)
Let's deep dive into the available options.

OPTION 1 - UCS Hardware CoS marking

Option 1 is depicted in the figure below. Please click on the figure to see where CoS marking and bandwidth management is done.


The following bullets describe the key ideas of Option 1:
  • Management and vMotion traffic is consolidated on the same pair of 10Gb adapter ports (NIC-A1 and NIC-B1) together with FCoE traffic. Active/Standby teaming is used for the vSphere Management and vMotion portgroups, where each traffic type is by default active on a different UCS fabric.
  • VTEP and backup traffic is consolidated on the same pair of 10Gb adapter ports (NIC-A2 and NIC-B2). Active/Standby teaming is used for the NSX VTEP and backup portgroups. Each traffic type is by default active on a different UCS fabric.
  • Multiple VTEPs and Active/Active teaming for the backup portgroup can be considered for higher network performance if necessary.
  • All VMkernel interfaces should be configured consistently across all ESXi hosts to use the same Ethernet fabric in the non-degraded state and achieve optimal east-west traffic.
Option 1 implications:
  • Two vNICs per virtual machine have to be used: one for production traffic (VXLAN) and a second one for backup traffic (VLAN-backed portgroup with CoS marking).

OPTION 2 - vSphere DVS CoS marking

Option 2 is depicted in the figure below. Please click on the figure to see where CoS marking and bandwidth management is done.

The following bullets describe the key ideas of Option 2 and the differences from Option 1:
  • Management and vMotion traffic is again consolidated on the same pair of 10Gb adapter ports (NIC-A1 and NIC-B1) together with FCoE traffic. The difference from Option 1 is the use of a single Cisco vNIC and CoS marking in the DVS portgroups. Active/Standby teaming is used for the vSphere Management and vMotion portgroups, where each traffic type is by default active on a different UCS fabric.
  • VTEP and backup traffic is consolidated on the same pair of 10Gb adapter ports (NIC-A2 and NIC-B2). Active/Standby teaming is used for the NSX VTEP and backup portgroups. The difference from Option 1 is the use of a single Cisco vNIC and CoS marking in the DVS portgroups. Each traffic type (VXLAN, backup) is by default active on a different UCS fabric.
  • Multiple VTEPs and Active/Active teaming for the backup portgroup can be considered for higher network performance if necessary.
  • All VMkernel interfaces should be configured consistently across all ESXi hosts to use the same Ethernet fabric in the non-degraded state and achieve optimal east-west traffic.
  • FCoE traffic is marked automatically by UCS for any vHBA. This is the only hardware-based CoS marking in this option.
Option 2 implications:
  • Two vNICs per virtual machine have to be used: one for production traffic (VXLAN) and a second one for backup traffic (VLAN-backed portgroup with CoS marking).

Both options above require two vNICs per VM, which introduces several challenges, some of them listed below:
  1. More IP addresses are required.
  2. The default IP gateway is used for the production network; therefore, the backup network cannot be routed without static routes inside the guest OS (see the sketch below).
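To illustrate challenge 2, a static route for the backup network would have to be injected into each guest OS. Below is a minimal sketch for a Linux guest; the subnet, gateway and interface name are hypothetical placeholders.

```python
# Minimal sketch for a Linux guest with two vNICs: add a static route so the
# backup network is reached via the second (backup) vNIC instead of the
# default gateway on the production vNIC.
# The subnet, gateway and interface name are hypothetical placeholders.
import subprocess

BACKUP_NETWORK = "10.99.0.0/16"   # where the backup servers live
BACKUP_GATEWAY = "192.168.20.1"   # gateway on the backup portgroup segment
BACKUP_IFACE   = "eth1"           # guest interface connected to the backup portgroup

# Equivalent to running: ip route add 10.99.0.0/16 via 192.168.20.1 dev eth1
subprocess.check_call(
    ["ip", "route", "add", BACKUP_NETWORK, "via", BACKUP_GATEWAY, "dev", BACKUP_IFACE]
)
```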
Is there any way to achieve QoS differentiation between VM traffic and in-guest backup traffic with a single vNIC per VM?

We can enhance Option 2 a little bit. The enhanced solution is depicted in the figure below. Please click on the figure to see where CoS marking and bandwidth management is done.


So where is the enhancement? We can leverage conditional CoS marking in each DVS portgroup used as an NSX virtual wire (aka NSX Logical Switch or VXLAN). If IP traffic is destined to the backup server, we mark it with CoS 4 (backup); otherwise we mark it with CoS 0 (VM traffic).
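In vSphere, this conditional marking is expressed as a traffic filtering and marking rule on the distributed portgroup. The sketch below shows how such a rule could be constructed with pyVmomi. Treat it as an illustration only: the backup server address is a made-up example, the class paths follow the vSphere API's DvsTrafficRule family of objects and should be verified against the pyVmomi version in use, and attaching the rule to the portgroup (ReconfigureDVPortgroup_Task) is not shown.

```python
# Sketch only: conditional CoS marking rule for a DVS portgroup.
# IP traffic destined to the backup server gets CoS 4, everything else stays CoS 0.
# The backup server address is a hypothetical example; verify the class paths
# against your pyVmomi version before use.
from pyVmomi import vim

BACKUP_SERVER_IP = "10.99.1.10"   # example in-guest backup target

# Qualifier: match IP traffic whose destination is the backup server
backup_qualifier = vim.dvs.TrafficRule.IpQualifier(
    destinationAddress=vim.SingleIp(address=BACKUP_SERVER_IP)
)

# Action: tag matching frames with 802.1p CoS priority 4
mark_backup = vim.dvs.TrafficRule.UpdateTagAction(
    qosTag=vim.IntExpression(value=4)
)

rule = vim.dvs.TrafficRule(
    description="Mark in-guest backup traffic with CoS 4",
    direction="both",
    qualifier=[backup_qualifier],
    action=mark_backup,
)

filter_config = vim.dvs.TrafficFilterConfig(
    agentName="dvfilter-generic-vmware",
    trafficRuleset=vim.dvs.TrafficRuleset(enabled=True, rules=[rule]),
)

# filter_config would then go into the portgroup's defaultPortConfig.filterPolicy
# and be applied with ReconfigureDVPortgroup_Task (not shown here).
```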

You can argue that VXLAN is L2 over L3, so the L2 traffic where we did the conditional CoS marking will be encapsulated into L3 traffic (UDP) at the VTEP interface and we will lose the CoS tag. However, that's not the case. The VXLAN protocol is designed to keep L2 CoS tags by copying the inner Ethernet header priority into the outer Ethernet header. Therefore, the virtual overlay CoS tag is kept even in the physical network underlay, and it can be leveraged by Cisco UCS bandwidth management (aka DCB ETS - Enhanced Transmission Selection) to guarantee bandwidth for particular CoS traffic classes.
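To visualize what "copying the inner Ethernet header into the outer Ethernet header" means on the wire, here is a small scapy illustration, assuming a scapy build that ships the VXLAN layer. All MAC/IP addresses, VLAN IDs and the VNI are made-up example values and the frame layout is simplified.

```python
# Simplified illustration of VXLAN keeping the 802.1p (CoS) tag: the VTEP copies
# the inner frame's priority into the outer 802.1Q header of the underlay frame.
# All MAC/IP addresses, VLAN IDs and the VNI are made-up example values.
from scapy.all import Ether, Dot1Q, IP, UDP
from scapy.layers.vxlan import VXLAN   # present in scapy builds that ship the VXLAN layer

# Inner (VM) frame, already marked CoS 4 by the DVS portgroup rule
inner = (
    Ether(src="00:50:56:aa:aa:aa", dst="00:50:56:bb:bb:bb")
    / Dot1Q(prio=4, vlan=100)
    / IP(src="192.168.10.11", dst="10.99.1.10")
)

# Outer (VTEP) frame on the physical underlay - note the copied priority
outer = (
    Ether(src="00:25:b5:00:00:01", dst="00:25:b5:00:00:02")
    / Dot1Q(prio=inner[Dot1Q].prio, vlan=200)   # CoS preserved in the underlay
    / IP(src="172.16.0.1", dst="172.16.0.2")
    / UDP(dport=4789)                           # VXLAN UDP port
    / VXLAN(vni=5001)
    / inner
)

print("Outer frame CoS:", outer[Dot1Q].prio)    # -> 4, usable by UCS/Nexus ETS
```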

The enhanced Option 2 seems to me to be the best design decision for my particular use case and specific requirements.

I hope it makes sense and someone else finds it useful. I'm sharing it with the IT community for broader review; any comments or thoughts are very welcome.

3 comments:

Anonymous said...

Thanks David for putting together a nice, detailed blog.

I have a small question in regards to Option 03: I still need to configure a QoS policy in every port-group.

Will it be possible to configure NSX to honor any QoS marking done by the VM/application?

Apologies if I oversimplified the question.

David Pasek said...

Hi.

If I understand your question correctly, you would like to do 802.1p (CoS) marking inside the VM/application (guest OS level). Am I right?

To be honest, I do not have any hands-on experience with such an 802.1p setup, but first of all you would need to set up VGT mode, because the 802.1p priority lives inside the VLAN tag (802.1Q). For more info on how to set it up, check KB article https://kb.vmware.com/kb/1004252 [Sample configuration of virtual machine VLAN Tagging (VGT Mode) in ESX]

However, I do not think there is a possibility to choose the CoS priority in the guest OS. I could be wrong here.

But if you think about it, it does not really make sense to allow VM guest OS admins to choose the network priority. It has to be controlled at the network level where scheduling and sharing occur. Therefore, the vSwitch or pSwitch is the right place to do it.

Just my $0.02,
David.

David Pasek said...

I have just thought about my previous answer regarding CoS priority in the guest OS, and it is obviously possible to do it inside the guest OS, but the question is whether it will be trusted and accepted by the VDS (VMware Distributed Virtual Switch).