Sunday, August 17, 2014

Network communications between virtual machines

I was contacted by a colleague of mine who pointed to a very often repeated statement about network communication between virtual machines on the same ESXi host. One such statement is cited below.
"Network communications between virtual machines that are connected to the same virtual switch on the same ESXi host will not use the physical network. All the network traffic between the virtual machines will remain on the host."
He was discussing this topic within his team, and even though they are very skilled virtualization administrators, they had doubts about the real behavior. I generally agree with the statement above, but it is actually correct only in the specific situation when the virtual machines are in the same L2 segment (the same broadcast domain, usually a VLAN).

Figure 1 - L3 routing on physical network
I've prepared the drawing above to explain the real behavior clearly. Network communication between VM1 and VM2 will stay on the same ESXi host because they are in the same L2 segment; however, communication between VM1 and VM3 has to go to the physical switch (pSwitch) to be routed between VLAN 100 and VLAN 200 and then return back to the ESXi host and VM3.
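The reason is the standard IP forwarding decision made by the guest OS itself. Here is a minimal sketch, assuming illustrative addresses in VLAN 100 (10.0.100.0/24) and VLAN 200 (10.0.200.0/24), of how a VM decides whether a packet can be delivered directly on its L2 segment or must be sent to the default gateway:

```python
# Minimal sketch (not VMware code): how a guest OS decides whether a packet
# can be delivered directly on the local L2 segment or must go via a gateway.
# The IP addresses and prefix length below are illustrative assumptions only.
import ipaddress

def next_hop(src_ip: str, dst_ip: str, prefix_len: int) -> str:
    """Same subnet -> direct delivery (frame stays in the vSwitch);
    different subnet -> frame goes to the default gateway and must be routed."""
    src_net = ipaddress.ip_network(f"{src_ip}/{prefix_len}", strict=False)
    if ipaddress.ip_address(dst_ip) in src_net:
        return "direct (stays in the vSwitch on the host)"
    return "default gateway (leaves the host to be routed)"

# VM1 and VM2 share VLAN 100; VM1 and VM3 are in different VLANs.
print(next_hop("10.0.100.11", "10.0.100.12", 24))  # direct (stays in the vSwitch on the host)
print(next_hop("10.0.100.11", "10.0.200.13", 24))  # default gateway (leaves the host to be routed)
```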

The statement discussed above can be slightly reformulated to be always correct.
"Network communications between virtual machines that are connected to the same virtual switch portgroup on the same ESXi host will not use the physical network. All the network traffic between the virtual machines will remain on the host."
Both the VMware standard and distributed vSwitch are dumb L2 switches, so L3 routing must be done somewhere else, typically on physical switches. However, there are two scenarios in which even L3 traffic between virtual machines on the same ESXi host can stay there and not use the physical network.
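Before looking at those scenarios, here is a toy illustration, under my own simplifying assumptions, of what "dumb L2 switch" means: the vSwitch forwards frames only by destination MAC within one VLAN and never moves traffic between VLANs.

```python
# Toy model (my simplification, not VMware source code) of a VLAN-aware L2 switch:
# it forwards frames by destination MAC inside one VLAN only; moving traffic
# between VLANs is a router's job, not the switch's.
class L2Switch:
    def __init__(self):
        self.mac_table = {}          # (vlan, mac) -> port

    def learn(self, vlan, mac, port):
        self.mac_table[(vlan, mac)] = port

    def forward(self, vlan, dst_mac):
        # Known MAC in this VLAN -> deliver to that port; unknown -> flood the VLAN.
        return self.mac_table.get((vlan, dst_mac), f"flood VLAN {vlan}")

sw = L2Switch()
sw.learn(100, "aa:aa", port="VM1")
sw.learn(100, "bb:bb", port="VM2")
sw.learn(200, "cc:cc", port="VM3")
print(sw.forward(100, "bb:bb"))  # VM2 -- delivered inside VLAN 100 on the host
print(sw.forward(100, "cc:cc"))  # flood VLAN 100 -- VM3 sits in VLAN 200, unreachable at L2
```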

The first scenario is when L3 routing is done by a virtual machine running on top of the same ESXi host. Examples of such virtual routers are VMware's vShield Edge, Brocade's Vyatta, Cisco's CSR, the open source router pfSense, or some other general-purpose OS with routing services. This scenario, also known as network function virtualization, is depicted in Figure 2.

Figure 2 - L3 routing on virtual machine (Network Function Virtualization)
It is worth mentioning that L3 traffic between VM5 and VM6 will go through the physical network because the L3 router is on another ESXi host.
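To make the first scenario concrete, here is a small sketch (with hypothetical host and VM names) tracing the hops VM1-to-VM3 traffic takes when the virtual router runs on the same host; every hop is a software component on that host, so no physical uplink is used:

```python
# Illustrative hop-by-hop path for scenario 1 (names are my assumptions):
# the L3 router is a VM on the same ESXi host as the source and destination VMs.
path_vm1_to_vm3 = [
    ("VM1, VLAN 100",                  "esxi-01"),
    ("vSwitch portgroup VLAN 100",     "esxi-01"),
    ("virtual router VM",              "esxi-01"),  # e.g. vShield Edge, Vyatta, pfSense
    ("vSwitch portgroup VLAN 200",     "esxi-01"),
    ("VM3, VLAN 200",                  "esxi-01"),
]

# Every hop lives on the same host, so the traffic never touches the physical network.
assert all(host == "esxi-01" for _, host in path_vm1_to_vm3)
print("All hops stay on esxi-01 -> no physical network usage")
```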

The second scenario is when a distributed virtual router, such as VMware NSX, is used. This scenario is depicted in Figure 3. In this case, L2 and L3 traffic between virtual machines running on the same ESXi host is optimized and will remain on the host without using the physical network.

Figure 3 - Distributed Virtual Routing (VMware NSX)
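The idea behind distributed routing can be sketched as follows; this is my own conceptual simplification with illustrative subnets, not NSX code or its API. Every host holds the same logical routing table in the hypervisor, so routing happens on the host where the source VM runs, and the routed frame leaves the host only if the destination VM runs elsewhere:

```python
# Conceptual sketch of a distributed logical router (my simplification, not NSX):
# an identical routing table is instantiated on every ESXi host, so inter-VLAN
# traffic is routed on the source VM's host.
import ipaddress

DISTRIBUTED_ROUTES = {
    ipaddress.ip_network("10.0.100.0/24"): "logical segment 100",
    ipaddress.ip_network("10.0.200.0/24"): "logical segment 200",
}

def route_on_source_host(dst_ip: str, dst_vm_host: str, src_vm_host: str) -> str:
    """Routing is done in the source host's kernel; the result only leaves the
    host if the destination VM is running on a different host."""
    addr = ipaddress.ip_address(dst_ip)
    for net, segment in DISTRIBUTED_ROUTES.items():
        if addr in net:
            where = "delivered locally" if dst_vm_host == src_vm_host else "sent over the physical network"
            return f"routed to {segment}, {where}"
    return "no route"

print(route_on_source_host("10.0.200.13", dst_vm_host="esxi-01", src_vm_host="esxi-01"))  # VM1 -> VM3
print(route_on_source_host("10.0.200.16", dst_vm_host="esxi-02", src_vm_host="esxi-01"))  # to a VM on another host
```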

So in our particular scenario, L2 and L3 network communication among VM1, VM2, VM3, and VM4 will stay on the same ESXi host. The same applies to VM5 and VM6.

I hope I've covered all possible scenarios and that this blog post will be helpful to others during similar discussions in virtualization teams. And as always, comments are very welcome.

2 comments:

Anonymous said...

Hello David,

Thank you for the blog. It helped clarify the requirement for the physical switch that we were looking into.

I am wondering if you can also clarify what happens to the traffic when SR-IOV is enabled. In what situations is the traffic seen by the virtual switch, and when is it seen by the physical switch?

Thanks again for a great article.

David Pasek said...

Hello Anonymous :-)

Thanks for the comment.

Although SR-IOV is another topic, it doesn't change anything written in this blog post. L3 network traffic between VMs on the same ESXi host has to go out of the VMware vSwitch to be routed. The only exception is if there is a local routing service on the particular ESXi host.

What SR-IOV literally does is create virtual PCI functions (VFs) on top of a single physical PCI function (PF). VFs are connected directly to VMs, appearing in the guest OS as PCI NIC devices. VF Ethernet traffic is switched locally in the SR-IOV enabled NIC, and the PF is the uplink to the physical network. So in this case you have more switches down the path.
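A rough conceptual sketch of that data path, under my own assumptions and naming (not driver or ESXi code), would look like this:

```python
# Rough conceptual sketch of the SR-IOV L2 data path described above
# (my assumption of the behavior, not vendor code): frames between VFs of the
# same physical NIC can be switched inside that NIC; everything else leaves
# via the PF uplink to the physical switch.
def sriov_l2_path(src_nic: str, dst_location: str) -> str:
    if dst_location == src_nic:
        return "switched inside the SR-IOV NIC (VF to VF)"
    return "sent out via the PF uplink to the physical switch"

print(sriov_l2_path("nic0", "nic0"))      # destination VF on the same adapter
print(sriov_l2_path("nic0", "external"))  # destination somewhere on the physical network
```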

The question is ... do you really have requirements to implement SR-IOV? IMHO there are limited use cases for SR-IOV, and it brings you a lot of constraints and limits.