Wednesday, October 20, 2021

Kubernetes vSphere CSI Driver

The main reason why I blog is to document technical details and design patterns I discuss with my customers. Usually, I decide to write a blog post about a topic when more than two customers want to know the same technical details or are experiencing the same technical challenge.

Today I will write my first blog post about Kubernetes. It seems to me that Kubernetes has finally reached momentum and everybody is trying to jump on the bandwagon. It is obvious that Kubernetes is the infrastructure platform for modern distributed applications. VMware recognized this trend very early and integrated Kubernetes into the VMware vSphere platform, also known as Tanzu. I do not want to describe the Tanzu platform from a product perspective because there are plenty of such blog posts across the blogosphere. Cormac Hogan is my favorite Tanzu/Kubernetes blogger, probably because in the past he was blogging about vSphere and storage-related topics. Therefore, if you want to get some info about VMware Tanzu, I highly recommend Cormac's blog, which is available at https://cormachogan.com/.

In this article, I would like to give an architecture overview of the vSphere CSI driver and describe some of the process flow behind the scenes.

Disclaimer: Please note that this is just my personal understanding of how it works, and some things may be inaccurate or described at a very high level. Nevertheless, if you believe there is something totally wrong, speak up in the comments below the article.

First things first: I'm a visual guy, therefore let's start with the overall solution architecture.


The DevOps process to create a persistent volume is as follows:

  • The DevOps admin asks the Kubernetes cluster to create a persistent volume via kubectl and a YAML manifest (a persistent volume claim)
  • The CSI driver has a control plane in the K8s supervisor (control plane) and CSI driver agents on all K8s worker nodes
  • The DevOps admin's request (claim) for a persistent volume is sent to the CSI driver control plane
  • The CSI driver control plane is integrated with vCenter Server via the vSphere API
  • The CSI driver control plane asks vSphere, via the vCenter API, to create a storage volume
  • The storage volume can be a VMDK file on a VMFS filesystem, a vSAN object, a vVol (a LUN on physical storage), or NFS shared storage (a mount point)
  • vCenter creates such a storage volume via some ESXi host
  • The CSI driver control plane can leave the storage volume unattached (as a First Class Disk, aka FCD) or attach it to a particular virtual machine, because it eventually knows into which K8s pod (container) the volume should be attached. It also knows on which K8s worker node (a Linux guest OS on top of a virtual machine) that K8s pod is running; therefore, it dynamically attaches the volume to the particular virtual machine (leveraging the hot-plug/hot-add capability).
    • Note 1: Block persistent volumes are attached to virtual machines via the PVSCSI adapter, as it supports a higher number (64) of disks, and since a virtual machine supports up to four (4) SCSI adapters, a single VM (K8s worker node) can have up to 4 x 64 = 256 volumes.
    • Note 2: The CSI driver can add additional PVSCSI adapters to the VM dynamically.
    • Note 3: This only works when the VM advanced setting "devices.hotplug" is enabled, which is the default.
  • Finally, the CSI driver agent detects the new storage volume within the K8s worker node (Linux guest OS), and because it knows into which K8s pod (Linux container / chroot) the particular volume should be attached, it mounts it into the desired container (pod); see the Pod manifest sketch right after this list.
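To make that last step more tangible, here is a minimal sketch of a Pod manifest consuming a persistent volume. All names (pod name, image, mount path, claim name) are just illustrative placeholders, not anything mandated by the vSphere CSI driver:

  apiVersion: v1
  kind: Pod
  metadata:
    name: demo-app                     # illustrative name
  spec:
    containers:
      - name: demo
        image: busybox                 # illustrative image
        command: ["sleep", "3600"]
        volumeMounts:
          - name: data                 # must match the volume name below
            mountPath: /data           # where the volume shows up inside the container
    volumes:
      - name: data
        persistentVolumeClaim:
          claimName: demo-pvc          # the persistent volume claim created earlier (see below)

When such a Pod is scheduled, the CSI driver attaches the backing volume (FCD) to the worker node's VM and mounts it into the container at /data.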

Hope I did not forget something in the automated workflow the vSphere CSI driver performs :-)

I guess now you would ask me how a DevOps admin issues persistent volume claims to the K8s cluster, right?

Well, it is a two-step process. First of all, the K8s cluster must know a K8s Storage Class, which is later used for persistent volume claims. A Storage Class is just a mapping between a vSphere Storage Policy and a K8s StorageClass object (kind). If you are not yet familiar with VMware vSphere SPBM (Storage Policy Based Management), please read this.
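A minimal sketch of such a StorageClass manifest is below. The provisioner csi.vsphere.vmware.com is the vSphere CSI driver; the StorageClass name and the storage policy name "gold-policy" are illustrative placeholders, and the policy must match an existing vSphere Storage Policy in your environment:

  apiVersion: storage.k8s.io/v1
  kind: StorageClass
  metadata:
    name: vsphere-gold                    # illustrative name, referenced by the claim below
  provisioner: csi.vsphere.vmware.com     # the vSphere CSI driver
  parameters:
    storagepolicyname: "gold-policy"      # maps to an existing vSphere Storage Policy (SPBM)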

The second step is to create a Persistent Volume Claim describing the particular storage request.
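And here is a minimal sketch of the corresponding PersistentVolumeClaim, again with an illustrative name and size:

  apiVersion: v1
  kind: PersistentVolumeClaim
  metadata:
    name: demo-pvc                   # the claim name referenced by the Pod manifest above
  spec:
    accessModes:
      - ReadWriteOnce                # block volume attached to a single worker node at a time
    storageClassName: vsphere-gold   # the StorageClass defined in step one
    resources:
      requests:
        storage: 5Gi                 # illustrative size

Once the claim is applied with kubectl, the CSI driver provisions the volume and binds the claim to a new PersistentVolume object.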

I believe the examples above are self-explanatory.

I hope this article helps the broader VMware user community understand what is under the hood.

