Introduction

The NVMesh CSI Driver Topology feature allows a single CSI driver to manage multiple clusters of NVMesh within a single Kubernetes environment.
The driver topology feature ensures that each pod using an NVMesh-based PVC will only be scheduled on nodes where the volume is accessible from the NVMesh client.

When the topology feature is configured, each NVMesh cluster will be represented as an NVMesh CSI zone.
The driver automatically adds a label to each node in the format nvmesh-csi.excelero.com/zone=<zone name> so that Kubernetes can associate each node with its NVMesh cluster (zone).
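
As an illustration of the label format described above, the relevant part of a labeled Node object might look like the following fragment. The node name worker1 and the zone name zone_A are placeholders chosen for this sketch, and a real Node object carries many other labels and fields.

```yaml
# Illustrative fragment of a Node object after the driver has labeled it.
# "worker1" and "zone_A" are placeholder names, not values from a real cluster.
apiVersion: v1
kind: Node
metadata:
  name: worker1
  labels:
    nvmesh-csi.excelero.com/zone: zone_A
```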

Zones are configured by the administrator in the nvmesh-csi-driver-config ConfigMap. The driver discovers all nodes for a given zone by querying the NVMesh management servers configured for that zone and saves this topology in a new ConfigMap named nvmesh-csi-topology. This ConfigMap should not be modified by the user. When a volume is created, the driver adds nodeAffinity with the zone label to the PersistentVolume. This tells the Kubernetes scheduler that all future pods using this PVC should be scheduled only on nodes in the same zone as the NVMesh cluster where the volume was provisioned.
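
To make the nodeAffinity mechanism more concrete, the fragment below sketches what the affinity on a provisioned PersistentVolume could look like. The structure follows the standard Kubernetes PersistentVolume nodeAffinity field; the volume name and the zone value are placeholders, and the exact object the driver generates may differ.

```yaml
# Illustrative fragment of a PersistentVolume provisioned in zone_A.
# The name is a placeholder; the driver's actual output may differ.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pvc-example
spec:
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: nvmesh-csi.excelero.com/zone
          operator: In
          values:
          - zone_A
```

With an affinity like this, the scheduler only considers nodes labeled nvmesh-csi.excelero.com/zone=zone_A for pods that mount the volume.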

Configuration

To inform the CSI driver of the available zones, add the topology field to the nvmesh-csi-driver-config ConfigMap.
The following example defines two zones; the full list of per-zone options is described after it.

kind: ConfigMap
apiVersion: v1
metadata:
  name: nvmesh-csi-driver-config
data:
  management.protocol: https
  management.servers: 10.0.1.117:4000
  attachIOEnabledTimeout: "30"
  topology: |-
    {
       "zones": {
          "zone_A": {
             "management": {
                "servers": "worker1.domain.com:4000"
             }
          },
          "zone_B": {
             "management": {
                "servers": "worker4.domain.com:4000"
             }
          }
       }
    }

The topology field is a JSON object with a single zones key, which contains the configuration for each zone.
Each key in the zones object is the name of a zone, and its value provides that zone's configuration parameters.

For each zone configuration, the following fields are available:

Field                  Description
management             Configuration for the management servers in this specific zone
management.servers     A comma-separated list of management server addresses in the format address:port, for instance management-1:4000,management-2:4000
management.protocol    The management server protocol, either “http” or “https”
management.user        The management user to log in with, for instance “admin@excelero.com”
management.password    The management password, for instance “admin”
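
The following sketch shows a single zone that sets every field listed above, using the sample values from the field descriptions. It assumes that protocol, user and password are nested inside the management object next to servers, as the dotted field names suggest; verify the exact nesting against your driver version.

```yaml
data:
  # Assumption: protocol, user and password sit inside "management",
  # alongside "servers". Values are the samples from the field list above.
  topology: |-
    {
       "zones": {
          "zone_A": {
             "management": {
                "servers": "management-1:4000,management-2:4000",
                "protocol": "https",
                "user": "admin@excelero.com",
                "password": "admin"
             }
          }
       }
    }
```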

Creating Volumes and Pods

Create a PVC and a Pod

Create a StorageClass with volumeBindingMode: WaitForFirstConsumer.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: nvmesh-with-topology
provisioner: nvmesh-csi.excelero.com
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
parameters:
  vpg: DEFAULT_CONCATENATED_VPG

Create a PVC using this StorageClass

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: topology-volume0
spec:
  accessModes:
    - ReadWriteOnce
  volumeMode: Filesystem
  resources:
    requests:
      storage: 1Gi
  storageClassName: nvmesh-with-topology

Create a Pod that uses the PVC

apiVersion: v1
kind: Pod
metadata:
  name: topology-pod0
spec:
  serviceAccountName: topology-aware
  containers:
    - name: nginx
      image: gcr.io/google_containers/nginx-slim:0.8
      ports:
      - containerPort: 80
        name: web
      volumeMounts:
      - name: www
        mountPath: /usr/share/nginx/html
  volumes:
    - name: www
      persistentVolumeClaim:
        claimName: topology-volume0

Assign the PVC / Pod to a zone using a StorageClass with the allowedTopologies field

To create volumes on a specific NVMesh cluster, create a StorageClass with the allowedTopologies field.
When a PVC is created from a StorageClass with this field, the CSI driver will create the volume on the desired zone.
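
As a minimal sketch of the single-zone case, the StorageClass below restricts provisioning to zone_A only. The name nvmesh-zone-a is a placeholder chosen for this example; the remaining fields mirror the other StorageClass examples in this document.

```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: nvmesh-zone-a   # placeholder name for this sketch
provisioner: nvmesh-csi.excelero.com
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
parameters:
  vpg: DEFAULT_CONCATENATED_VPG
allowedTopologies:
- matchLabelExpressions:
  - key: nvmesh-csi.excelero.com/zone
    values:
    - zone_A   # volumes from this class are created only in zone_A
```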

Multiple allowedTopologies

If multiple zones are allowed, as in the example below, the CSI driver will randomly pick one of the zones and create the volume on that zone.
The PersistentVolume will then be accessible only in the selected zone, and every pod that uses the same PVC will be scheduled only to that zone.
Different PVCs created from the same StorageClass may be in different zones.

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: nvmesh-with-topology
provisioner: nvmesh-csi.excelero.com
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
parameters:
  vpg: DEFAULT_CONCATENATED_VPG
allowedTopologies:
- matchLabelExpressions:
  - key: nvmesh-csi.excelero.com/zone
    values:
    - zone_A
    - zone_B

Assign a PVC or Pod to a zone using the Pod’s nodeAffinity

It is possible to set the nodeAffinity directly on the pod. The PVC and the pod will then be created in the desired zone. In this case, the PVC should use a StorageClass with volumeBindingMode: WaitForFirstConsumer.

apiVersion: v1
kind: Pod
metadata:
  name: topology-pod0
spec:
  serviceAccountName: topology-aware
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: nvmesh-csi.excelero.com/zone
            operator: In
            values:
            - zone_A
            - zone_B
  containers:
    - name: nginx
      image: gcr.io/google_containers/nginx-slim:0.8
      ports:
      - containerPort: 80
        name: web
      volumeMounts:
      - name: www
        mountPath: /usr/share/nginx/html
  volumes:
    - name: www
      persistentVolumeClaim:
        claimName: topology-volume0

For a more complex example with a StatefulSet, multiple zones and antiAffinity on zones, see Topology-Aware Volume Provisioning in Kubernetes.

PVC with volumeBindingMode: Immediate

When a PVC is created from a StorageClass with volumeBindingMode: Immediate, the NVMesh CSI Driver randomly picks a zone and provisions the volume in that zone without waiting for a pod to be scheduled.
All subsequent pods using this PVC will be scheduled to this zone.
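
For comparison with the WaitForFirstConsumer examples, a StorageClass that uses immediate binding could be sketched as follows. The name nvmesh-immediate is a placeholder; the other fields are copied from the StorageClass examples earlier in this document.

```yaml
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: nvmesh-immediate   # placeholder name for this sketch
provisioner: nvmesh-csi.excelero.com
allowVolumeExpansion: true
volumeBindingMode: Immediate   # volume is provisioned as soon as the PVC is created
parameters:
  vpg: DEFAULT_CONCATENATED_VPG
```

Because the zone is chosen before any pod exists, pod-level nodeAffinity cannot influence where the volume is created in this mode.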

References

For additional details on VolumeBindingMode, see k8s Documentation – VolumeBindingMode
For additional details on AllowedTopologies, see k8s Documentation – AllowedTopologies
