To use InfiniBand mode, create the worker VMs in the same Availability Set so that they share the same InfiniBand pkey, which is critical for InfiniBand communication. This is not possible with openshift-installer; instead, an ARM template is used.

  1. Go to your openshift-install folder location and run ./openshift-install create ignition-configs.
  2. Run cat ./worker.ign | base64 | tr -d '\n' > ignition_base64.
  3. Run cat terraform.tfvars.json | grep cluster_id and record your cluster_id/base name for future use.
  4. Go to https://portal.azure.com/#create/Microsoft.Template and click Build your own template in the editor
  5. Copy and paste the following JSON and click Save.
{
  "$schema" : "https://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
  "contentVersion" : "1.0.0.0",
  "parameters" : {
    "baseName" : {
      "type" : "string",
      "minLength" : 1,
      "metadata" : {
        "description" : "Base name to be used in resource names (usually the cluster's Infra ID)"
      }
    },
    "workerIgnition" : {
      "type" : "string",
      "metadata" : {
        "description" : "Ignition content for the worker nodes"
      }
    },
    "numberOfNodes" : {
      "type" : "int",
      "defaultValue" : 3,
      "minValue" : 1,
      "maxValue" : 30,
      "metadata" : {
        "description" : "Number of OpenShift compute nodes to deploy"
      }
    },
    "sshKeyData" : {
      "type" : "securestring",
      "metadata" : {
        "description" : "SSH RSA public key file as a string"
      }
    },
    "availabilitySetName": {
    "type" : "string",
      "metadata" : {
        "description" : "Availability Set Name"
      }
    },
    "nodeVMSize" : {
      "type" : "string",
      "defaultValue" : "Standard_HB120rs_v3",
      "allowedValues" : [
        "Standard_D2s_v3",
        "Standard_D4s_v3",
        "Standard_HB120rs_v3"
      ],
      "metadata" : {
        "description" : "The size of the each Node Virtual Machine"
      }
    }
  },
  "variables" : {
    "location" : "[resourceGroup().location]",
    "virtualNetworkName" : "[concat(parameters('baseName'), '-vnet')]",
    "virtualNetworkID" : "[resourceId('Microsoft.Network/virtualNetworks', variables('virtualNetworkName'))]",
    "nodeSubnetName" : "[concat(parameters('baseName'), '-worker-subnet')]",
    "nodeSubnetRef" : "[concat(variables('virtualNetworkID'), '/subnets/', variables('nodeSubnetName'))]",
    "infraLoadBalancerName" : "[parameters('baseName')]",
    "sshKeyPath" : "/home/capi/.ssh/authorized_keys",
    "identityName" : "[concat(parameters('baseName'), '-identity')]",
    "imageName" : "[concat(parameters('baseName'), '')]",
    "copy" : [
      {
        "name" : "vmNames",
        "count" :  "[parameters('numberOfNodes')]",
        "input" : "[concat(parameters('baseName'), '-worker-', variables('location'), '-', copyIndex('vmNames', 1))]"
      }
    ]
  },
  "resources" : [
    {
      "type" : "Microsoft.Compute/availabilitySets",
      "name" : "[parameters('availabilitySetName')]",
      "apiVersion" : "2019-03-01",
      "location" : "[variables('location')]",
      "properties" : {
        "platformFaultDomainCount" : "3",
        "platformUpdateDomainCount" : "5"
      },
      "sku" : {
        "name" : "Aligned"
      }
    },
    {
      "apiVersion" : "2019-05-01",
      "name" : "[concat('node', copyIndex())]",
      "type" : "Microsoft.Resources/deployments",
      "copy" : {
        "name" : "nodeCopy",
        "count" : "[length(variables('vmNames'))]"
      },
      "dependsOn" : [
        "[resourceId('Microsoft.Compute/availabilitySets', parameters('availabilitySetName'))]"
      ],
      "properties" : {
        "mode" : "Incremental",
        "template" : {
          "$schema" : "http://schema.management.azure.com/schemas/2015-01-01/deploymentTemplate.json#",
          "contentVersion" : "1.0.0.0",
          "resources" : [
            {
              "apiVersion" : "2018-06-01",
              "type" : "Microsoft.Network/networkInterfaces",
              "name" : "[concat(variables('vmNames')[copyIndex()], '-nic')]",
              "location" : "[variables('location')]",
              "properties" : {
                "ipConfigurations" : [
                  {
                    "name" : "pipConfig",
                    "properties" : {
                      "privateIPAllocationMethod" : "Dynamic",
                      "subnet" : {
                        "id" : "[variables('nodeSubnetRef')]"
                      },
                      "loadBalancerBackendAddressPools" : [
                        {
                          "id" : "[resourceId('Microsoft.Network/loadBalancers/backendAddressPools',variables('infraLoadBalancerName'), variables('infraLoadBalancerName'))]"
                        }
                        ]
                    }
                  }
                ]
              }
            },
            {
              "apiVersion" : "2018-06-01",
              "type" : "Microsoft.Compute/virtualMachines",
              "name" : "[variables('vmNames')[copyIndex()]]",
              "location" : "[variables('location')]",
              "tags" : {
                "kubernetes.io-cluster-ffranzupi": "owned"
              },
              "identity" : {
                "type" : "userAssigned",
                "userAssignedIdentities" : {
                  "[resourceId('Microsoft.ManagedIdentity/userAssignedIdentities', variables('identityName'))]" : {}
                }
              },
              "dependsOn" : [
                "[concat('Microsoft.Network/networkInterfaces/', concat(variables('vmNames')[copyIndex()], '-nic'))]"
              ],
              "properties" : {
                "hardwareProfile" : {
                  "vmSize" : "[parameters('nodeVMSize')]"
                },
                "osProfile" : {
                  "computerName" : "[variables('vmNames')[copyIndex()]]",
                  "adminUsername" : "capi",
                  "customData" : "[parameters('workerIgnition')]",
                  "linuxConfiguration" : {
                    "disablePasswordAuthentication" : true,
                    "ssh" : {
                      "publicKeys" : [
                        {
                          "path" : "[variables('sshKeyPath')]",
                          "keyData" : "[parameters('sshKeyData')]"
                        }
                      ]
                    }
                  }
                },
                "storageProfile" : {
                  "imageReference": {
                    "id": "[resourceId('Microsoft.Compute/images', variables('imageName'))]"
                  },
                  "osDisk" : {
                    "name": "[concat(variables('vmNames')[copyIndex()],'_OSDisk')]",
                    "osType" : "Linux",
                    "createOption" : "FromImage",
                    "managedDisk": {
                      "storageAccountType": "Premium_LRS"
                    },
                    "diskSizeGB": 512
                  }
                },
                "networkProfile" : {
                  "networkInterfaces" : [
                    {
                      "id" : "[resourceId('Microsoft.Network/networkInterfaces', concat(variables('vmNames')[copyIndex()], '-nic'))]",
                      "properties": {
                        "primary": true
                      }
                    }
                  ]
                },
            "availabilitySet": {
                "id": "[resourceId('Microsoft.Compute/availabilitySets', parameters('availabilitySetName'))]"
            }
            }
            }
          ]
        }
      }
    }
  ]
}
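If you save the template to a local file first (the name 06_workers.json below is just an example of our choosing), you can sanity-check that it is well-formed JSON before pasting it into the portal editor:

```shell
# Verify the saved template parses as JSON before pasting it into the portal.
# 06_workers.json is an example filename; use whatever you saved the template as.
python3 -m json.tool 06_workers.json > /dev/null && echo "template is valid JSON"
```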

Now fill in the following template parameters.

  • Subscription
  • Resource Group
  • Region – Choose the same region used for the OpenShift cluster
  • Base Name – The cluster_id recorded above
  • Worker Ignition – The content of the ignition_base64 file created above
  • Number Of Nodes – 3
  • Ssh Key Data – the public SSH key itself
  • Availability Set Name – choose any name
  • Node VM Size – choose Standard_HB120rs_v3

Click Review + Create and then Create.
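Alternatively, the same template can be deployed from the Azure CLI rather than the portal: save the template locally (e.g. as 06_workers.json), create a parameters file like the sketch below, and run az deployment group create --resource-group <your-resource-group> --template-file 06_workers.json --parameters @workers.parameters.json. All values shown are placeholders:

```json
{
  "$schema": "https://schema.management.azure.com/schemas/2015-01-01/deploymentParameters.json#",
  "contentVersion": "1.0.0.0",
  "parameters": {
    "baseName": { "value": "<cluster_id recorded earlier>" },
    "workerIgnition": { "value": "<content of the ignition_base64 file>" },
    "numberOfNodes": { "value": 3 },
    "sshKeyData": { "value": "<public SSH key>" },
    "availabilitySetName": { "value": "<any name>" },
    "nodeVMSize": { "value": "Standard_HB120rs_v3" }
  }
}
```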
When the creation process ends, approve the new worker VMs into the cluster using the oc CLI.

  1. Get an oc login token: in the OpenShift web console, click kube:admin and then Copy Login Command.

You may need to log in again with the cluster username/password if the session timed out.
Click Display Token and copy the oc login command.


On a local machine, run the login command. Note: the oc command should already be in the standard command PATH from the cluster install. You may also need to answer "y" to the insecure-connection prompt.

  1. Run oc project default
  2. Run oc get nodes

Use the following link, which describes how to approve the pending certificate signing requests (CSRs) for the new nodes: https://docs.openshift.com/container-platform/4.7/installing/installing_azure/installing-azure-user-infra.html#installation-approve-csrs_installing-azure-user-infra. The number of new nodes should match the number of workers deployed.
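In short, the linked procedure lists the pending CSRs and approves them; the batch one-liner below is taken from the linked OpenShift documentation:

```shell
# New workers appear as Pending CSRs (each node generates a client CSR,
# then a server CSR once the first is approved).
oc get csr

# Approve all pending CSRs in one pass; re-run until the new nodes
# show up as Ready in `oc get nodes`.
oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approve
```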

The NVMesh tracer requires 5 * 4 * num_of_cpus process IDs (threads) per worker, so increase the PID limit of the workers. For a 120-core Standard_HB120rs_v3 that is 5 * 4 * 120 = 2400; use the following guide to raise the limit to 4096 to be on the safe side: How to change the value of pids_limit in OpenShift 4.x.
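As described in that guide, the change is made with a ContainerRuntimeConfig targeting the worker MachineConfigPool. A minimal sketch (the resource name is our choice):

```yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: ContainerRuntimeConfig
metadata:
  name: nvmesh-pids-limit        # name is arbitrary
spec:
  machineConfigPoolSelector:
    matchLabels:
      pools.operator.machineconfiguration.openshift.io/worker: ""
  containerRuntimeConfig:
    pidsLimit: 4096
```

Apply it with oc apply -f and wait for the worker pool to finish rolling out the change (oc get mcp worker).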
