Version: v1.5

Azure Kubernetes Service Installation

Prerequisites

  1. Azure Cloud Shell (Bash) with the environment variable ENVSUBST set to the path of an envsubst installation (Azure Cloud Shell does not allow root/sudo package installation).
  2. kubectl with the correct context selected, pointing to the AKS cluster where Luna will be deployed.
    The cluster name passed to the deploy script must match the AKS cluster name in the active kubectl context; otherwise, the deploy script will exit with an error.
  3. Helm: the Kubernetes package manager
  4. cmctl: the cert-manager command-line utility
  5. An existing AKS cluster with at least 2 nodes (required for Luna webhook replica availability), with cluster autoscaling disabled. Note that AKS has both free and standard tier clusters; please ensure your cluster tier can handle your expected load at scale.
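The tool and context prerequisites above can be checked before running the deploy script. The following is a minimal sketch, not part of the Luna distribution: the function names are illustrative, and the envsubst path in the usage comment is an assumption about where you placed the binary.

```shell
# Report which of the given CLI tools are missing from PATH.
# Prints one name per line; prints nothing when all are present.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Print the cluster name recorded in the active kubectl context.
active_cluster_name() {
  kubectl config view --minify -o jsonpath='{.clusters[0].name}'
}

# Succeed only when the active context points at the expected cluster.
context_matches() {
  [ "$(active_cluster_name)" = "$1" ]
}

# Example usage (the path is an assumption; adjust it to your setup):
#   export ENVSUBST="$HOME/bin/envsubst"
#   missing_tools kubectl helm cmctl "$ENVSUBST"
#   context_matches my-cluster || echo "kubectl context does not match" >&2
```

Running the context check first avoids starting a deployment that the deploy script would reject because of a cluster name mismatch.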

Considerations

Pod Subnet

Luna running on AKS supports specifying the pod subnet used by Dynamic Azure CNI networking for bin selection workloads. By default, Luna will use the same pod subnet as your cluster's system node pool; you can override this behavior for your workloads.

If you would like Luna to use a particular subnet (e.g., podsubnet1) that you have set up for your workload, please include the following annotation in your configuration:

annotations:
  node.elotl.co/aks-pod-subnet: "podsubnet1"
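To show the annotation in context, the sketch below writes a sample pod manifest from the shell. It rests on assumptions not confirmed by this page: that the annotation belongs in the pod's metadata, and that the elotl-luna label (borrowed from the selector used later in this guide) marks the pod for Luna. The pod name, file name, and image are placeholders.

```shell
# Write a sample pod manifest that carries the pod-subnet annotation.
# Assumptions: the annotation is set in the pod's metadata; the pod name,
# image, and elotl-luna label are illustrative placeholders.
cat > subnet-demo-pod.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: subnet-demo
  labels:
    elotl-luna: "true"
  annotations:
    node.elotl.co/aks-pod-subnet: "podsubnet1"
spec:
  containers:
    - name: app
      image: nginx
EOF

# Apply it once you have reviewed the file:
#   kubectl apply -f subnet-demo-pod.yaml
```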

Managed Identity Authentication Setup

As outlined in Step 2 below, Luna supports two Azure authentication techniques to provide access to an account with the permissions Luna needs to perform its AKS cluster scaling operations.

To use managed identity authentication, define a user-assigned managed identity and grant it the required permissions. At Luna deployment time, you'll provide that managed identity's name in an environment variable and its client id as a parameter. You can create a user-assigned managed identity as shown below:

az identity create --name <user-assigned-identity-name> --resource-group <resource-group-name> --location <cluster-location> --subscription <subscription-id>

Then grant it the "Contributor" role on both of your cluster's resource groups (the cluster resource group and the node resource group) via:

az role assignment create --assignee <user-assigned-identity-principalId> --role "Contributor" --scope /subscriptions/<subscription-id>/resourceGroups/<resource-group-name>
az role assignment create --assignee <user-assigned-identity-principalId> --role "Contributor" --scope /subscriptions/<subscription-id>/resourceGroups/<node-resource-group-name>
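The two role assignments need the identity's principal ID and the name of the node resource group, which you may not have at hand. The sketch below looks both up with az and then creates the assignments. It assumes the managed identity lives in the same resource group as the cluster; the function names are illustrative.

```shell
# Build the scope string for a resource group.
rg_scope() {
  printf '/subscriptions/%s/resourceGroups/%s' "$1" "$2"
}

# Look up the principal ID of a user-assigned managed identity.
identity_principal_id() {
  az identity show --name "$1" --resource-group "$2" --query principalId -o tsv
}

# Look up the node resource group that AKS created for the cluster.
node_resource_group() {
  az aks show --name "$1" --resource-group "$2" --query nodeResourceGroup -o tsv
}

# Grant Contributor on the cluster and node resource groups.
# Arguments: subscription-id resource-group cluster-name identity-name
grant_luna_contributor() {
  principal=$(identity_principal_id "$4" "$2") || return 1
  node_rg=$(node_resource_group "$3" "$2") || return 1
  for rg in "$2" "$node_rg"; do
    az role assignment create --assignee "$principal" --role "Contributor" \
      --scope "$(rg_scope "$1" "$rg")" || return 1
  done
}

# Example usage:
#   grant_luna_contributor <subscription-id> <resource-group-name> <cluster-name> <user-assigned-identity-name>
```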

To allow managed identity authentication to work in an AKS cluster, Luna uses Azure’s Workload Identity service for authentication. The AKS cluster must have the workload identity and OIDC issuer features enabled. You can enable these features at AKS cluster creation time or you can add them to an existing AKS cluster via:

az aks update -n <cluster-name> -g <resource-group-name> --enable-oidc-issuer --enable-workload-identity
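Before deploying, you may want to confirm that both features are active on the cluster. The following is a minimal sketch using az aks show; the function names are illustrative, and the check treats an issuer URL starting with https:// as evidence that the OIDC issuer is enabled.

```shell
# Print the OIDC issuer URL of the cluster.
oidc_issuer_url() {
  az aks show --name "$1" --resource-group "$2" \
    --query "oidcIssuerProfile.issuerUrl" -o tsv
}

# Print the enabled flag of the workload identity feature.
workload_identity_enabled() {
  az aks show --name "$1" --resource-group "$2" \
    --query "securityProfile.workloadIdentity.enabled" -o tsv
}

# Succeed only when an issuer URL is present and workload identity is enabled.
# Arguments: cluster-name resource-group
workload_identity_ready() {
  case "$(oidc_issuer_url "$1" "$2")" in
    https://*) ;;
    *) return 1 ;;
  esac
  case "$(workload_identity_enabled "$1" "$2")" in
    true|True) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage:
#   workload_identity_ready <cluster-name> <resource-group-name> || echo "enable OIDC issuer and workload identity first" >&2
```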

Step 1 (Optional): Install NVIDIA GPU driver

If you plan to run GPU workloads, install the NVIDIA device plugin:

kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.17.0/deployments/static/nvidia-device-plugin.yml
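Once a GPU node is running, one way to confirm that the device plugin is advertising GPUs is to list each node's allocatable nvidia.com/gpu count. This is a sketch; the function name is illustrative.

```shell
# List each node with its allocatable NVIDIA GPU count.
# Nodes that advertise no GPUs show <none> in the GPU column.
gpu_allocatable() {
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
}

# Example usage:
#   gpu_allocatable
```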

Step 2: Deploy Luna

Luna requires cert-manager to be installed/running in the cluster.
The deploy script detects an existing cert-manager installation; if none is present, it installs cert-manager into the cert-manager namespace.

By default, the script will use the namespace "elotl" and the release name "elotl-luna" for deployment. However, you can override these defaults using the --namespace and --helm-release-name options, respectively.

To perform AKS cluster scaling, Luna requires create, read, update, and delete access to node pools in the AKS cluster resource group, read access to VM SKUs, and read/update access to VM scale sets.

To provide Luna with access to an account with the appropriate permissions, choose one of these two Azure authentication methods:

  • Client secret: pass the client secret using the --client-secret option.
  • Managed identity: pass the managed identity name using the --identity-name option. See Managed Identity Authentication Setup above.

Note: You can specify either a client secret or a managed identity, but not both.

Luna provisions and manages only the nodes it creates. Existing nodes in the cluster are not modified or removed.

You can then run the following command to deploy Luna into your AKS cluster:

./deploy.sh \
--name <cluster-name> \
--resource-group <resource-group-name> \
--location <cluster-location> \
--subscription <subscription-id> \
--tenant <tenant-id> \
--id <client-id> \
(--identity-name <managed-identity> | --client-secret <client-secret>) \
[--helm-release-name <release-name>] \
[--namespace <namespace>] \
[--additional-helm-values "<additional-helm-values>"]
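Because the synopsis accepts exactly one of the two authentication options, a small wrapper can enforce that rule before the deploy script runs. The helper below is a hypothetical sketch (luna_auth_args is not part of Luna), and the values in the usage comment are placeholders.

```shell
# Print the authentication option for deploy.sh.
# Arguments: identity-name client-secret (exactly one must be non-empty)
luna_auth_args() {
  if [ -n "$1" ] && [ -n "$2" ]; then
    echo "error: pass either an identity name or a client secret, not both" >&2
    return 1
  elif [ -n "$1" ]; then
    printf '%s %s\n' '--identity-name' "$1"
  elif [ -n "$2" ]; then
    printf '%s %s\n' '--client-secret' "$2"
  else
    echo "error: an identity name or a client secret is required" >&2
    return 1
  fi
}

# Example: managed identity authentication with the default namespace
# and release name (all values are placeholders):
#   auth=$(luna_auth_args luna-identity "") || exit 1
#   ./deploy.sh --name my-cluster --resource-group my-rg --location eastus \
#     --subscription "$SUBSCRIPTION_ID" --tenant "$TENANT_ID" --id "$CLIENT_ID" \
#     $auth
```

The unquoted $auth relies on word splitting, so this sketch only suits identity names and secrets without whitespace.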

This command generates:

  • <aks-cluster-name>_values.yaml
  • <aks-cluster-name>_<helm-release-name>_values_full.yaml

These files are useful to retain as a reference or backup for future upgrades.

Note
On AKS, metrics-server pods in the kube-system namespace may prevent nodes from scaling down because they mount local storage (EmptyDir).
This happens because scaleDown.skipNodesWithLocalStorage is true by default. To allow such nodes to scale down, include:

--set scaleDown.skipNodesWithLocalStorage=false

in <additional-helm-values>.

Step 3: Verify Luna installation

Run:

kubectl get all -n elotl

Sample Output

NAME                                      READY   STATUS    RESTARTS   AGE
pod/elotl-luna-manager-6bd7f4674d-cxwz6   1/1     Running   0          2m39s
pod/elotl-luna-webhook-7fcf5998b6-ltrd6   1/1     Running   0          2m39s
pod/elotl-luna-webhook-7fcf5998b6-svr6b   1/1     Running   0          2m39s

NAME                         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
service/elotl-luna-manager   ClusterIP   x.x.x.x      <none>        9090/TCP   2m39s
service/elotl-luna-webhook   ClusterIP   x.x.x.x      <none>        8443/TCP   2m39s

NAME                                 READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/elotl-luna-manager   1/1     1            1           2m39s
deployment.apps/elotl-luna-webhook   2/2     2            2           2m39s

NAME                                            DESIRED   CURRENT   READY   AGE
replicaset.apps/elotl-luna-manager-6bd7f4674d   1         1         1       2m39s
replicaset.apps/elotl-luna-webhook-7fcf5998b6   2         2         2       2m39s
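Instead of reading the listing by eye, you can block until both deployments finish rolling out. The sketch below assumes the deployments are named <release-name>-manager and <release-name>-webhook, as the sample output suggests for the default release name; the function name and the 180-second timeout are arbitrary choices.

```shell
# Wait until both Luna deployments report a completed rollout.
# Arguments: namespace (default: elotl), release name (default: elotl-luna)
wait_for_luna() {
  ns="${1:-elotl}"
  release="${2:-elotl-luna}"
  for component in manager webhook; do
    kubectl rollout status "deployment/${release}-${component}" \
      --namespace "$ns" --timeout=180s || return 1
  done
}

# Example usage:
#   wait_for_luna            # default namespace and release name
#   wait_for_luna my-ns my-release
```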

Step 4: Test Luna functionality

Follow the tutorial to validate the value provided by Luna.

Step 5: Observe pod placement and node scaling

While testing, observe pod scheduling, dynamic node creation, and removal:

kubectl get pods --selector=elotl-luna=true -o wide -w
kubectl get nodes -w

Upgrade

When running the upgrade command described below, set <retained-values-file> to <retained-path>/<cluster-name>_<helm-release-name>_values_full.yaml.
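The retained file path follows a fixed pattern, so it can be composed and checked before the upgrade starts. This is a sketch with illustrative function names; it defaults the release name to elotl-luna, the deploy script's default.

```shell
# Compose the path of the full values file written by the deploy script.
# Arguments: retained-path cluster-name [helm-release-name]
retained_values_file() {
  printf '%s/%s_%s_values_full.yaml\n' "$1" "$2" "${3:-elotl-luna}"
}

# Succeed only when the retained file exists and is non-empty.
retained_values_ok() {
  [ -s "$(retained_values_file "$@")" ]
}

# Example usage:
#   retained_values_ok ./backup my-cluster || echo "retained values file not found" >&2
```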

To upgrade an existing Luna deployment that uses managed identity authentication, run:

helm upgrade elotl-luna <chart-path> --wait --namespace=<cluster-namespace> --values=<retained-values-file> <additional-helm-values(optional)>

To upgrade an existing Luna deployment that uses client secret authentication, run:

helm upgrade elotl-luna <chart-path> --wait --namespace=<cluster-namespace> --values=<retained-values-file> --set azure.clientSecret="<client-secret>" <additional-helm-values(optional)>

For example, to upgrade my-cluster from luna-v1.5.1 to luna-v1.5.2 and set an additional helm value binPackingNodeCpu=2, run:

helm upgrade elotl-luna ./elotl-luna-v1.5.2.tgz --wait --namespace=elotl --values=../../luna-v1.5.2/aks/my-cluster_values.yaml --set binPackingNodeCpu=2

And validate the upgrade as follows:

helm ls -A
NAME         NAMESPACE   REVISION   UPDATED                                STATUS     CHART               APP VERSION
elotl-luna   elotl       4          2026-01-14 14:15:30.686251 -0700 PDT   deployed   elotl-luna-v1.5.2   v1.5.2

Cleanup

Warning

Uninstalling Luna without first removing the pods it placed on Luna-managed nodes may leave orphaned nodes behind. We recommend deleting all pods running on Luna-managed nodes before uninstalling Luna.
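To check whether any such pods remain before uninstalling, you can count them by label. The sketch below assumes that pods placed by Luna carry the elotl-luna=true label, as in the selector used in Step 5; if your deployment uses a different label, adjust the selector. The function names are illustrative.

```shell
# Count pods that carry the elotl-luna=true label across all namespaces.
luna_pod_count() {
  kubectl get pods --all-namespaces --selector=elotl-luna=true --no-headers 2>/dev/null |
    wc -l | tr -d ' '
}

# Succeed only when no such pods remain.
safe_to_uninstall() {
  [ "$(luna_pod_count)" -eq 0 ]
}

# Example usage:
#   safe_to_uninstall || echo "Luna-placed pods are still running" >&2
```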

To remove Luna manager’s Helm chart, run the following:

helm uninstall elotl-luna --namespace=elotl
kubectl delete namespace elotl

If Luna is uninstalled before all Luna-allocated nodes have been scaled down and removed, finalizers may remain on those nodes and prevent deletion. If this occurs, follow the finalizer cleanup instructions.

If Luna created a node pool that was not removed before Luna was uninstalled, you may want to delete it manually.
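Manual node pool removal can be done with the az aks nodepool commands. The sketch below lists the cluster's node pools and deletes one by name. This page does not say how Luna names the pools it creates, so you need to identify the leftover pool yourself; the function names are illustrative.

```shell
# List node pool names in the cluster so leftover pools can be identified.
# Arguments: cluster-name resource-group
list_node_pools() {
  az aks nodepool list --cluster-name "$1" --resource-group "$2" \
    --query "[].name" -o tsv
}

# Delete one node pool by name.
# Arguments: cluster-name resource-group nodepool-name
delete_node_pool() {
  az aks nodepool delete --cluster-name "$1" --resource-group "$2" --name "$3"
}

# Example usage:
#   list_node_pools <cluster-name> <resource-group-name>
#   delete_node_pool <cluster-name> <resource-group-name> <nodepool-name>
```

Take care not to delete the system node pool or pools you created yourself.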