Azure Kubernetes Service Installation
Prerequisites
- Azure Cloud Shell (Bash) with the environment variable ENVSUBST set to the path of an envsubst installation (Azure Cloud Shell does not allow root/sudo package installation).
- kubectl with the correct context selected, pointing to the AKS cluster where Luna will be deployed. The cluster name passed to the deploy script must match the AKS cluster name in the active kubectl context; otherwise, the deploy script will exit with an error.
- Helm: the Kubernetes package manager
- cmctl: the cert-manager command-line utility
- An existing AKS cluster with at least 2 nodes (required for Luna webhook replica availability), with cluster autoscaling disabled. Note that AKS has both free and standard tier clusters; please ensure your cluster tier can handle your expected load at scale.
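The prerequisites above can be checked before running the deploy script. The following is a minimal sketch, not part of the Luna distribution; the function names missing_tools and envsubst_ok are hypothetical helpers introduced here for illustration.

```shell
# Preflight sketch (hypothetical helpers, not shipped with Luna).

missing_tools() {
  # Print the name of every argument that is not found on PATH.
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

envsubst_ok() {
  # ENVSUBST must be set and must point to an executable file.
  [ -n "${ENVSUBST:-}" ] && [ -x "$ENVSUBST" ]
}

# Usage:
#   missing_tools az kubectl helm cmctl
#   envsubst_ok || echo "ENVSUBST is unset or not executable" >&2
#   kubectl config current-context   # should name the target AKS cluster
```

An empty result from missing_tools means every listed tool was found.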
Considerations
Pod Subnet
Luna running on AKS supports specifying the pod subnet used by Dynamic Azure CNI networking for bin selection workloads. By default, Luna will use the same pod subnet as your cluster's system node pool; you can override this behavior for your workloads.
If you would like Luna to use a particular subnet (e.g., podsubnet1) that you have set up for your workload, please include the following annotation in your configuration:
annotations:
  node.elotl.co/aks-pod-subnet: "podsubnet1"
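As an illustration of where the annotation might sit, the sketch below writes a sample pod manifest to a file. It assumes the annotation belongs under the pod's metadata and that the pod is marked for Luna with the elotl-luna=true label (the selector used in Step 5); the pod name, image, and file name are placeholders, not values from the Luna documentation.

```shell
# Write a sample pod manifest carrying the pod-subnet annotation.
# Assumptions: annotation placement under metadata, the elotl-luna label,
# and all names below are illustrative only.
cat > pod-subnet-example.yaml <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: subnet-demo
  labels:
    elotl-luna: "true"
  annotations:
    node.elotl.co/aks-pod-subnet: "podsubnet1"
spec:
  containers:
    - name: app
      image: nginx
EOF

# Apply it with: kubectl apply -f pod-subnet-example.yaml
```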
Managed Identity Authentication Setup
As outlined in Step 2 below, Luna supports two Azure authentication techniques to provide access to an account with the permissions Luna needs to perform its AKS cluster scaling operations.
To use managed identity authentication, define a user-assigned managed identity and grant it the required permissions. At Luna deployment time, you'll provide that managed identity's name in an environment variable and its client ID as a parameter. You can create a user-assigned managed identity as shown below:
az identity create --name <user-assigned-identity-name> --resource-group <resource-group-name> --location <cluster-location> --subscription <subscription-id>
You can then grant it "Contributor" access to both of your cluster's resource groups (the cluster resource group and the node resource group) via:
az role assignment create --assignee <user-assigned-identity-principalId> --role "Contributor" --scope /subscriptions/<subscription-id>/resourceGroups/<resource-group-name>
az role assignment create --assignee <user-assigned-identity-principalId> --role "Contributor" --scope /subscriptions/<subscription-id>/resourceGroups/<node-resource-group-name>
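The placeholders in the commands above (the identity's principal ID and client ID, and the node resource group name) can usually be looked up with the Azure CLI. The wrapper functions below are hypothetical conveniences; the az subcommands and query fields they use are standard Azure CLI.

```shell
# Hypothetical lookup helpers around standard Azure CLI commands.

identity_principal_id() {
  # $1 = identity name, $2 = resource group
  az identity show --name "$1" --resource-group "$2" --query principalId --output tsv
}

identity_client_id() {
  # $1 = identity name, $2 = resource group
  az identity show --name "$1" --resource-group "$2" --query clientId --output tsv
}

node_resource_group() {
  # $1 = cluster name, $2 = cluster resource group
  az aks show --name "$1" --resource-group "$2" --query nodeResourceGroup --output tsv
}

# Usage:
#   principal_id=$(identity_principal_id <user-assigned-identity-name> <resource-group-name>)
#   node_rg=$(node_resource_group <cluster-name> <resource-group-name>)
```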
To allow managed identity authentication to work in an AKS cluster, Luna uses Azure’s Workload Identity service for authentication. The AKS cluster must have the workload identity and OIDC issuer features enabled. You can enable these features at AKS cluster creation time or you can add them to an existing AKS cluster via:
az aks update -n <cluster-name> -g <resource-group-name> --enable-oidc-issuer --enable-workload-identity
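To confirm that both features are active after the update, the cluster's properties can be queried. This is a sketch with hypothetical function names; it assumes the cluster resource exposes the oidcIssuerProfile.issuerUrl and securityProfile.workloadIdentity.enabled fields, as current Azure CLI versions report them.

```shell
# Hypothetical verification helpers; query paths assume a recent Azure CLI.

aks_oidc_issuer_url() {
  # $1 = cluster name, $2 = resource group; prints the issuer URL if enabled
  az aks show --name "$1" --resource-group "$2" \
    --query "oidcIssuerProfile.issuerUrl" --output tsv
}

aks_workload_identity_enabled() {
  # $1 = cluster name, $2 = resource group; prints "true" if enabled
  az aks show --name "$1" --resource-group "$2" \
    --query "securityProfile.workloadIdentity.enabled" --output tsv
}

# Usage:
#   aks_oidc_issuer_url <cluster-name> <resource-group-name>
#   aks_workload_identity_enabled <cluster-name> <resource-group-name>
```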
Step 1 (Optional): Install NVIDIA GPU driver
If you plan to run GPU workloads, install the NVIDIA device plugin:
kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.17.0/deployments/static/nvidia-device-plugin.yml
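Once GPU nodes exist in the cluster, one way to check that the plugin is advertising GPUs is to list each node's allocatable nvidia.com/gpu resource. The function name below is a hypothetical helper; nodes without GPUs are expected to show no value in the GPU column.

```shell
# Hypothetical helper: show allocatable GPUs per node.
gpu_capacity_by_node() {
  # The backslash escapes the dot inside the resource name nvidia.com/gpu.
  kubectl get nodes \
    -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
}

# Usage:
#   gpu_capacity_by_node
```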
Step 2: Deploy Luna
Luna requires cert-manager to be installed and running in the cluster.
The deploy script checks for an existing cert-manager installation and, if none is found, installs cert-manager into the cert-manager namespace.
By default, the script will use the namespace "elotl" and the release name "elotl-luna" for deployment. However, you can override these defaults using the --namespace and --helm-release-name options, respectively.
To perform AKS cluster scaling, Luna requires create, read, update, and delete access to node pools in the AKS cluster resource group, read access to VM SKUs, and read/update access to VM scale sets.
To provide Luna with access to an account with the appropriate permissions, choose one of these two Azure authentication methods:
- Client secret: pass the client secret using the --client-secret option.
- Managed identity: pass the managed identity name using the --identity-name option. See Managed Identity Authentication Setup above.
Note: You can specify either a client secret or a managed identity, but not both.
Luna provisions and manages only the nodes it creates. Existing nodes in the cluster are not modified or removed.
You can then run the following command to deploy Luna into your AKS cluster:
./deploy.sh \
--name <cluster-name> \
--resource-group <resource-group-name> \
--location <cluster-location> \
--subscription <subscription-id> \
--tenant <tenant-id> \
--id <client-id> \
(--identity-name <managed-identity> | --client-secret <client-secret>) \
[--helm-release-name <release-name>] \
[--namespace <namespace>] \
[--additional-helm-values "<additional-helm-values>"]
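Because exactly one of the two authentication options may be given, a wrapper script might validate its inputs before calling deploy.sh. The sketch below is a hypothetical helper, not part of the Luna distribution; the variable names and sample values in the usage comment are placeholders.

```shell
# Hypothetical helper: choose the authentication flag for deploy.sh.
luna_auth_flag() {
  # $1 = managed identity name (may be empty), $2 = client secret (may be empty)
  if [ -n "$1" ] && [ -n "$2" ]; then
    echo "error: specify a managed identity or a client secret, not both" >&2
    return 1
  elif [ -n "$1" ]; then
    printf '%s\n' "--identity-name"
  elif [ -n "$2" ]; then
    printf '%s\n' "--client-secret"
  else
    echo "error: a managed identity or a client secret is required" >&2
    return 1
  fi
}

# Usage with a managed identity (all values are placeholders):
#   flag=$(luna_auth_flag "$IDENTITY_NAME" "") || exit 1
#   ./deploy.sh --name my-cluster --resource-group my-rg --location westus2 \
#     --subscription "$SUBSCRIPTION_ID" --tenant "$TENANT_ID" --id "$CLIENT_ID" \
#     "$flag" "$IDENTITY_NAME"
```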
This command generates:
- <aks-cluster-name>_values.yaml
- <aks-cluster-name>_<helm-release-name>_values_full.yaml
These files are useful to retain as a reference or backup for future upgrades.
Note
On AKS, metrics-server pods in the kube-system namespace may prevent nodes from scaling down because they mount local storage (EmptyDir).
Since scaleDown.skipNodesWithLocalStorage is true by default, you can disable this behavior by including --set scaleDown.skipNodesWithLocalStorage=false in <additional-helm-values>.
Step 3: Verify Luna installation
Run:
kubectl get all -n elotl
Sample Output
NAME READY STATUS RESTARTS AGE
pod/elotl-luna-manager-6bd7f4674d-cxwz6 1/1 Running 0 2m39s
pod/elotl-luna-webhook-7fcf5998b6-ltrd6 1/1 Running 0 2m39s
pod/elotl-luna-webhook-7fcf5998b6-svr6b 1/1 Running 0 2m39s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/elotl-luna-manager ClusterIP x.x.x.x <none> 9090/TCP 2m39s
service/elotl-luna-webhook ClusterIP x.x.x.x <none> 8443/TCP 2m39s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/elotl-luna-manager 1/1 1 1 2m39s
deployment.apps/elotl-luna-webhook 2/2 2 2 2m39s
NAME DESIRED CURRENT READY AGE
replicaset.apps/elotl-luna-manager-6bd7f4674d 1 1 1 2m39s
replicaset.apps/elotl-luna-webhook-7fcf5998b6 2 2 2 2m39s
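Instead of inspecting the listing by eye, a script can block until both deployments report ready. This sketch assumes the deployment names follow the pattern <helm-release-name>-manager and <helm-release-name>-webhook, as in the sample output above; the function name and the 180-second timeout are arbitrary choices for illustration.

```shell
# Hypothetical helper: wait for the Luna manager and webhook rollouts.
wait_for_luna() {
  # $1 = namespace (default: elotl), $2 = Helm release name (default: elotl-luna)
  local ns="${1:-elotl}"
  local release="${2:-elotl-luna}"
  local component
  for component in manager webhook; do
    kubectl rollout status "deployment/${release}-${component}" \
      --namespace "$ns" --timeout=180s || return 1
  done
}

# Usage:
#   wait_for_luna                 # default namespace and release name
#   wait_for_luna my-ns my-luna   # custom values
```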
Step 4: Test Luna functionality
Follow the tutorial to validate the value provided by Luna.
Step 5: Observe pod placement and node scaling
While testing, observe pod scheduling, dynamic node creation, and removal:
kubectl get pods --selector=elotl-luna=true -o wide -w
kubectl get nodes -w
Upgrade
When running the upgrade command described below, set <retained-values-file> to <retained-path>/<cluster-name>_<helm-release-name>_values_full.yaml, the full values file generated by the deploy script.
To upgrade an existing Luna deployment if using managed identity, run:
helm upgrade elotl-luna <chart-path> --wait --namespace=<cluster-namespace> --values=<retained-values-file> <additional-helm-values(optional)>
To upgrade an existing Luna deployment if using client secret, run:
helm upgrade elotl-luna <chart-path> --wait --namespace=<cluster-namespace> --values=<retained-values-file> --set azure.clientSecret="<client-secret>" <additional-helm-values(optional)>
For example, to upgrade my-cluster from luna-v1.5.1 to luna-v1.5.2 and set an additional helm value binPackingNodeCpu=2, run:
helm upgrade elotl-luna ./elotl-luna-v1.5.2.tgz --wait --namespace=elotl --values=../../luna-v1.5.2/aks/my-cluster_values.yaml --set binPackingNodeCpu=2
And validate the upgrade as follows:
helm ls -A
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
elotl-luna elotl 4 2026-01-14 14:15:30.686251 -0700 PDT deployed elotl-luna-v1.5.2 v1.5.2
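Before upgrading, it can help to confirm that the retained values file is where the upgrade command expects it. The helper below is a hypothetical sketch that builds the file name from the pattern described at the start of this section; the path in the usage comment is a placeholder.

```shell
# Hypothetical helper: build the path of the retained full values file.
retained_values_file() {
  # $1 = retained path, $2 = cluster name, $3 = Helm release name
  printf '%s/%s_%s_values_full.yaml\n' "$1" "$2" "$3"
}

# Usage:
#   values_file=$(retained_values_file /path/to/retained my-cluster elotl-luna)
#   [ -f "$values_file" ] || echo "missing: $values_file" >&2
#   helm history elotl-luna --namespace elotl   # review revisions after upgrading
```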
Cleanup
Warning
Uninstalling Luna without first removing the pods it placed on Luna-managed nodes may leave orphaned nodes behind.
We recommend deleting all pods running on Luna-managed nodes, and waiting for those nodes to be removed, before uninstalling Luna.
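To see whether any such pods remain, one option is to count pods carrying the elotl-luna=true label across all namespaces. This sketch assumes that label identifies the pods Luna placed (it is the selector used in Step 5); the function name is hypothetical.

```shell
# Hypothetical helper: count pods carrying the Luna label in all namespaces.
count_luna_pods() {
  # With no matching pods, kubectl writes only to stderr, so the count is 0.
  kubectl get pods --all-namespaces --selector=elotl-luna=true --no-headers \
    2>/dev/null | grep -c . || true
}

# Usage:
#   count_luna_pods   # uninstall once this prints 0 and Luna nodes are gone
```

Pods managed by a Deployment or Job are recreated when deleted directly, so removing the owning workload is usually the more reliable route.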
To remove the Luna Helm release and its namespace, run the following:
helm uninstall elotl-luna --namespace=elotl
kubectl delete namespace elotl
If Luna is uninstalled before all Luna-allocated nodes have been scaled down and removed, finalizers may remain on those nodes and prevent deletion. If this occurs, follow the finalizer cleanup instructions.
If Luna created a node pool that was not removed before Luna was uninstalled, you may want to delete it manually.
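Leftover node pools can be listed and removed with the Azure CLI. The wrapper functions below are hypothetical; how to tell a Luna-created pool from your own is not covered here, so review the list carefully before deleting anything.

```shell
# Hypothetical helpers around standard Azure CLI node pool commands.

list_node_pools() {
  # $1 = cluster name, $2 = resource group
  az aks nodepool list --cluster-name "$1" --resource-group "$2" --output table
}

delete_node_pool() {
  # $1 = cluster name, $2 = resource group, $3 = node pool name
  az aks nodepool delete --cluster-name "$1" --resource-group "$2" --name "$3"
}

# Usage:
#   list_node_pools <cluster-name> <resource-group-name>
#   delete_node_pool <cluster-name> <resource-group-name> <node-pool-name>
```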