Version: v1.5

Elastic Kubernetes Service Installation

Prerequisites

  1. AWS CLI installed and configured
  2. kubectl with the correct context selected, pointing to the EKS cluster where Luna will be deployed.
    The cluster name passed to the deploy script must match the EKS cluster name in the active kubectl context; otherwise, the deploy script will exit with an error.
  3. Helm: the Kubernetes package manager
  4. eksctl >= v0.202.0, used to manage the EKS OpenID Connect (OIDC) provider
  5. cmctl: the cert-manager command-line utility
  6. An existing EKS cluster with at least 2 nodes (required for Luna webhook replica availability).
    If you do not already have a cluster, you can create one with eksctl:
    eksctl --region=... create cluster --name=...
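The context check in prerequisite 2 can be sketched as a small shell helper. This is a sketch, not part of deploy.sh; it assumes the context was created by aws eks update-kubeconfig (contexts are then ARNs ending in the cluster name), so adjust if you use custom context names:

```shell
# Extract the cluster name from the active kubectl context so it can be
# compared with the --name you will pass to deploy.sh. Contexts created by
# `aws eks update-kubeconfig` are ARNs of the form
# arn:aws:eks:<region>:<account>:cluster/<name>; keep the last segment.
context_cluster_name() {
  printf '%s\n' "${1##*/}"
}

# Print the cluster name of the active context, if kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  ctx="$(kubectl config current-context 2>/dev/null || true)"
  if [ -n "$ctx" ]; then
    echo "active context cluster: $(context_cluster_name "$ctx")"
  fi
fi
```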

Step 1 (Optional): Install NVIDIA GPU driver

If you plan to run GPU workloads, install the NVIDIA device plugin:

kubectl create -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/v0.17.0/deployments/static/nvidia-device-plugin.yml

Step 2: Deploy Luna

Luna requires cert-manager to be installed/running in the cluster.
The deploy script detects an existing cert-manager installation; if none is present, it installs cert-manager into the cert-manager namespace.

By default, the script will use the namespace "elotl" and the release name "elotl-luna" for deployment. However, you can override these defaults using the --namespace and --helm-release-name options, respectively.

Luna provisions and manages only the nodes it creates. Existing nodes in the cluster are not modified or removed.

You can then run the following command to deploy Luna into your EKS cluster:


./deploy.sh --name <cluster-name> \
  --region <compute-region> \
  [--helm-release-name <release-name>] \
  [--namespace <namespace>] \
  [--additional-helm-values "<additional-helm-values>"]
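The optional flags can be appended conditionally; a minimal sketch, where "my-cluster" and "us-east-1" are hypothetical placeholders for your cluster name and region:

```shell
# Sketch: assemble the deploy invocation, appending the optional flags
# only when you override the defaults. Cluster and region values here
# are hypothetical placeholders.
cluster="my-cluster"
region="us-east-1"
namespace=""   # set to override the default namespace "elotl"
release=""     # set to override the default release name "elotl-luna"

cmd="./deploy.sh --name $cluster --region $region"
if [ -n "$namespace" ]; then cmd="$cmd --namespace $namespace"; fi
if [ -n "$release" ]; then cmd="$cmd --helm-release-name $release"; fi
echo "$cmd"   # → ./deploy.sh --name my-cluster --region us-east-1
```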

This command generates:

  • <eks-cluster-name>_values.yaml
  • <eks-cluster-name>_<helm-release-name>_values_full.yaml

These files are useful to retain as a reference or backup for future upgrades.
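Retaining those files might look like the following sketch; the backups/ destination and the example filenames are hypothetical, so substitute your own paths:

```shell
# Copy the generated values files to a retained location, for use as a
# reference or as --values input for future `helm upgrade` runs.
retain_values() {
  dest="$1"; shift
  mkdir -p "$dest" && cp -- "$@" "$dest"/
}

# Example (run from the directory where deploy.sh wrote the files):
# retain_values backups/my-cluster \
#   my-cluster_values.yaml my-cluster_elotl-luna_values_full.yaml
```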

Step 3: Verify Luna installation

Run:

kubectl get all -n elotl

Sample output

NAME                                READY   STATUS    RESTARTS   AGE
pod/luna-manager-5d8578565d-86jwc   1/1     Running   0          56s
pod/luna-webhook-58b7b5dcfb-dwpcb   1/1     Running   0          56s
pod/luna-webhook-58b7b5dcfb-xmlds   1/1     Running   0          56s

NAME                   TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
service/luna-webhook   ClusterIP   x.x.x.x      <none>        8443/TCP   57s

NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/luna-manager   1/1     1            1           57s
deployment.apps/luna-webhook   2/2     2            2           57s

NAME                                      DESIRED   CURRENT   READY   AGE
replicaset.apps/luna-manager-5d8578565d   1         1         1       57s
replicaset.apps/luna-webhook-58b7b5dcfb   2         2         2       57s

Step 4: Test Luna functionality

Follow the tutorial to validate the functionality Luna provides.

Step 5: Observe pod placement and node scaling

While testing, observe pod scheduling, dynamic node creation, and removal:

kubectl get pods --selector=elotl-luna=true -o wide -w
kubectl get nodes -w

Upgrade

When running the upgrade command described below, set <retained-values-file> to <retained-path>/<cluster-name>_<helm-release-name>_values_full.yaml.
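Constructing that path in shell might look like this sketch; the cluster name, release name, and backups/ directory are hypothetical placeholders:

```shell
# Build the <retained-values-file> path from the cluster and release
# names used at deploy time (values below are hypothetical).
cluster="my-cluster"
release="elotl-luna"
retained_values="backups/${cluster}_${release}_values_full.yaml"
echo "$retained_values"   # → backups/my-cluster_elotl-luna_values_full.yaml
```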

To upgrade an existing Luna deployment, run:

helm upgrade elotl-luna <chart-path> --wait --namespace=<cluster-namespace> --values=<retained-values-file> <additional-helm-values(optional)>

For example, to upgrade my-cluster from luna-v1.5.1 to luna-v1.5.2 and set the additional Helm value binPackingNodeCpu=2, run:

helm upgrade elotl-luna ./elotl-luna-v1.5.2.tgz --wait --namespace=elotl --values=../../luna-v1.5.2/eks/my-cluster_values.yaml --set binPackingNodeCpu=2

And validate the upgrade as follows:

helm ls -A
NAME         NAMESPACE   REVISION   UPDATED                                STATUS     CHART               APP VERSION
elotl-luna   elotl       4          2026-01-14 14:15:30.686251 -0700 PDT   deployed   elotl-luna-v1.5.2   v1.5.2

Cleanup

Warning

Uninstalling Luna while pods are still running on Luna-managed nodes may leave those nodes orphaned. Delete all pods running on Luna-managed nodes before uninstalling Luna.

The uninstall.sh script does not remove orphaned nodes, to avoid accidentally disrupting critical workloads. These nodes can be cleaned up as described below.

To remove the Luna manager’s Helm chart and the custom AWS resources created to run Luna, execute the uninstall script:

./uninstall.sh <cluster-name> <region>

This will not remove leftover nodes that the Luna manager hasn’t scaled down yet. To list the orphaned nodes’ instance IDs, run the following command, replacing <eks-cluster-name> with the name of the cluster:

aws ec2 describe-instances \
  --filters Name=tag:elotl.co/nodeless-cluster/name/<eks-cluster-name>,Values=owned \
  --query "Reservations[*].Instances[*].[InstanceId]" \
  --output text

To ensure that all nodes managed by the Luna manager are deleted, execute the following command, replacing <eks-cluster-name> with the name of the cluster:

aws ec2 terminate-instances --instance-ids \
  $(aws ec2 describe-instances \
    --filters Name=tag:elotl.co/nodeless-cluster/name/<eks-cluster-name>,Values=owned \
    --query "Reservations[*].Instances[*].[InstanceId]" \
    --output text)

Note that all the pods running on these nodes will be forcefully terminated.

To delete the Luna manager and webhook from the cluster while preserving the AWS resources, execute the following:

helm uninstall elotl-luna --namespace=elotl
kubectl delete namespace elotl

If you uninstall the Helm chart instead of running uninstall.sh, ensure that all orphaned nodes have been cleaned up as described above.

If Luna is uninstalled before all Luna-allocated nodes have been scaled down and removed, finalizers may remain on those nodes and prevent deletion. If this occurs, follow the finalizer cleanup instructions.

Notes

Security Groups

Security Groups act as virtual firewalls for EC2 instances to control incoming and outgoing traffic. If a required security group rule is missing, Luna may be unable to attach nodes to the EKS cluster or nodes may fail to run pods and services.

To ensure that all security groups required by EKS are applied to Luna-managed nodes, we tag security groups with the key elotl.co/nodeless-cluster/name and the cluster name as the value. When it starts, Luna queries which security groups are required and adds them to the nodes it provisions.

When Luna is deployed, the default EKS security groups are automatically tagged. If you wish to tag another security group you can use awscli to add the tags to the security group:

aws --region=<region> \
  ec2 create-tags \
  --resources <security-group-id> \
  --tags "Key=elotl.co/nodeless-cluster/name,Value=<cluster_name>"

Once tagged, restart the Luna manager pod (for example, kubectl -n elotl rollout restart deployment/luna-manager) so that Luna assigns the new security group to newly provisioned nodes. Existing Luna-managed nodes will not have their security groups updated; they must be replaced to receive the new security group assignment.