Luna Configuration
Labels
In order for Luna Manager to manage a pod's scheduling, the following label must be applied to the pod:
metadata:
  labels:
    elotl-luna: "true"
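For example, a complete pod manifest carrying this label might look as follows (the pod name and image are illustrative):
apiVersion: v1
kind: Pod
metadata:
  name: my-app        # illustrative name
  labels:
    elotl-luna: "true"
spec:
  containers:
  - name: app
    image: nginx      # illustrative image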
Instance family inclusion and exclusion
To instruct Luna to avoid starting nodes from a given instance family for a given workload, add the following annotation to the pod:
metadata:
  annotations:
    node.elotl.co/instance-family-exclusions: "t3,t3a"
This will prevent Luna from starting any t3.* or t3a.* instance type for this pod.
To instruct Luna to only use nodes from specific instance families for a given workload, add the following annotation to the pod:
metadata:
  annotations:
    node.elotl.co/instance-family-inclusions: "c6g,c6gd,c6gn,g5g"
This will restrict Luna to choosing among the c6g.*, c6gd.*, c6gn.*, and g5g.* instance types for this pod.
Pods with either of these annotations will be bin-selected, regardless of the pod’s resource requirements.
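Since Luna looks for these annotations on the pod itself, workloads managed by a controller such as a Deployment must carry them in the pod template, not in the controller's top-level metadata. A minimal sketch (names and image are illustrative):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
        elotl-luna: "true"
      annotations:
        node.elotl.co/instance-family-exclusions: "t3,t3a"
    spec:
      containers:
      - name: app
        image: nginx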
GPU SKU annotation
To instruct Luna to start an instance with a specific graphics card:
metadata:
  annotations:
    node.elotl.co/instance-gpu-skus: "v100"
This will start a node with a V100 GPU card.
Each pod with this annotation will be bin-selected, regardless of the pod’s resource requirements.
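For instance, a pod that requests a GPU and pins the GPU SKU might look like this (a sketch; the nvidia.com/gpu resource name assumes the NVIDIA device plugin is deployed):
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job               # illustrative name
  labels:
    elotl-luna: "true"
  annotations:
    node.elotl.co/instance-gpu-skus: "v100"
spec:
  containers:
  - name: trainer
    image: my-training-image  # illustrative image
    resources:
      limits:
        nvidia.com/gpu: 1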
Advanced configuration via Helm Values
This is a list of the configuration options for Luna. These values can be passed to Helm when deploying Luna.
The keys and values are passed to the deploy script as follows:
./deploy.sh <cluster-name> <cluster-region> \
  --set binSelectPodCpuThreshold=3.0 \
  --set binSelectPodMemoryThreshold=2G \
  --set binPackingNodeCpu=3250m \
  --set binPackingNodeMemory=7Gi \
  --set binPackingNodeMinPodCount=42 \
  --set binPackingNodeTypeRegexp='^t3a.*$' \
  --set labels='key1=value1,key2=value2'
These configuration options can be modified in the configuration map elotl-luna located in the namespace where the Luna manager runs. Once the configuration map has been modified, the Luna manager and its admission webhook must be restarted for the new configuration to take effect.
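For example, assuming Luna runs in the elotl namespace as in the restart commands below, the configuration map can be edited in place with:
$ kubectl -n elotl edit configmap elotl-luna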
$ kubectl -n elotl rollout restart deploy/elotl-luna-manager
...
$ kubectl -n elotl rollout restart deploy/elotl-luna-webhook
...
labels
Specify the labels that Luna will use to match the pods to consider.
labels is a list of comma-separated key=value pairs, e.g. key1=value1\,key2=value2 (the comma is escaped with a backslash for Helm's --set). Pods with any of these labels will be considered by Luna. The default value is elotl-luna=true.
--set labels='key1=value1\,key2=value2'
loopPeriod
How often the Luna main loop runs; 10 seconds by default. Increasing this value eases the load on the Kubernetes control plane, while lowering it increases that load.
--set loopPeriod=20s
daemonSetSelector
Selects, by label, the daemon sets that will run on the Luna nodes. It is empty by default.
For example, if you wish to run nodes with a GPU attached, you will have to select the GPU driver daemon set.
--set daemonSetSelector=name=nvidia-device-plugin-ds
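For this selector to match, the daemon set must carry the corresponding label. A minimal sketch of the relevant metadata, assuming the NVIDIA device plugin manifest uses the name: nvidia-device-plugin-ds label:
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: nvidia-device-plugin
  labels:
    name: nvidia-device-plugin-ds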
daemonSetExclude
List of daemon sets to exclude from the daemon sets selected by daemonSetSelector. It is empty by default.
newPodScaleUpDelay
Minimum age a pod must reach before it is considered for scaling up nodes. It is set to 10 seconds by default.
Because pod creation may be scattered over time, it isn't desirable for Luna to react immediately to pod creation. Lowering this delay may result in less efficient packing, while increasing it will delay node creation and increase the mean time to placement of pods.
--set newPodScaleUpDelay=5s
includeArmInstance
Whether to consider Arm instance types. It is set to false by default.
If this option is enabled, all the images of the pods run by Luna must support both the AMD64 and ARM64 architectures. Otherwise pod creation may fail.
placeBoundPVC
Whether to consider pods with a bound PVC. It is set to false by default.
reuseBinSelectNodes
Whether to reuse nodes for similar bin-select placed pods. It is set to true by default.
prometheusListenPort
The port number on which the Luna manager and webhook expose their Prometheus metrics. It is 9090 by default.
clusterGPULimit
The maximum number of GPUs to run in the cluster. It is set to 10 by default. If the GPU count in the cluster reaches this number, Luna will stop scaling up GPU nodes.
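For example, to allow up to 20 GPUs in the cluster:
--set clusterGPULimit=20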
nodeTags
Tags to add to the cloud instances. It is a list of comma separated key value pairs: key1=value1,key2=value2. It is empty by default.
This is useful for cleaning up stale nodes:
--set nodeTags=key1=value1,key2=value2
loggingVerbosity
How verbose Luna manager and webhook are. It is set to 2 by default.
0: critical, 1: important, 2: informational, 3: debug
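For example, to enable debug logging:
--set loggingVerbosity=3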
Bin-selection
Bin-selection means running the pod on a dedicated node.
When a pod's requirements are high enough, Luna provisions a dedicated node to run it. Luna uses the pod's requirements to determine the node's optimal configuration, adds a new node to the cluster, and runs the pod on it. If the pod's CPU requirement is above binSelectPodCpuThreshold and/or its memory requirement is above binSelectPodMemoryThreshold, the pod will be bin-selected and run on a dedicated node.
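As an illustration, with binSelectPodCpuThreshold=3.0 as in the deploy example above, a pod requesting 4 CPUs would be bin-selected onto a dedicated node (a sketch; names and image are illustrative):
apiVersion: v1
kind: Pod
metadata:
  name: big-pod
  labels:
    elotl-luna: "true"
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:
        cpu: "4"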
Bin-packing
Bin-packing means running the pod with other pods on a shared node.
binPackingNodeCpu and binPackingNodeMemory let you configure the shared nodes' size. If you have an instance type in mind, set these parameters slightly below the node type you are targeting, to take into account the kubelet and kube-proxy overhead. For example, if you would like nodes with 8 vCPUs and 32 GB of memory, set binPackingNodeCpu to "7.5" and binPackingNodeMemory to "28G".
If a pod's requirements are too large for the configured bin-packing nodes, an over-sized node will be provisioned to handle it. For example, if the configured bin-packing nodes typically have 1 vCPU and a bin-packed pod needs 1.5 vCPU, Luna will provision a node with 1.5 vCPU to accommodate this pod. This only happens when the bin-selection thresholds are above the bin-packing requirements.
Each node type can only run a limited number of pods. binPackingNodeMinPodCount lets you request a node that can support a minimum number of pods.
binPackingNodeTypeRegexp allows you to limit the instance types that will be considered. For example, if you would only like to run instances from the "t3a" family in AWS, you would set: binPackingNodeTypeRegexp='^t3a\..*$'
AWS
This section details AWS-specific configuration options.
Custom AMIs
NOTE: All custom AMIs must include the EKS node bootstrap script at /etc/eks/bootstrap.sh. Otherwise nodes will not join the cluster.
You can tell Luna to use a specific AMI via the Helm values:
- aws.amiIdGeneric for x86-64 nodes
- aws.amiIdGenericArm for Arm64 nodes
- aws.amiIdGpu for x86-64 nodes with GPU
Each of these configuration options accepts an AMI ID. If the AMI doesn't exist or is not accessible, Luna will log an error and fall back on the latest generic EKS images.
Set these custom AMI IDs via helm values like this:
--set aws.amiIdGeneric=ami-1234567890
--set aws.amiIdGenericArm=ami-1234567890
--set aws.amiIdGpu=ami-1234567890