Luna Configuration
Labels, Annotations, and Exclusions
Labels
For Luna Manager to manage a pod’s scheduling, the following label must be applied to the pod:
metadata:
  labels:
    elotl-luna: "true"
Instance family exclusion(s)
It’s possible to instruct Luna to avoid starting nodes from a given instance family for a given workload. Luna looks for the following annotation on the pod:
metadata:
  annotations:
    node.elotl.co/instance-family-exclusions: "t3,t3a"
This will prevent Luna from starting any t3.* or t3a.* instance type for this pod.
Instance family inclusion(s)
It’s also possible to instruct Luna to use only nodes from specific instance families for a given workload. Luna looks for the following annotation on the pod:
metadata:
  annotations:
    node.elotl.co/instance-family-inclusions: "c6g,c6gd,c6gn,g5g"
This will restrict Luna to choosing from c6g.*, c6gd.*, c6gn.*, and g5g.* instance types for this pod.
Pods with either of these annotations will be bin-selected, regardless of the pod’s resource requirements.
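The effect of the two annotations can be pictured as a filter over candidate instance types. The sketch below is illustrative only and is not Luna’s implementation: the helper names and the candidate list are hypothetical, and it assumes the family is the part of the instance type name before the first dot, as in AWS naming.

```python
EXCLUSIONS = "node.elotl.co/instance-family-exclusions"
INCLUSIONS = "node.elotl.co/instance-family-inclusions"


def parse_families(value):
    """Split a comma-separated annotation value into a set of family names."""
    return {f.strip() for f in value.split(",") if f.strip()}


def allowed_instance_types(annotations, candidates):
    """Return the candidates permitted by the pod's family annotations.

    Hypothetical helper: assumes the family is the text before the first dot.
    """
    excluded = parse_families(annotations.get(EXCLUSIONS, ""))
    included = parse_families(annotations.get(INCLUSIONS, ""))
    result = []
    for instance_type in candidates:
        family = instance_type.split(".", 1)[0]
        if family in excluded:
            continue  # family explicitly ruled out
        if included and family not in included:
            continue  # an inclusion list exists and this family is not on it
        result.append(instance_type)
    return result


# Made-up candidate list for illustration.
candidates = ["t3.medium", "t3a.large", "c6g.xlarge", "m5.large"]
print(allowed_instance_types({EXCLUSIONS: "t3,t3a"}, candidates))
print(allowed_instance_types({INCLUSIONS: "c6g,c6gd,c6gn,g5g"}, candidates))
```

With the exclusion annotation, only the t3 and t3a entries are dropped; with the inclusion annotation, only the c6g entry remains.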
AWS Fargate
If you would like Luna to run your workload on Fargate nodes, please include the following annotation in your configuration so that Luna can identify your preference:
metadata:
  annotations:
    node.elotl.co/instance-offerings: "fargate"
Please note that this feature is specific to AWS. Additionally, during the Luna deployment process, you must indicate that you would like to use Fargate in order to utilize this feature.
GPU SKU annotation
It’s possible to instruct Luna to start an instance with a specific GPU model by adding the following annotation to the pod:
metadata:
  annotations:
    node.elotl.co/instance-gpu-skus: "v100"
This will start a node with a V100 GPU card.
Each pod with this annotation will be bin-selected, regardless of the pod’s resource requirements.
Additional Helm Values
In most cases these do not need to be changed.
labels is a list of comma-separated key-value pairs: key1=value1\,key2=value2. Pods with any of these labels will be considered by Luna. The default value is elotl-luna=true.
binSelectPodCpuThreshold and binSelectPodMemoryThreshold are parameters controlling bin-selection (see below).
binPackingNodeCpu, binPackingNodeMemory, binPackingNodeMinPodCount, and binPackingNodeTypeRegexp are parameters controlling bin-packing (see below).
clusterGPULimit is an integer that specifies the GPU limit of the cluster. If the GPU count in the cluster reaches this number, Luna will stop scaling up GPU nodes.
The keys and values are passed to the deploy script as follows:
./deploy.sh <cluster-name> <cluster-region> \
--set binSelectPodCpuThreshold=3.0 \
--set binSelectPodMemoryThreshold=2G \
--set binPackingNodeCpu=3250m \
--set binPackingNodeMemory=7Gi \
--set binPackingNodeMinPodCount=42 \
--set binPackingNodeTypeRegexp='^t3a.*$' \
--set labels='key1=value1\,key2=value2'
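Helm’s --set flag treats an unescaped comma as a separator between assignments, which is why the labels value is written with backslash-escaped commas (key1=value1\,key2=value2). The sketch below builds such a value from a dictionary. The helper names are hypothetical, and it assumes deploy.sh forwards --set flags to Helm unchanged.

```python
def helm_labels_value(labels):
    """Join label pairs into one --set value, escaping the separating commas."""
    pairs = [f"{key}={value}" for key, value in labels.items()]
    return "\\,".join(pairs)


def set_argument(labels):
    """Build the --set argument, single-quoted so the shell keeps the backslash."""
    return f"--set labels='{helm_labels_value(labels)}'"


print(set_argument({"key1": "value1", "key2": "value2"}))
# prints: --set labels='key1=value1\,key2=value2'
```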
Bin-selection
Bin-selection means running the pod on a dedicated node.
When a pod’s requirements are high enough, Luna provisions a dedicated node to run it. Luna uses the pod’s requirements to determine the node’s optimal configuration, adds a new node to the cluster, and runs the pod on it. If the pod’s CPU requirement is above binSelectPodCpuThreshold and/or its memory requirement is above binSelectPodMemoryThreshold, the pod will be bin-selected and run on a dedicated node.
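The threshold rule can be written as a small predicate. This is a minimal sketch, not Luna’s code: the function name is hypothetical, CPU is expressed in cores and memory in bytes to avoid parsing Kubernetes quantities, and treating a request exactly equal to a threshold as not bin-selected is an assumption based on the word "above". The annotation list reflects the statements earlier in this page that pods carrying those annotations are always bin-selected.

```python
# Annotations that this page says force bin-selection regardless of resources.
FORCING_ANNOTATIONS = (
    "node.elotl.co/instance-family-exclusions",
    "node.elotl.co/instance-family-inclusions",
    "node.elotl.co/instance-gpu-skus",
)


def is_bin_selected(cpu_request, memory_request,
                    cpu_threshold, memory_threshold, annotations=None):
    """Return True if the pod would get a dedicated node.

    Hypothetical helper: CPU values are in cores, memory values in bytes.
    """
    annotations = annotations or {}
    if any(key in annotations for key in FORCING_ANNOTATIONS):
        return True
    # Either requirement exceeding its threshold is enough.
    return cpu_request > cpu_threshold or memory_request > memory_threshold


# Thresholds from the deploy example: 3.0 cores and 2G (2 * 10**9 bytes).
CPU_T, MEM_T = 3.0, 2 * 10**9
print(is_bin_selected(4.0, 1 * 10**9, CPU_T, MEM_T))    # True: CPU above threshold
print(is_bin_selected(0.5, 512 * 10**6, CPU_T, MEM_T))  # False: both below
print(is_bin_selected(0.5, 512 * 10**6, CPU_T, MEM_T,
                      {"node.elotl.co/instance-gpu-skus": "v100"}))  # True
```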
Bin-packing
Bin-packing means running the pod with other pods on a shared node.
binPackingNodeCpu and binPackingNodeMemory let you configure the resources of the shared nodes. If you have an instance type in mind, set these parameters slightly below the node type you are targeting, to account for the kubelet and kube-proxy overhead. For example, if you would like to have nodes with 8 vCPUs and 32 GB of memory, set binPackingNodeCpu to "7.5" and binPackingNodeMemory to "28G".
If a pod’s requirements exceed what the bin-packing nodes provide, an oversized node will be provisioned to handle this pod. For example, if the configured bin-packing nodes typically have 1 vCPU and a bin-packed pod needs 1.5 vCPUs, Luna will provision a node with 1.5 vCPUs to accommodate this pod. This only happens when the bin-selection thresholds are above the bin-packing node requirements.
Each node type can only run a limited number of pods. binPackingNodeMinPodCount lets you request a node that can support a minimum number of pods.
binPackingNodeTypeRegexp allows you to limit the instance types that will be considered. For example, if you would like to run only instances from the "t3a" family in AWS, you would set: binPackingNodeTypeRegexp='^t3a\..*$'
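To check which instance type names a pattern admits before deploying, it can be tried against a few names locally. The snippet below uses Python’s re module purely for illustration. The regular expression engine Luna uses is not stated here, although this pattern relies only on syntax common to most engines. The list of instance types is a made-up sample.

```python
import re

# The pattern from the example above: "t3a", a literal dot, then anything.
pattern = re.compile(r"^t3a\..*$")

# Made-up sample of instance type names.
instance_types = ["t3.medium", "t3a.small", "t3a.2xlarge", "m5.large"]

matching = [name for name in instance_types if pattern.match(name)]
print(matching)  # ['t3a.small', 't3a.2xlarge']
```

The escaped dot keeps the match anchored to the family name, so t3.medium is rejected even though it shares the "t3" prefix.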