Version: v1.1

Luna Configuration

Pod configuration

In order for Luna Manager to manage a pod's scheduling, the pod configuration must include a label or annotation that matches Luna's configured pod designation setting. By default, Luna's setting specifies that the following label is applied:

metadata:
  labels:
    elotl-luna: "true"

You can change the list of labels Luna will consider with the labels Helm value:

--set labels='key1=value1\,key2=value2'

You can change the list of annotations Luna will consider with the podAnnotations Helm value:

--set podAnnotations='key1=value1\,key2=value2'

To prevent Luna from matching a given pod, annotate it with pod.elotl.co/ignore: true.

Instance family configuration

Bin selection

To avoid a given instance family, annotate the pod like this:

metadata:
  annotations:
    node.elotl.co/instance-family-exclusions: "t3,t3a"

In the example above Luna won’t start any t3 or t3a instance type for the pod.

To use a given instance family, annotate the pod like this:

metadata:
  annotations:
    node.elotl.co/instance-family-inclusions: "c6g,c6gd,c6gn,g5g"

In the example above Luna will choose an instance type from the c6g, c6gd, c6gn, or g5g instance families for the pod.

To specify the instance type, you can use a regular expression. For instance, if you'd like the instance type to be r6a.xlarge, annotate the pod like this:

metadata:
  annotations:
    node.elotl.co/instance-type-regexp: '^r6a\.xlarge$'

In the example above, Luna will only consider the r6a.xlarge instance type.

You can combine the instance-type and instance-family annotations like this:

metadata:
  annotations:
    node.elotl.co/instance-type-regexp: '^.*\.xlarge$'
    node.elotl.co/instance-family-exclusions: "r6a"

In the example above, Luna will exclusively consider instance types ending with ".xlarge" and exclude types from the r6a family.

If any of these annotations are present, Luna will schedule the pods on nodes that fulfill all these constraints as well as the resource requirements of the pods. However, if the instance type constraints and the pod's resource requirements are incompatible, no node will be added and the pod will be stuck in the pending state.
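The way these annotations combine can be sketched in a few lines of Python. This is only an illustration, not Luna's implementation: the function names and the candidate list are hypothetical, and the sketch assumes AWS-style type names where the family is the part before the first dot.

```python
import re


def family_of(instance_type: str) -> str:
    """Return the family prefix of a type name, e.g. 'r6a' for 'r6a.xlarge'."""
    return instance_type.split(".", 1)[0]


def filter_instance_types(candidates, annotations):
    """Keep only the candidates that satisfy every constraint annotation present."""
    include = annotations.get("node.elotl.co/instance-family-inclusions")
    exclude = annotations.get("node.elotl.co/instance-family-exclusions")
    pattern = annotations.get("node.elotl.co/instance-type-regexp")

    included = set(include.split(",")) if include else None
    excluded = set(exclude.split(",")) if exclude else set()
    regexp = re.compile(pattern) if pattern else None

    kept = []
    for name in candidates:
        family = family_of(name)
        if included is not None and family not in included:
            continue  # family is not in the inclusion list
        if family in excluded:
            continue  # family is explicitly excluded
        if regexp is not None and not regexp.search(name):
            continue  # type name does not match the regular expression
        kept.append(name)
    return kept


# Hypothetical candidate list, for illustration only.
candidates = ["r6a.xlarge", "m5.xlarge", "m5.large", "t3.xlarge"]
annotations = {
    "node.elotl.co/instance-type-regexp": r"^.*\.xlarge$",
    "node.elotl.co/instance-family-exclusions": "r6a",
}
print(filter_instance_types(candidates, annotations))  # ['m5.xlarge', 't3.xlarge']
```

An empty result corresponds to the case described above: no node can be added and the pod stays pending.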

Bin packing

Bin packing instance family and type can be configured via the global option binPackingNodeTypeRegexp. Only the instances matching the regular expression will be considered.

For example if you would like to use t3a nodes in AWS, you would set: binPackingNodeTypeRegexp='^t3a\..*$'.

Removal of Under-utilized nodes and pod eviction

Luna removes nodes it deems under-utilized. A node is under-utilized if the pods running on it require less than 20% of the node's CPU or memory for bin packing, or less than 75% of the node's CPU or memory for bin selection. If a node stays under-utilized for longer than scaleDown.nodeUnneededDuration (5 minutes by default), Luna evicts the pods running on it and then removes the node.

To avoid eviction of a pod running on an under-utilized node by Luna, the pod must be annotated with pod.elotl.co/do-not-evict: true as shown below:

apiVersion: v1
kind: Pod
metadata:
  name: my-pod
  annotations:
    pod.elotl.co/do-not-evict: "true"
spec:
  ...

The annotation cluster-autoscaler.kubernetes.io/safe-to-evict: false is also supported.

GPU SKU annotation

To instruct Luna to start an instance with a specific graphics card:

metadata:
  annotations:
    node.elotl.co/instance-gpu-skus: "v100"

This will start a node with a V100 GPU card.

Note: Each pod with this annotation will be bin-selected, regardless of the pod’s resource requirements.

Advanced configuration via Helm Values

This is a list of the configuration options for Luna. These values can be passed to Helm when deploying Luna.

The keys and values are passed to the deploy script as follows:

./deploy.sh <cluster-name> <cluster-region> \
--set binSelectPodCpuThreshold=3.0 \
--set binSelectPodMemoryThreshold=2Gi \
--set binSelectPodGPUThreshold=1 \
--set binPackingNodeCpu=3250m \
--set binPackingNodeMemory=7Gi \
--set binPackingNodeMinPodCount=42 \
--set binPackingNodeTypeRegexp='^t3a.*$' \
--set binPackingNodePricing='spot\,on-demand' \
--set labels='key1=value1\,key2=value2'

These configuration options can be modified in the configuration map elotl-luna located in the namespace where Luna manager runs. Once the configuration map has been modified, Luna manager and its admission webhook must be restarted for the new configuration to take effect.

$ kubectl -n elotl rollout restart deploy/elotl-luna-manager
...
$ kubectl -n elotl rollout restart deploy/elotl-luna-webhook
...

labels

Specify the labels that Luna will use to match the pods to consider.

labels is a list of comma separated key value pairs: key1=value1\,key2=value2; pods with any of the labels will be considered by Luna. The default value is elotl-luna=true.

--set labels='key1=value1\,key2=value2'

podAnnotations

Specify the annotations that Luna will use to match the pods to consider.

Similar to labels, podAnnotations is a list of comma separated key value pairs: key1=value1\,key2=value2; pods with any of the annotations will be considered by Luna. podAnnotations is empty by default.

--set podAnnotations='key1=value1\,key2=value2'

pod.elotl.co/ignore: true

This annotation instructs Luna to ignore a given pod even if it matches labels or podAnnotations.
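The matching rules for labels, podAnnotations, and pod.elotl.co/ignore can be summarized with a small Python sketch. It is a reading of the text above, not Luna's code; the function names and the "team=ml" pair are made up for illustration.

```python
def parse_pairs(value: str) -> dict:
    """Parse 'key1=value1,key2=value2' into a dict; an empty string gives {}."""
    pairs = {}
    for item in value.split(","):
        if item:
            key, _, val = item.partition("=")
            pairs[key] = val
    return pairs


def luna_considers(pod_labels, pod_annotations,
                   labels="elotl-luna=true", pod_annotations_setting=""):
    """Return True if a pod would be considered under the rules described above."""
    # The ignore annotation wins over any label or annotation match.
    if pod_annotations.get("pod.elotl.co/ignore") == "true":
        return False
    wanted_labels = parse_pairs(labels)
    wanted_annotations = parse_pairs(pod_annotations_setting)
    # A single matching label or annotation is enough.
    label_hit = any(pod_labels.get(k) == v for k, v in wanted_labels.items())
    annotation_hit = any(pod_annotations.get(k) == v
                         for k, v in wanted_annotations.items())
    return label_hit or annotation_hit


print(luna_considers({"elotl-luna": "true"}, {}))                               # True
print(luna_considers({"elotl-luna": "true"}, {"pod.elotl.co/ignore": "true"}))  # False
print(luna_considers({}, {"team": "ml"}, pod_annotations_setting="team=ml"))    # True
```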

loopPeriod

How often the Luna main loop runs. It is set to 10 seconds by default. Increasing this value eases the load on the Kubernetes control plane; lowering it increases that load.

--set loopPeriod=20s

daemonSetSelector

Select the labels from daemon sets that will run on the Luna nodes. It is empty by default.

For example if you wish to run a node with a GPU attached you will have to select the GPU driver daemon set.

--set daemonSetSelector=name=nvidia-device-plugin-ds

daemonSetExclude

Comma-separated list of names of daemon sets to exclude from those Luna assumes may be active on newly added nodes. It is empty by default.

Use this option to prevent Luna from reserving resources for daemon sets that you do not expect to be active on new nodes. For example, the following could be used for Luna on a GKE cluster for which you only plan to use --logging-variant=DEFAULT.

--set daemonSetExclude="fluentbit-gke-256pd\,fluentbit-gke-max\,gke-metrics-agent-scaling-500"

newPodScaleUpDelay

Minimum age a pod must reach before Luna considers it for scaling up nodes. It is set to 10 seconds by default.

Because pod creation may be scattered, it isn’t desirable for Luna to immediately react to pod creation. Lowering this delay may result in less efficient packing, while increasing it will delay the creation of the nodes and increase the mean time to placement of pods.

--set newPodScaleUpDelay=5s

scaleUpTimeout

Time to allow for the new node to be added and the pending pod to be scheduled before considering the scale up operation expired and subject to retry. It is set to 10 minutes by default. This value can be tuned for the target cloud.

includeArmInstance

Whether to consider Arm instance types. It is set to false by default.

If this option is enabled, all the images of the pods run by Luna must support both the AMD64 and ARM64 architectures. Otherwise, pod creation may fail.

placeBoundPVC

Whether to consider pods with bound PVC. It is set to false by default.

placeNodeSelector

Whether to consider pods with existing node selector(s). It is set to false by default. When set to true, a pod's existing node selector(s) must be satisfiable by the Luna and pod settings; otherwise, Luna may allocate a node that cannot be used by the pod.

namespacesExclude

List of comma-separated names of namespaces whose pods should be excluded from Luna management. It is set to kube-system only by default. For example, to run with no namespace restrictions on Luna management, use:

--set namespacesExclude={}

To add the namespace test to the exclusion list specify:

--set namespacesExclude='{kube-system,test}'

Note that if the kube-system namespace is not part of the namespacesExclude list, Luna can spin up additional nodes for kube-system pods marked for Luna placement that are in the Pending state for too long.

reuseBinSelectNodes

Whether to reuse nodes for similar bin-select placed pods. It is set to true by default.

skipIgnoredPods

Whether to add a node selector to pods not labeled for placement by Luna or to skip adding a node selector to such pods. It is set to false by default.

By default, the Luna webhook sets a node selector for each non-daemonset pod placement request it examines. If a pod is labeled for placement by Luna, its node selector is set to point to a Luna-created node. If a pod is not labeled for placement by Luna, its node selector is set to exclude any Luna-created node; the latter setting is skipped if skipIgnoredPods is set true.

prometheusListenPort

The port number on which Luna manager and webhook will expose their Prometheus metrics. It is 9090 by default.

clusterGPULimit: 10

The maximum number of GPUs to run in the cluster. It is set to 10 by default.

Once the GPU count in the cluster reaches this limit, Luna stops scaling up GPU nodes.

nodeTags

Tags to add to the cloud instances. It is a list of comma separated key value pairs: key1=value1,key2=value2. It is empty by default.

This is useful to clean-up stale nodes:

--set nodeTags='key1=value1\,key2=value2'

nodeTaints

To add taints to the nodes created by Luna, use the nodeTaints configuration option:

--set nodeTaints='{key1=value1:NoSchedule,key2=value2:NoExecute}'
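Each entry in the example above follows a key=value:effect shape. The following Python sketch shows how one such entry breaks down into the fields of a Kubernetes taint; parse_taint is a hypothetical helper for illustration, not part of Luna.

```python
def parse_taint(spec: str) -> dict:
    """Split 'key=value:effect' into the three fields of a Kubernetes taint."""
    key_value, _, effect = spec.rpartition(":")
    key, _, value = key_value.partition("=")
    return {"key": key, "value": value, "effect": effect}


for spec in ["key1=value1:NoSchedule", "key2=value2:NoExecute"]:
    print(parse_taint(spec))
```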

loggingVerbosity

How verbose Luna manager and webhook are. It is set to 2 by default.

0 critical, 1 important, 2 informational, 3 debug

scaleDown.nodeUnneededDuration

If a node remains idle for longer than nodeUnneededDuration, Luna manager will scale it down. Default: 5m.

--set scaleDown.nodeUnneededDuration=1m

scaleDown.skipNodeWithSystemPods

Determines whether to skip nodes running pods from the kube-system namespace. Daemonset pods are never considered by Luna; this only applies to deployment pods. Default: false.

scaleDown.skipNodesWithLocalStorage

When true, Luna manager will never scale down nodes with local storage attached to a pod. Default: true.

scaleDown.skipEvictDaemonSetPods

When true, Luna manager will skip evicting daemonset pods from nodes removed for scale down. Default: false.

scaleDown.minReplicaCount

The minimum replica count ensures that the specified number of replicas are always available during node scale-down. Default: 0.

scaleDown.binPackNodeUtilizationThreshold

Defines the utilization threshold to scale down bin-packed nodes, ranging from 0.0 (0% utilization) to 1.0 (100% utilization). Default: 0.1 (10%).

scaleDown.minNodeCountPerZone

For clusters supporting zone spread (currently only EKS clusters and GKE regional clusters), indicates the minimum number of nodes (0 or 1) that Luna should keep running per zone in target pools into which zone spread pods may be placed. This minimum is maintained even when no normal (not daemonset or mirror) Luna pods are currently running in the pool. Default: 0. Note that EKS does not support setting this value to 1.

In general, Luna keeps a minimum of 1 node per zone in node pools that may be used for zone spread, so that kube-scheduler can see all the zones in its target node set and make the desired zone spread choices. Setting scaleDown.minNodeCountPerZone to 1 maintains that minimum even when the pool's count of normal (not daemonset or mirror) Luna pods is 0. This avoids a possible race in which kube-scheduler sees zone-spread pods arrive for scheduling after some, but not all, of a node pool's per-zone nodes have scaled down.

scaleDown.nodeTTL

When > 0, enables Luna support for node time-to-live. When scaleDown.nodeTTL is set to a non-zero value, it must be set to a value greater than or equal to scaleUpTimeout. If scaleDown.nodeTTL is less than scaleUpTimeout, Luna will set it to scaleUpTimeout internally and will emit a warning in the logs. Default: 0m (time-to-live unlimited).

When scaleDown.nodeTTL is set to a non-zero value, Luna uses the value as a time-to-live for its allocated nodes; Luna cordons, drains, and terminates its allocated nodes once they have been running longer than the specified scaleDown.nodeTTL time.
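The relationship between scaleDown.nodeTTL and scaleUpTimeout can be written down as a short Python sketch. The function names are hypothetical and the code only mirrors the rule stated above; it says nothing about how Luna implements it.

```python
from datetime import timedelta


def effective_node_ttl(node_ttl: timedelta, scale_up_timeout: timedelta) -> timedelta:
    """Return the TTL that applies under the rule described above."""
    if node_ttl <= timedelta(0):
        return timedelta(0)  # 0 means time-to-live is unlimited
    # A non-zero TTL below scaleUpTimeout is raised to scaleUpTimeout.
    return max(node_ttl, scale_up_timeout)


def ttl_expired(node_age: timedelta, node_ttl: timedelta,
                scale_up_timeout: timedelta) -> bool:
    """True once a node has been running longer than its effective TTL."""
    ttl = effective_node_ttl(node_ttl, scale_up_timeout)
    return ttl > timedelta(0) and node_age > ttl


timeout = timedelta(minutes=10)  # the documented scaleUpTimeout default
print(effective_node_ttl(timedelta(minutes=5), timeout))               # 0:10:00
print(ttl_expired(timedelta(hours=25), timedelta(hours=24), timeout))  # True
```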

If a nodeTTL-expired node contains any pods with do-not-evict annotations (i.e., pod.elotl.co/do-not-evict:true or cluster-autoscaler.kubernetes.io/safe-to-evict:false), Luna supports the node's graceful termination by cordoning it, draining its non-kube-system non-daemonset pods except the do-not-evict pods, and then adding the configurable annotation scaleDown.drainedAnnotation to it. An external controller monitoring nodes for that annotation can perform eviction-related operations with respect to the do-not-evict pods and then remove their do-not-evict annotation. Once a nodeTTL-expired node contains no do-not-evict pods, Luna terminates the node.

scaleDown.managedNodeDelete

Set true to enable Luna support for graceful termination of nodes that are externally-deleted (e.g., "kubectl delete node/node-name"). Default: true.

When scaleDown.managedNodeDelete is set to true, Luna adds a finalizer to its allocated nodes, allowing Luna to detect external deletion operations on those nodes. When Luna detects external deletion of an allocated node, if that node contains any do-not-evict pods, Luna performs the graceful termination steps outlined in scaleDown.nodeTTL. Once an externally-deleted Luna-allocated node contains no do-not-evict pods, Luna removes its finalizer, which unblocks the K8s node deletion, and deletes the node from the cloud.

Note that if scaleDown.managedNodeDelete is set, the deletion of Luna-allocated nodes requires the removal of the Luna finalizer; hence, if Luna is disabled with some of its allocated nodes remaining and you later want to remove those nodes, you will need to manually remove the finalizer.

scaleDown.drainedAnnotation

Annotation used during graceful node termination; see scaleDown.nodeTTL or scaleDown.managedNodeDelete. Default: key: node.elotl.co/drained; value: true.

Pod retry

Luna cannot guarantee that a pod will run on one of its nodes; the node and the pod have to be properly configured. If a pod is still in the pending state once the requested node is online, Luna will retry after a configurable delay, up to a configurable number of times.

How pod retry works:

  1. A new pod is created, the Luna webhook matches it, and a new node is provisioned by Luna manager.
  2. Luna manager waits until the node comes online or scaleUpTimeout has passed, whichever happens first.
  3. Once the node is online or the request has timed out, Luna waits for podRetryPeriod and then checks the pod’s status.
  4. If the pod is still in the pending state, there are two cases:
    1. If the pod has been retried fewer than maxPodRetries times, the annotation pod.elotl.co/retry-count is added to the pod or incremented, and the pod is retried after podRetryPeriod.
    2. If the pod has been retried maxPodRetries times, the annotation pod.elotl.co/ignore: true is added to the pod. Luna then ignores the pod until the annotation is removed.
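The two cases in the last step can be sketched in Python as follows. This is a simplified model of the bookkeeping described in the list, not Luna's implementation: handle_pending_pod is a hypothetical name, and the pod is reduced to a plain dictionary of annotations.

```python
IGNORE = "pod.elotl.co/ignore"
RETRY_COUNT = "pod.elotl.co/retry-count"


def handle_pending_pod(annotations: dict, max_pod_retries: int = 3) -> str:
    """Update a still-pending pod's annotations in place and return the action."""
    retries = int(annotations.get(RETRY_COUNT, "0"))
    if retries < max_pod_retries:
        # Case 1: add or increment the retry counter and try again later.
        annotations[RETRY_COUNT] = str(retries + 1)
        return "retry"
    # Case 2: the retry budget is used up, so the pod is marked as ignored.
    annotations[IGNORE] = "true"
    return "ignore"


annotations = {}
actions = [handle_pending_pod(annotations) for _ in range(4)]
print(actions)      # ['retry', 'retry', 'retry', 'ignore']
print(annotations)  # {'pod.elotl.co/retry-count': '3', 'pod.elotl.co/ignore': 'true'}
```

With the documented default of 3 for maxPodRetries, the fourth check of a still-pending pod adds the ignore annotation.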

maxPodRetries

Sets the maximum retry attempts for a pod. Each retry increments the annotation pod.elotl.co/retry-count on the pod. Once this limit is exceeded, the pod is annotated with pod.elotl.co/ignore: true, indicating Luna should ignore the pod until the annotation is removed.

Default: 3

podRetryPeriod

Determines the delay before Luna retries deploying a pod that remains in the pending state, even after its node is available. This period must allow adequate time for Kubernetes to schedule the pod, otherwise Luna may create unnecessary node(s) temporarily.

Default: 5 minutes

Bin-selection

Bin-selection means running the pod on a dedicated node.

When a pod’s requirements are high enough, Luna provisions a dedicated node to run it: Luna uses the pod’s requirements to determine the node’s optimal configuration, adds a new node to the cluster, and runs the pod on it. A pod is bin-selected and run on a dedicated node if its CPU requirement is at or above binSelectPodCpuThreshold, its memory requirement is at or above binSelectPodMemoryThreshold, or its GPU requirement is at or above binSelectPodGPUThreshold.
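The threshold rule can be expressed as a short Python sketch. The function name is hypothetical, resources are simplified to plain numbers (CPU cores, GiB of memory, GPU count), and the threshold values are the ones from the deploy.sh example earlier on this page rather than known defaults.

```python
def placement_mode(cpu: float, memory_gib: float, gpu: int,
                   cpu_threshold: float, memory_threshold_gib: float,
                   gpu_threshold: int) -> str:
    """Return 'bin-select' if any requirement is at or above its threshold."""
    if (cpu >= cpu_threshold
            or memory_gib >= memory_threshold_gib
            or gpu >= gpu_threshold):
        return "bin-select"
    return "bin-pack"


# Thresholds taken from the deploy.sh example: 3.0 CPU, 2Gi memory, 1 GPU.
thresholds = dict(cpu_threshold=3.0, memory_threshold_gib=2.0, gpu_threshold=1)
print(placement_mode(0.5, 1.0, 0, **thresholds))  # bin-pack
print(placement_mode(4.0, 1.0, 0, **thresholds))  # bin-select
print(placement_mode(0.5, 1.0, 1, **thresholds))  # bin-select
```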

Bin-packing

Bin-packing means running the pod with other pods on a shared node.

binPackingNodeCpu, binPackingNodeMemory, and binPackingNodeGPU let you configure the shared nodes’ requirements. If you have an instance type in mind, set these parameters slightly below the node type you are targeting, to take into account the kubelet and kube-proxy overhead. For example, if you would like to have non-GPU nodes with 8 vCPUs and 32 GB of memory, set binPackingNodeCpu to "7.5" and binPackingNodeMemory to "28G".

If a pod’s requirements are too high for the bin-packing nodes, an over-sized node will be provisioned to handle this pod. For example, if the configured bin-packing nodes typically have 1 vCPU and a bin-packed pod needs 1.5 vCPU, Luna will provision a node with 1.5 vCPU to accommodate this pod. This will only happen when the bin selection thresholds are above the bin packing requirements.

Each node type can only run a limited number of pods. binPackingNodeMinPodCount lets you request a node that can support a minimum number of pods.

binPackingNodeTypeRegexp allows you to limit the instances that will be considered. For example if you would only like to run instances from "t3a" family in AWS you would do: binPackingNodeTypeRegexp='^t3a\..*$'

binPackingNodePricing allows you to indicate the price offerings category for the instances that will be considered. For example if you would only like to run instances from the "spot" category you would do: binPackingNodePricing='spot'

binPackingMinimumNodeCount allows you to specify the minimum number of bin packed nodes. The nodes will be started immediately and will stay online even if no pods are running on them.

Luna’s own deployment and pod configuration

Annotations, tolerations, and affinity

Use the Helm value annotations to add custom annotations to Luna manager and webhook deployments:

$ helm install ... --set annotations.foo=bar --set annotations.hello=word ...

To add custom tolerations to Luna’s own pods use the configuration option tolerations.

The tolerations specification is rather complex, therefore we recommend you define it in a Helm values file and pass its filename with the -f or --values options:

$ cat tolerations.yaml
tolerations:
  - key: "foo"
    value: "bar"
    operator: "Equal"
    effect: "NoSchedule"
$ helm install ... --values tolerations.yaml ...

To add custom affinity to Luna’s own pods use the configuration option affinity.

The affinity specification is rather complex, therefore we recommend you define it in a Helm values file and pass its filename with the -f or --values options:

$ cat affinity.yaml
# Helm values
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: topology.kubernetes.io/zone
              operator: In
              values:
                - us-central1-f
$ helm install ... --values affinity.yaml ...

Note that setting the affinity parameter will override the default affinity, which prevents Luna pods from running on Luna managed nodes:

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: node.elotl.co/managed-by
              operator: DoesNotExist

Add this snippet to your own affinity definition to prevent Luna pods from running on Luna managed nodes.

Webhook port

You can change the port of the mutation webhook with the webhookPort configuration option:

$ helm install ... --set webhookPort=8999 ...

AWS

This section details AWS specific configuration options.

Custom AMIs

NOTE: All custom AMIs must include the EKS nodes’ bootstrap script at /etc/eks/bootstrap.sh. Otherwise, nodes will not join the cluster.

You can tell Luna to use a specific AMI via the Helm values:

  1. aws.amiIdGeneric for x86-64 nodes
  2. aws.amiIdGenericArm for Arm64 nodes
  3. aws.amiIdGpu for x86-64 nodes with GPU

Each of these configuration options accepts an AMI ID. If the AMI doesn’t exist or is not accessible, Luna will log an error and fall back to the latest generic EKS images.

Set these custom AMI IDs via helm values like this:

--set aws.amiIdGeneric=ami-1234567890
--set aws.amiIdGenericArm=ami-1234567890
--set aws.amiIdGpu=ami-1234567890

Custom AMIs with SSM

Amazon offers various EKS image families like Amazon Linux, Ubuntu, and BottleRocket. Luna can use AWS SSM to fetch the most up to date image from its store.

For Amazon Linux, you can get the latest EKS image for Kubernetes 1.27 on arm64 nodes at /aws/service/eks/optimized-ami/1.27/amazon-linux-2-arm64/recommended/image_id.

To configure an SSM query for each image type, use aws.imageSsmQueryGeneric, aws.imageSsmQueryGenericArm, and aws.imageSsmQueryGpu. Each of these parameters may include exactly one "%s" marker, which is replaced with the Kubernetes version.

For example here’s how to use BottleRocket images:

--set aws.imageSsmQueryGeneric="/aws/service/bottlerocket/aws-k8s-%s/x86_64/latest/image_id"
--set aws.imageSsmQueryGenericArm="/aws/service/bottlerocket/aws-k8s-%s/arm64/latest/image_id"
--set aws.imageSsmQueryGpu="/aws/service/bottlerocket/aws-k8s-%s-nvidia/x86_64/latest/image_id"

To use Ubuntu:

--set aws.imageSsmQueryGeneric="/aws/service/canonical/ubuntu/eks/20.04/%s/stable/current/amd64/hvm/ebs-gp2/ami-id"
--set aws.imageSsmQueryGenericArm="/aws/service/canonical/ubuntu/eks/20.04/%s/stable/current/arm64/hvm/ebs-gp2/ami-id"
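To make the role of the "%s" marker concrete, the following Python sketch substitutes a Kubernetes version into one of the query templates shown above. resolve_ssm_query is a hypothetical helper, and "1.27" is just the version used in the Amazon Linux example; Luna's own substitution logic may differ.

```python
def resolve_ssm_query(template: str, kubernetes_version: str) -> str:
    """Substitute the Kubernetes version for the single '%s' marker, if present."""
    if template.count("%s") > 1:
        raise ValueError("at most one '%s' marker is supported")
    return template.replace("%s", kubernetes_version)


template = "/aws/service/bottlerocket/aws-k8s-%s/x86_64/latest/image_id"
print(resolve_ssm_query(template, "1.27"))
# /aws/service/bottlerocket/aws-k8s-1.27/x86_64/latest/image_id
```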

Block device mappings

To customize disk settings for your EKS nodes, use the aws.blockDeviceMappings option. Configure it with JSON with a format like this:

[
  {
    "DeviceName": "/dev/xvda",
    "Ebs": {
      "DeleteOnTermination": true,
      "VolumeSize": 42,
      "VolumeType": "gp2",
      "Encrypted": false
    }
  }
]

Use Helm’s --set-string, --set-json or --set-file options to set aws.blockDeviceMappings and avoid --set since it mangles its input.

For example:

$ cat block_device_mapping.json
[
  {
    "DeviceName": "/dev/xvda",
    "Ebs": {
      "DeleteOnTermination": true,
      "VolumeSize": 42,
      "VolumeType": "gp2",
      "Encrypted": false
    }
  }
]
$ helm ... --set-file aws.blockDeviceMappings=block_device_mapping.json

Max Pods per Node

When aws.maxPodsPerNode is 0 (the default), Luna uses the ENI-limited max pods per node value calculated as specified by AWS, and does not explicitly set it on Luna-allocated nodes. When aws.maxPodsPerNode is greater than 0, Luna uses and explicitly sets the specified value on Luna-allocated nodes.

Bin Packing Zone Spread

When aws.binPackingZoneSpread is true (default false), Luna supports placement of bin packing pods that specify zone spread. To support bin packing zone spread, Luna keeps at least one bin packing node running in each zone associated with the EKS cluster as long as there are any Luna bin packing pods running.

User data

userData allows you to define a script to be executed after nodes have been bootstrapped.

For example specifying --set-string aws.userData="echo hello > /tmp/hello" will create a file named /tmp/hello with hello in it on the node once the EKS bootstrap script has completed.

If you have a large script we recommend you use the --set-file Helm option to load it:

$ cat myscript.sh
apt-get install my-package
$ ./deploy.sh ... --additional-helm-values "--set-file aws.userData=myscript.sh"

It is empty by default.

IMDS Metadata

metaData defines the instance metadata for EKS nodes. It is a JSON document conforming to this specification.

Example:

{
  "HttpEndpoint": "enabled",
  "HttpProtocolIpv6": "disabled",
  "HttpPutResponseHopLimit": 42,
  "HttpTokens": "required",
  "InstanceMetadataTagsState": "enabled"
}

Default: Empty.

Use --set-string or --set-file with Helm to set the instance metadata; --set will mangle the input.

GCP

This section details GCP specific configuration options.

Image Type

By default, Luna allows GCP to select the image type for nodes Luna adds to the cluster. The option gcp.imageType can be used to instead have Luna specify the image type for its added nodes. GCP's default image type and its valid image type values are available via the following command:

gcloud container get-server-config

For example, if you would like to set the image type for Luna nodes to UBUNTU_CONTAINERD, do this:

--set gcp.imageType=UBUNTU_CONTAINERD

Disk Type

gcp.diskType specifies the type of disk to use on the nodes. The available options are: pd-standard, pd-ssd, or pd-balanced. By default, this parameter is empty, and when left empty, the disk type pd-balanced is used.

To set a specific disk type, use the following command:

--set gcp.diskType=pd-ssd

It’s important to note that not all instance types are compatible with the pd-standard disk type. If Luna selects C3 or G2 machine series and gcp.diskType is set to pd-standard, the node creation process will fail.

Node Service Account

By default, node VMs access Google Cloud Platform using the default service account. The option gcp.nodeServiceAccount can be set to the email address of an alternative service account to be used by the Luna-allocated node VMs.

For example, if you would like to set an alternative Google Cloud Platform service account to be used by the Luna-allocated node VMs, do this:

--set gcp.nodeServiceAccount=myemail@myproject.iam.gserviceaccount.com

Max Pods per Node

When gcp.maxPodsPerNode is 0 (the default), Luna sets the GCP default value of 110 on Luna-allocated nodes. When gcp.maxPodsPerNode is greater than 0, Luna sets the specified value (capped at 256, which is the GCP limit) on Luna-allocated nodes.
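The rule for gcp.maxPodsPerNode amounts to a default plus a cap, as this Python sketch shows. The constant and function names are hypothetical; the values 110 and 256 come from the paragraph above.

```python
GCP_DEFAULT_MAX_PODS = 110  # GCP default, used when the option is 0
GCP_MAX_PODS_LIMIT = 256    # GCP upper limit


def gcp_max_pods(configured: int) -> int:
    """Return the max pods per node that would be set for a given option value."""
    if configured <= 0:
        return GCP_DEFAULT_MAX_PODS
    return min(configured, GCP_MAX_PODS_LIMIT)


print(gcp_max_pods(0))    # 110
print(gcp_max_pods(64))   # 64
print(gcp_max_pods(500))  # 256
```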

Network Tags

gcp.networkTags specifies the Network tags to add to the nodes. gcp.networkTags is a list of strings.

--set gcp.networkTags[0]=tag-value
--set gcp.networkTags[1]=other-tag-value

Empty by default.

GCE Instance Metadata

gcp.gceInstanceMetadata specifies the metadata to add to the GCE instance backing the Kubernetes node. gcp.gceInstanceMetadata is a dictionary.

--set gcp.gceInstanceMetadata.key1=value1
--set gcp.gceInstanceMetadata.key2=value2

Empty by default.

Bin Packing Zone Spread

When gcp.binPackingZoneSpread is true (default is false) on a regional GKE cluster, Luna supports placement of bin packing pods that specify zone spread. When this feature is enabled, Luna ensures there is a minimum of one bin packing node in each zone as long as there are bin packing pods running, giving kube-scheduler visibility into all zones.

Node Management: auto-upgrade and auto-repair

gcp.autoUpgrade and gcp.autoRepair define the node management services for the node pools. Both are true by default.

See the GKE documentation for NodeManagement for more information.

To disable auto-upgrade and auto-repair pass the following Helm values:

--set gcp.autoUpgrade=false
--set gcp.autoRepair=false

Note that to disable node auto-upgrade on node pools, the cluster must be configured to use a static version instead of release channels. Otherwise, node creation will fail.

Shielded Instance Configuration: secure boot and integrity monitoring

gcp.enableSecureBoot and gcp.enableIntegrityMonitoring configure the options controlling secure boot and integrity monitoring on the node pools. Both are false by default.

See the GKE documentation for ShieldedInstanceConfig for more information.

To enable secure boot and integrity monitoring pass the following Helm values:

--set gcp.enableSecureBoot=true
--set gcp.enableIntegrityMonitoring=true

Node Version

gcp.version specifies the version of Kubernetes to run on the Luna managed nodes. It is empty by default. When the version is not specified each node will be started with the same version of Kubernetes running on the control plane.

To get the list of available versions you can run the following command:

gcloud container get-server-config --format="yaml(validNodeVersions)"

Consequently, if the control plane is updated while gcp.version is not specified, any existing node pools running older versions will no longer scale up. Instead, new node pools with the updated version will be created.

Ensure that the gcp.version you select is compatible with your cluster. Incompatibility will prevent Luna from successfully provisioning the nodes.

OAuth Scopes

gcp.oauthScopes specifies the set of Google API scopes to be made available on all of the node VMs under the "default" service account. It’s an array of strings. It is empty by default.

The specified scopes will be added to the built-in scopes. Built-in scopes are cluster type dependent. See the Google Kubernetes Engine documentation about OAuth Scopes to learn more.

For example to allow nodes to mount persistent storage and communicate with gcr.io add the following Helm values:

--set gcp.oauthScopes[0]=https://www.googleapis.com/auth/compute
--set gcp.oauthScopes[1]=https://www.googleapis.com/auth/devstorage.read_only

If you want to reset the gcp.oauthScopes parameter after it has been set, you have a few options:

  • Use --set gcp.oauthScopes=null during upgrades
  • Use --set-json 'gcp.oauthScopes=[]' during upgrades
  • Set the parameter to an empty array in the values file

Azure

This section details Azure specific configuration options.

Pod Subnet for Dynamic Azure CNI Networking

You can indicate the pod subnet to be used by Dynamic Azure CNI networking for the bin packing node via azure.binPackingNodePodSubnet.

For example, if you would like your bin packing instances to use podsubnet1, do this:

--set azure.binPackingAKSPodSubnet=podsubnet1

Ephemeral OS Disk

Set azure.useEphemeralOsDisk to have Luna use the ephemeral OS disk type when bin packing or bin selection chooses a node instance type that supports it and whose cache size is at least 30 GB (the minimum OS disk size for AKS). When Luna uses the ephemeral OS disk type, it explicitly sets the OS disk size to the node instance type's cache size.

If azure.useEphemeralOsDisk is not set to true or if the node instance type Luna chooses does not support the ephemeral OS disk type or have a large enough cache, Luna will use the default OS disk type (managed).

To use this option, do this:

--set azure.useEphemeralOsDisk=true

Enable Node Public IP

You can indicate that Luna should enable AKS assignment of a public IP to the nodes it allocates via azure.enableNodePublicIP. The option is false by default.

To use this option, do this:

--set azure.enableNodePublicIP=true

OCI

This section details OCI specific configuration options.

Max Pods per Node

This option only applies to OKE clusters that use OCI_VCN_IP_NATIVE networking, and indicates how Luna should set max pods per node on nodes it allocates. If oci.maxPodsPerNode is 0 (default), Luna sets max pods per node to the maximum supported by compute shape vNICs. If oci.maxPodsPerNode is greater than 0, Luna sets max pods per node to min(maxPodsPerNode, maximum supported by compute shape vNICs).
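The rule above reduces to a single min() once the shape's limit is known. The Python sketch below assumes that limit is available as a number; oci_max_pods and the example limit of 31 are hypothetical and used only for illustration.

```python
def oci_max_pods(configured: int, shape_vnic_max: int) -> int:
    """Return max pods per node given the option value and the shape's vNIC limit."""
    if configured <= 0:
        return shape_vnic_max  # 0 (default): use the shape's maximum
    return min(configured, shape_vnic_max)


# 31 is a made-up per-shape limit, not a value from OCI documentation.
print(oci_max_pods(0, 31))   # 31
print(oci_max_pods(10, 31))  # 10
print(oci_max_pods(50, 31))  # 31
```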