
Capacity-based Scheduling

Overview

Using its policy-based scheduling mechanism, Nova can schedule a workload based on resource availability in a workload cluster. This means that a single Kubernetes resource or a group of Kubernetes resources will be placed on a target cluster that has sufficient capacity to host it.

If no target workload cluster has sufficient capacity for the requested placement, Nova chooses a target workload cluster that is running a cluster autoscaler, which can add cluster resources as needed. Nova currently recognizes the Kubernetes and the Luna cluster autoscalers. Workloads that may be handled by Luna need to be labeled for Luna management. For the simple case in which each Nova workload cluster runs at most one instance of Luna configured to use the default Luna management label, the Nova control plane supports the nova-scheduler option --luna-management-enabled (included by default) to automatically add the default Luna management label to incoming object pod specifications. For all other Luna configurations (e.g., non-default Luna management label or multiple instances of Luna per cluster), the workloads should be labeled for Luna management prior to being submitted for Nova placement.
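
For the non-default configurations just mentioned, labeling a workload for Luna management is done in the pod template. The snippet below is only a sketch: the label key and value (elotl-luna: "true") are assumptions based on Luna's commonly documented default, so substitute the management label your Luna instance is actually configured to use.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                  # hypothetical workload name
spec:
  ...
  template:
    metadata:
      labels:
        app: my-app
        elotl-luna: "true"      # assumed Luna management label; verify against your Luna configuration
    spec:
      containers:
      - name: my-app
        image: nginx:1.25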

Group Scheduling Based on Resource Availability Testing Example

In this example we will see how Nova groups Kubernetes objects into a ScheduleGroup and finds a workload cluster for the whole group. Let's say you have a group of microservices that together form an application. We will create two versions of the same application: one set of microservices labeled color: blue and the same set labeled color: green. By adding .groupBy.labelKey to the SchedulePolicy spec, Nova will create two ScheduleGroups: one containing all objects with the color: blue label and another containing all objects with the color: green label.

apiVersion: policy.elotl.co/v1alpha1
kind: SchedulePolicy
metadata:
  ...
spec:
  groupBy:
    labelKey: color
  ...

Each group will be considered separately by Nova, and all objects in a group are guaranteed to run in the same workload cluster. In this tutorial, we will let Nova figure out which workload cluster has enough resources to host each group. This is done by not setting .spec.clusterSelector. We will use the GCP Microservices Demo app, which includes 10 different microservices. The total resources requested by this app are 1570 millicores of CPU and 1368 Mi of memory. The manifests used in this tutorial can be found in the examples directory of the try-nova repository. If you installed Nova from the release tarball, you already have those manifests.
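
If you want to check up front whether a workload cluster can fit roughly 1570m of CPU and 1368Mi of memory, one simple way is to inspect the per-node allocated resources in the underlying cluster. Replace the placeholder context below with your workload cluster's kube context (for example, one of the K8S_CLUSTER_CONTEXT_* values exported below); the grep pattern is just one way to slice the output.

kubectl --context=<workload-cluster-context> describe nodes | grep -A 8 "Allocated resources"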

We will first export these environment variables so that subsequent steps in this tutorial can be easily followed.

export NOVA_NAMESPACE=elotl
export NOVA_CONTROLPLANE_CONTEXT=nova
export NOVA_WORKLOAD_CLUSTER_1=wlc-1
export NOVA_WORKLOAD_CLUSTER_2=wlc-2

Export this additional environment variable if you installed Nova using the tarball. You can optionally replace the value k8s-cluster-hosting-cp with the context name of your Nova hosting cluster.

export K8S_HOSTING_CLUSTER_CONTEXT=k8s-cluster-hosting-cp

Alternatively export these environment variables if you installed Nova using setup scripts provided in the try-nova repository.

export K8S_HOSTING_CLUSTER_CONTEXT=kind-hosting-cluster
export K8S_CLUSTER_CONTEXT_1=kind-wlc-1
export K8S_CLUSTER_CONTEXT_2=kind-wlc-2

Environment variable names with the prefix NOVA_ refer to Nova's Cluster custom resources. Context names with the prefix K8S_ refer to the underlying Kubernetes clusters.
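
Before continuing, you can confirm that the contexts you exported exist in your kubeconfig (the exact list depends on which installation path you followed):

kubectl config get-contexts ${NOVA_CONTROLPLANE_CONTEXT} ${K8S_HOSTING_CLUSTER_CONTEXT}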

In this tutorial we assume that your Nova Control Plane kubeconfig context is named nova.

Optional

If you have chosen to rename your Nova Control Plane context, then in order to follow this tutorial either replace "--context=${NOVA_CONTROLPLANE_CONTEXT}" in each command with your Nova Control Plane kube context name, or rename that context using the following command:

kubectl config rename-context [yourname] nova

Let's start with checking the status of the workload clusters connected to the Nova Control Plane:

kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get clusters
NAME    K8S-VERSION   K8S-CLUSTER   REGION        ZONE            READY   IDLE   STANDBY
wlc-1   1.32          nova-wlc-1    us-central1   us-central1-f   True    True   False
wlc-2   1.32          nova-wlc-2    us-central1   us-central1-c   True    True   False

NOTE: If your workload clusters are not named wlc-1 and wlc-2, please open examples/sample-group-scheduling/nginx-group-demo-ns.yaml in a text editor and edit the names according to the instructions in that file.

NOTE: On GKE, for example, this tutorial requires a node pool with 3 e2-medium nodes in each workload cluster.
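
For illustration, such a node pool could be created with gcloud roughly as follows; the cluster name, node pool name, and zone are placeholders taken from the sample output above, so adjust them to your environment:

gcloud container node-pools create demo-pool \
  --cluster=nova-wlc-1 \
  --zone=us-central1-f \
  --machine-type=e2-medium \
  --num-nodes=3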

  1. We first create the namespace that will be used in this example.
envsubst < "examples/sample-group-scheduling/nginx-group-demo-ns.yaml" > "./nginx-group-demo-ns.yaml"
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} apply -f ./nginx-group-demo-ns.yaml
rm -f ./nginx-group-demo-ns.yaml

Ensure that the namespace nginx-group-demo is created in your control plane and both workload clusters.
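
One way to verify this is to query each context for the namespace; the loop below assumes the kind-based contexts from the setup scripts, so substitute your own workload cluster contexts if they differ:

for ctx in ${NOVA_CONTROLPLANE_CONTEXT} ${K8S_CLUSTER_CONTEXT_1} ${K8S_CLUSTER_CONTEXT_2}; do
  kubectl --context=$ctx get namespace nginx-group-demo
done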

  2. We now create a SchedulePolicy that applies to all objects with the label nginxGroupScheduleDemo: "yes". The policy also groups Kubernetes objects based on the value of their "color" label. Each group will be scheduled to a cluster that has enough resources to run it.
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} apply -f examples/sample-group-scheduling/policy.yaml

We check that the policy was created successfully:

kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get schedulepolicies
NAME          AGE
demo-policy   4s
  3. Now, let's create green and blue instances of our app:
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} apply -f examples/sample-group-scheduling/blue-app.yaml -n nginx-group-demo
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} apply -f examples/sample-group-scheduling/green-app.yaml -n nginx-group-demo
  4. Let's describe the blue-nginx-deployment and view its events.

This will allow us to get visibility into the SchedulePolicy that will be used to place this object as well as info on the ScheduleGroup that the object has been placed into.

kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} describe deployment blue-nginx-deployment -n nginx-group-demo
Name:                   blue-nginx-deployment
Namespace:              nginx-group-demo
CreationTimestamp:      Mon, 04 Aug 2025 10:47:45 -0700
Labels:                 app.kubernetes.io/instance=blue
                        app.kubernetes.io/managed-by=kubernetes
                        app.kubernetes.io/name=nginx
                        app.kubernetes.io/part-of=blue
                        app.kubernetes.io/version=1.7.9
                        color=blue
                        nginxGroupScheduleDemo=yes
                        nova.elotl.co/target-cluster=wlc-1
Annotations:            <none>
Selector:               app=nginx,color=blue

... SNIP ...

Conditions:
  Type           Status  Reason
  ----           ------  ------
  Available      True    MinimumReplicasAvailable
  Progressing    True    NewReplicaSetAvailable
Events:
  Type    Reason                 Age  From            Message
  ----    ------                 ---  ----            -------
  Normal  AddedToScheduleGroup   60m  nova-scheduler  added to ScheduleGroup demo-policy-8f03f2ed which contains objects with groupBy.labelKey color=blue
  Normal  SchedulePolicyMatched  60m  nova-scheduler  schedule policy demo-policy was used to determine target cluster, for scheduling obj, blue-nginx-deployment
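
Because Nova records the placement decision in the nova.elotl.co/target-cluster label (visible in the Labels section above), you can also list both deployments with that label shown as a column:

kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get deployments -n nginx-group-demo -L color,nova.elotl.co/target-cluster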
  5. You can check if two ScheduleGroups were created:
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get schedulegroups
NAME                   AGE
demo-policy-1be06c9f   9m17s
demo-policy-69894944   6m15s
  6. The novactl CLI provides a bit more context about ScheduleGroups:
kubectl nova --context=${NOVA_CONTROLPLANE_CONTEXT} get schedulegroups
  NAME                   NOVA WORKLOAD CLUSTER   NOVA POLICY NAME
  ---------------------  ----------------------  ----------------
  demo-policy-1be06c9f   wlc-1                   demo-policy
  demo-policy-69894944   wlc-2                   demo-policy
  ---------------------  ----------------------  ----------------

From the output above, we can see which workload cluster is hosting each ScheduleGroup.
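
The same information is carried on the ScheduleGroup objects themselves as the nova.elotl.co/target-cluster label, so plain kubectl can show it as well:

kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get schedulegroups -L nova.elotl.co/target-cluster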

  7. Let's say we need to increase the resource (CPU or memory) request or the replica count of one of the microservices in the second app, the green deployment. Meanwhile, other activity in the wlc-2 cluster has left it without enough capacity to satisfy the green deployment's increased resource request.

Nova will handle this situation by migrating the workload to another workload cluster that has sufficient resources. Let's watch this in action.

We simulate this scenario using the CPU hog pod defined in the examples/sample-group-scheduling/hog-pod.yaml manifest. We need to edit this manifest so that the hog pod will consume almost all of the resources in the cluster.
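
For reference, a CPU hog pod might look roughly like the sketch below; the actual examples/sample-group-scheduling/hog-pod.yaml may differ, and the request values here are placeholders that you should size to consume nearly all of the cluster's remaining allocatable CPU and memory:

apiVersion: v1
kind: Pod
metadata:
  name: hog-pod
  namespace: default
spec:
  containers:
  - name: hog
    image: nginx:1.25           # any small image works; the resource requests do the "hogging"
    resources:
      requests:
        cpu: "5"                # placeholder: set close to the cluster's remaining allocatable CPU
        memory: "4Gi"           # placeholder: set close to the cluster's remaining allocatable memory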

Next, we create the hog pod in the same cluster where the demo-policy-69894944 schedule group (the green deployment) was scheduled (${NOVA_WORKLOAD_CLUSTER_2}).

envsubst < "examples/sample-group-scheduling/hog-pod.yaml" > "./hog-pod.yaml"
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} apply -f ./hog-pod.yaml
cluster_name=$(kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get deployment -n nginx-group-demo green-nginx-deployment -o jsonpath='{.metadata.labels.nova\.elotl\.co/target-cluster}')
kubectl annotate pods --context=${NOVA_CONTROLPLANE_CONTEXT} -n default hog-pod nova.elotl.co/cluster=$cluster_name --overwrite
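
To confirm the hog pod landed in the intended cluster, look it up in the underlying cluster backing ${NOVA_WORKLOAD_CLUSTER_2} (the context name below assumes the kind-based setup scripts):

kubectl --context=${K8S_CLUSTER_CONTEXT_2} get pod hog-pod -n default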
  8. Now let's increase the replica count of the green-nginx-deployment in the Nova control plane:
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} scale deploy/green-nginx-deployment --replicas=3 -n nginx-group-demo
  9. If there are enough resources to satisfy the new schedule group requirements (the existing resource requests of the two nginx apps plus the increased replica count of the green deployment), watching the schedule groups will show the group being rescheduled to another cluster (in this example, from wlc-2 to wlc-1):
kubectl nova --context=${NOVA_CONTROLPLANE_CONTEXT} get schedulegroups
  NAME                   NOVA WORKLOAD CLUSTER   NOVA POLICY NAME
  ---------------------  ----------------------  ----------------
  demo-policy-1be06c9f   wlc-1                   demo-policy
  demo-policy-69894944   wlc-1                   demo-policy
  ---------------------  ----------------------  ----------------

NOTE: On GKE, for example, for this rescheduling to work you need to increase the number of nodes to 6 in the workload cluster other than the one the green deployment was initially scheduled to (wlc-1 here). Otherwise, rescheduling will not happen due to insufficient resources.
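
For illustration, the resize can be done with gcloud roughly as follows; the cluster name, node pool name, and zone are placeholders to adapt to your environment:

gcloud container clusters resize nova-wlc-1 \
  --node-pool=demo-pool \
  --zone=us-central1-f \
  --num-nodes=6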

  10. We can wait for the green app to become available:
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} wait --for condition=available deployment -n nginx-group-demo green-nginx-deployment --timeout=180s
  11. To understand why the ScheduleGroup was rescheduled, we can use:
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} describe schedulegroups

and see the event message:

Name:         demo-policy-69894944
Namespace:
Labels:       color=green
              nova.elotl.co/matching-policy=demo-policy
              nova.elotl.co/target-cluster=wlc-1
Annotations:  <none>
API Version:  policy.elotl.co/v1alpha1
Kind:         ScheduleGroup
...
Events:
  Type     Reason                                Age                 From            Message
  ----     ------                                ---                 ----            -------
  Normal   ScheduleGroupSyncedToWorkloadCluster  22m (x5 over 122m)  nova-scheduler  Multiple clusters matching policy demo-policy (empty cluster selector): wlc-2,wlc-1; Picked cluster wlc-2 because it has enough resources;
  Warning  ReschedulingTriggered                 3m12s               nova-agent      deployment nginx-group-demo/green-nginx-deployment does not have minimum replicas available
  Normal   ScheduleGroupSyncedToWorkloadCluster  108s                nova-scheduler  Multiple clusters matching policy demo-policy (empty cluster selector): wlc-2,wlc-1; Cluster wlc-2 skipped, does not have enough resources; Picked cluster wlc-1 because it has enough resources;

  12. We can verify that the green app is running by listing deployments in the Nova Control Plane:
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get deployments -n nginx-group-demo -l color=green
NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
green-nginx-deployment   3/3     3            3           24m
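
You can also check directly in the underlying cluster now hosting the group (wlc-1 in this walkthrough) that the pods are running; the context name assumes the kind-based setup scripts:

kubectl --context=${K8S_CLUSTER_CONTEXT_1} get pods -n nginx-group-demo -l color=green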

Undeploy workloads

To remove all objects created for this demo, delete the nginx-group-demo namespace, the SchedulePolicy, and the hog pod:

kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} delete ns nginx-group-demo
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} delete -f examples/sample-group-scheduling/policy.yaml
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} delete -f ./hog-pod.yaml
rm -f ./hog-pod.yaml