Version: v0.7.1

Capacity-based Scheduling

Overview

Using its policy-based scheduling mechanism, Nova can schedule a workload based on resource availability in a workload cluster. A single Kubernetes resource, or a group of Kubernetes resources, is placed on a target cluster that has sufficient capacity to host it.

Group Scheduling Based on Resource Availability Testing Example

In this example we will see how Nova groups Kubernetes objects into a ScheduleGroup and finds a workload cluster for the whole group. Say you have a group of microservices that together make up an application. We will create two versions of the same application: one set of microservices labeled color: blue and the same set of microservices labeled color: green. When .groupBy.labelKey is added to the SchedulePolicy spec, Nova creates two ScheduleGroups: one containing all objects labeled color: blue and another containing all objects labeled color: green.

apiVersion: policy.elotl.co/v1alpha1
kind: SchedulePolicy
metadata:
  ...
spec:
  groupBy:
    labelKey: color
  ...

Each group is considered separately by Nova, and all objects in a group are guaranteed to run in the same workload cluster. In this tutorial, we let Nova figure out which workload cluster has enough resources to host each group. This is done by not setting .spec.clusterSelector. We will use the GCP Microservice Demo App, which includes 10 different microservices. The total resources requested by this app are 1570 millicores of CPU and 1368 Mi of memory. The manifests used in this tutorial can be found in the examples directory of the try-nova repository. If you installed Nova from the release tarball, you should already have them.
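To make the idea of "enough resources" concrete, here is a minimal shell sketch of a capacity check. It is a simplified illustration, not Nova's actual scheduling logic: the function name group_fits and the free-capacity numbers are invented for this example, and only the 1570 millicores and 1368 Mi request figures come from the text above.

```shell
# Hypothetical helper (not part of Nova): succeeds if the requested CPU
# (millicores) and memory (Mi) both fit into the free capacity of a cluster.
group_fits() {
  req_cpu_m="$1"; req_mem_mi="$2"; free_cpu_m="$3"; free_mem_mi="$4"
  [ "$req_cpu_m" -le "$free_cpu_m" ] && [ "$req_mem_mi" -le "$free_mem_mi" ]
}

# One copy of the demo app requests 1570m CPU and 1368Mi memory.
# The free-capacity values below are made up for illustration.
if group_fits 1570 1368 2000 4096; then echo "fits"; else echo "does not fit"; fi
if group_fits 1570 1368 1000 4096; then echo "fits"; else echo "does not fit"; fi
```

The first check succeeds because both requests are below the free capacity. The second fails on CPU alone, which is the kind of situation in which a group would have to be placed on a different cluster.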

In this tutorial we assume that your Nova Control Plane kube config context is named nova. To follow this tutorial, either replace "--context=nova" in each command with your Nova Control Plane kube context name, or rename your context using the following command:

kubectl config rename-context [yourname] nova
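If you are not sure which context names exist in your kubeconfig, a small helper like the one below can check for the nova context before you run the rest of the tutorial. This is only a sketch: the function name context_exists is made up for illustration, and it relies on the standard kubectl config get-contexts -o name command.

```shell
# Hypothetical helper (not part of Nova): succeeds if a kube context with
# the given name exists in the current kubeconfig.
context_exists() {
  kubectl config get-contexts -o name | grep -qxF -- "$1"
}

# Usage:
#   context_exists nova || echo "rename your Nova Control Plane context first"
```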

Let's start by checking the status of the workload clusters connected to the Nova Control Plane:

kubectl --context=nova get clusters

Alternatively, you can look up the resources available on the cluster nodes using our CLI:

kubectl nova --context=nova get clusters

NAME                    K8S-VERSION   K8S-CLUSTER   REGION   ZONE   READY   IDLE   STANDBY
my-workload-cluster-1   1.22          workload-1                    True    True   False
my-workload-cluster-2   1.22          workload-2                    True    True   False

Then create the namespace that we will use:

NOTE: If your workload clusters are not named kind-workload-1 and kind-workload-2, open examples/sample-group-scheduling/nginx-group-demo-ns.yaml in a text editor and edit the cluster names according to the instructions in the file.

kubectl --context=nova apply -f examples/sample-group-scheduling/nginx-group-demo-ns.yaml

Next, create the schedule policy:

kubectl --context=nova apply -f examples/sample-group-scheduling/policy.yaml

This policy says: for any objects labeled microServicesDemo: "yes", group them by the value of the color label and schedule each group to any cluster that has enough resources to run it.

Now, let's create the green and blue instances of our app:

kubectl --context=nova apply -f examples/sample-group-scheduling/blue-app.yaml -n nginx-group-demo

kubectl --context=nova apply -f examples/sample-group-scheduling/green-app.yaml -n nginx-group-demo

To verify that the objects were assigned to the correct ScheduleGroup, describe an object and look at its events:

kubectl --context=nova describe deployment blue-nginx-deployment -n nginx-group-demo

Name:                   blue-nginx-deployment
Namespace:              nginx-group-demo
CreationTimestamp:      Fri, 11 Aug 2023 14:19:57 -0500
Labels:                 app.kubernetes.io/instance=blue
                        app.kubernetes.io/managed-by=kubernetes
                        app.kubernetes.io/name=nginx
                        app.kubernetes.io/part-of=blue
                        app.kubernetes.io/version=1.7.9
                        color=blue
                        nginxGroupScheduleDemo=yes
                        nova.elotl.co/target-cluster=kind-workload-2
Selector:               app=nginx,color=blue
Events:
Type    Reason                 Age    From            Message
----    ------                 ----   ----            -------
Normal  AddedToScheduleGroup   2m29s  nova-scheduler  added to ScheduleGroup demo-policy-69894944 which contains objects with groupBy.labelKey color=blue
Normal  SchedulePolicyMatched  2m29s  nova-scheduler  schedule policy demo-policy will be used to determine target cluster

You can check if two ScheduleGroups were created:

kubectl --context=nova get schedulegroups

NAME                   AGE
demo-policy-1be06c9f   9m17s
demo-policy-69894944   6m15s

The novactl CLI provides a bit more context about ScheduleGroups:

kubectl nova --context=nova get schedulegroups

NAME                  NOVA WORKLOAD CLUSTER                   NOVA POLICY NAME
--------------------  --------------------------------------  --------------------------------------
demo-policy-1be06c9f  kind-workload-2                         demo-policy
demo-policy-69894944  kind-workload-2                         demo-policy
--------------------  --------------------------------------  --------------------------------------

From the output above, we can see which workload cluster is hosting each ScheduleGroup.
Now, imagine you need to increase the resource request or replica count of one of the microservices in the second app. In the meantime, there was other activity in the cluster, and after your update the cluster will no longer have enough resources to satisfy it. You can simulate this scenario using the examples/sample-group-scheduling/hog-pod.yaml manifest. Edit it so that the hog-pod takes almost all the resources in your cluster, then apply it to the same cluster where the demo-policy-69894944 schedule group was scheduled (kind-workload-2 in this example):

kubectl --context=nova apply -f examples/sample-group-scheduling/hog-pod.yaml
cluster_name=$(kubectl --context=nova get deployment -n nginx-group-demo green-nginx-deployment -o jsonpath='{.metadata.labels.nova\.elotl\.co/target-cluster}')
kubectl annotate pods --context=nova -n default hog-pod "nova.elotl.co/cluster=$cluster_name" --overwrite

Now let's increase the replica count of the green app in the Nova Control Plane:

kubectl --context=nova scale deploy/green-nginx-deployment --replicas=3 -n nginx-group-demo

If another workload cluster has enough resources to satisfy the new schedule group requirements (the existing resource request for 2 nginx plus the increased replica count of the green deployment), the schedule group will be rescheduled to that cluster. You can observe this by listing the schedule groups:

kubectl nova --context=nova get schedulegroups

NAME                  NOVA WORKLOAD CLUSTER                   NOVA POLICY NAME
--------------------  --------------------------------------  --------------------------------------
demo-policy-1be06c9f  kind-workload-1                         demo-policy
demo-policy-69894944  kind-workload-2                         demo-policy
--------------------  --------------------------------------  --------------------------------------
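Rescheduling is not instantaneous, so you may need to repeat the command above a few times. As a rough sketch, the polling can be scripted by reading the nova.elotl.co/target-cluster label that appeared in the describe output of the Deployment earlier. The function name wait_for_reschedule and its arguments are invented for this example, and it assumes Nova updates that label on the Deployment when its group moves to another cluster.

```shell
# Hypothetical helper (not part of Nova): poll the target-cluster label of a
# Deployment until it differs from the cluster it started on.
# Arguments: context, namespace, deployment, old cluster, attempts, delay (s).
wait_for_reschedule() {
  ctx="$1"; ns="$2"; deploy="$3"; old_cluster="$4"
  attempts="${5:-30}"; delay="${6:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    current=$(kubectl --context="$ctx" get deployment -n "$ns" "$deploy" \
      -o jsonpath='{.metadata.labels.nova\.elotl\.co/target-cluster}')
    if [ -n "$current" ] && [ "$current" != "$old_cluster" ]; then
      echo "$current"   # print the new cluster name and stop polling
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1              # label did not change within the allowed attempts
}

# Usage:
#   wait_for_reschedule nova nginx-group-demo green-nginx-deployment kind-workload-2
```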

We can wait for the green app to become available:

kubectl --context=nova wait --for condition=available deployment -n nginx-group-demo green-nginx-deployment --timeout=90s

To understand why the ScheduleGroup was rescheduled, we can use:

kubectl --context=nova describe schedulegroups

and see the event message:

Name:                demo-policy-1be06c9f
Namespace:
Labels:             color=green
                    nova.elotl.co/matching-policy=demo-policy
                    nova.elotl.co/target-cluster=kind-workload-1
Annotations:         <none>
API Version:         policy.elotl.co/v1alpha1
Kind:                ScheduleGroup
...
Events:
Type     Reason                                Age                  From            Message
----     ------                                ----                 ----            -------
Normal   ScheduleGroupSyncedToWorkloadCluster  22m (x5 over 122m)   nova-scheduler  Multiple clusters matching policy demo-policy (empty cluster selector): kind-workload-2,kind-workload-1; Picked cluster kind-workload-2 because it has enough resources;
Warning  ReschedulingTriggered                 3m12s                nova-agent      deployment nginx-group-demo/green-nginx-deployment does not have minimum replicas available
Normal   ScheduleGroupSyncedToWorkloadCluster  108s                 nova-scheduler  Multiple clusters matching policy demo-policy (empty cluster selector): kind-workload-2,kind-workload-1; Cluster kind-workload-2 skipped, does not have enough resources; Picked cluster kind-workload-1 because it has enough resources;
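The target cluster shown in the Labels section above can also be read directly, without scanning the full describe output. The sketch below uses the same jsonpath pattern as the hog-pod step earlier. The function name schedulegroup_cluster is made up for illustration, and the context defaults to nova as assumed throughout this tutorial.

```shell
# Hypothetical helper (not part of Nova): print the workload cluster a
# ScheduleGroup is currently assigned to, read from its target-cluster label.
# Arguments: ScheduleGroup name, optional kube context (defaults to nova).
schedulegroup_cluster() {
  kubectl --context="${2:-nova}" get schedulegroups "$1" \
    -o jsonpath='{.metadata.labels.nova\.elotl\.co/target-cluster}'
}

# Usage:
#   schedulegroup_cluster demo-policy-1be06c9f
```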

You can verify that the green app is running by listing the deployment in the Nova Control Plane:

kubectl --context=nova get deployments -n nginx-group-demo -l color=green

NAME                     READY   UP-TO-DATE   AVAILABLE   AGE
green-nginx-deployment   3/3     3            3           24m

To remove all objects created for this demo, delete the nginx-group-demo namespace, the schedule policy, and the hog-pod:

kubectl --context=nova delete ns nginx-group-demo

kubectl --context=nova delete -f examples/sample-group-scheduling/policy.yaml

kubectl --context=nova delete -f examples/sample-group-scheduling/hog-pod.yaml
