Version: v1.3

Just-in-time Standby Workload Cluster

Functional Overview

Nova optionally supports putting an idle workload cluster into standby state, to reduce resource costs in the cloud. When a standby workload cluster is needed to satisfy a Nova scheduling operation, the cluster is brought out of standby state. Nova can also optionally create additional cloud clusters, cloned from existing workload clusters, to satisfy the needs of policy-based or capacity-based scheduling.

Operational Description

If the environment variable NOVA_IDLE_ENTER_STANDBY_ENABLE is set to true when the Nova control plane is deployed, the Nova JIT Workload Cluster Standby feature is enabled. With the feature enabled, a workload cluster that has been idle for 3600 seconds (override via the environment variable NOVA_IDLE_ENTER_STANDBY_SECS) is placed in standby state. A workload cluster is idle when no resource-consuming Nova-scheduled object is running on it, or when every such object belongs to a namespace listed in the environment variable NOVA_NAMESPACES_EXCLUDE_ACTIVE, which is set during Nova agent installation. When Nova schedules an item to a workload cluster that is in standby state, the cluster is brought out of standby state.
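
The idle rule can be sketched as a small shell function. This is only an illustration of the rule as described here, not Nova's implementation; the function name is_idle and the comma-separated list format are assumptions made for the example.

```shell
# Illustrative only: mirrors the documented idle rule, not Nova's actual code.
# is_idle EXCLUDED ACTIVE
#   EXCLUDED: comma-separated namespaces (the role played by
#             NOVA_NAMESPACES_EXCLUDE_ACTIVE; the format is an assumption)
#   ACTIVE:   comma-separated namespaces that currently hold running,
#             resource-consuming Nova-scheduled objects (may be empty)
# Returns 0 (idle) when every active namespace is in the excluded list.
is_idle() {
  local excluded=",$1," active="$2" ns
  local IFS=','
  for ns in $active; do
    [ -z "$ns" ] && continue
    case "$excluded" in
      *",$ns,"*) ;;        # excluded namespace: does not count as activity
      *) return 1 ;;       # activity in a non-excluded namespace: not idle
    esac
  done
  return 0
}
```

For example, is_idle "monitoring" "monitoring" succeeds (idle), while is_idle "monitoring" "monitoring,default" fails because of the object in the default namespace.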

We will first export these environment variables so that subsequent steps in this tutorial can be easily followed.

export NOVA_NAMESPACE=elotl
export NOVA_CONTROLPLANE_CONTEXT=nova
export NOVA_WORKLOAD_CLUSTER_1=wlc-1
export NOVA_WORKLOAD_CLUSTER_2=wlc-2

Export these additional environment variables if you installed Nova using the tarball.

export K8S_HOSTING_CLUSTER_CONTEXT=k8s-cluster-hosting-cp
export NOVA_WORKLOAD_CLUSTER_1=wlc-1
export NOVA_WORKLOAD_CLUSTER_2=wlc-2

Alternatively, export these environment variables if you installed Nova using the setup scripts provided in the try-nova repository.

export K8S_HOSTING_CLUSTER_CONTEXT=kind-hosting-cluster
export K8S_CLUSTER_CONTEXT_1=kind-wlc-1
export K8S_CLUSTER_CONTEXT_2=kind-wlc-2

You can configure the Nova control plane to enable this feature by using the following patch manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nova-scheduler
  namespace: elotl # make sure this is the namespace used during installation
spec:
  template:
    spec:
      containers:
        - name: scheduler
          env:
            - name: NOVA_IDLE_ENTER_STANDBY_ENABLE
              value: "true"
            - name: NOVA_IDLE_ENTER_STANDBY_SECS
              value: "360"

Save this YAML to scheduler-patch.yaml. Then make sure your kube context is set to the hosting cluster (the cluster where the Nova control plane components are running) and run:

kubectl --context=${K8S_HOSTING_CLUSTER_CONTEXT} patch -n ${NOVA_NAMESPACE} deployment/nova-scheduler --patch-file ./scheduler-patch.yaml

Wait until the Nova scheduler has restarted.
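
Rather than waiting for a fixed time, you can block until the rollout finishes. The following is a minimal sketch; the function name wait_for_scheduler, the overridable KUBECTL variable, and the 180s default timeout are assumptions made for this example.

```shell
# wait_for_scheduler [TIMEOUT]
# Blocks until the patched nova-scheduler deployment has finished rolling out.
# KUBECTL may be overridden (for example with "echo" for a dry run).
wait_for_scheduler() {
  ${KUBECTL:-kubectl} --context="${K8S_HOSTING_CLUSTER_CONTEXT}" \
    -n "${NOVA_NAMESPACE:-elotl}" \
    rollout status deployment/nova-scheduler --timeout="${1:-180s}"
}
```

Calling wait_for_scheduler 300s after the patch command returns once the new scheduler pod is available, or fails when the timeout expires.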

Suspend/Resume Standby Mode

In "suspend/resume" standby mode (default), all node groups/pools in a cluster in standby state are set to node count 0. This setting change causes removal of all cluster resources, except the hidden cloud provider control plane, in ~2 minutes. In standby, the status of all [non-Nova-scheduled] items (including the Nova agent) deployed in the cluster switches to pending. EKS, GKE, and standard-tier AKS clusters in standby state cost $0.10/hour. When the cluster exits standby, the node group/pool node counts are set back to their original values, which had been recorded by Nova in the cluster's custom resource object. This setting change causes the restoration of the cluster resources in ~2 minutes, allowing its pending items (including the Nova agent) to resume running as well as allowing Nova-scheduled items to be placed successfully.
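
To put the quoted $0.10/hour figure in perspective, the following sketch computes the control plane cost of keeping a cluster in suspend/resume standby for a given number of hours. The function name standby_cost_usd is hypothetical, and the calculation covers only the control plane charge mentioned above.

```shell
# standby_cost_usd HOURS [CENTS_PER_HOUR]
# Prints the control plane cost of HOURS spent in suspend/resume standby.
# Defaults to the 10 cents/hour figure quoted above; integer math in cents.
standby_cost_usd() {
  local hours="$1" cents="${2:-10}" total
  total=$(( hours * cents ))
  printf '%d.%02d\n' $(( total / 100 )) $(( total % 100 ))
}
```

For instance, standby_cost_usd 720 prints 72.00, the cost of a 30-day month spent entirely in standby.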

Delete/Recreate Standby Mode

In "delete/recreate" standby mode (enabled by setting the environment variable NOVA_DELETE_CLUSTER_IN_STANDBY to true), a workload cluster in standby state is completely deleted from the cloud, taking ~3-10 minutes. When the cluster exits standby, the cluster is recreated in the cloud, taking ~3-15 minutes, and the Nova agent objects are redeployed. The "delete/recreate" standby mode yields greater cost savings than "suspend/resume", but the latencies to enter and exit standby state are significantly higher. Also note that "delete/recreate" standby mode is only supported on clusters composed of simple node group configurations that can be represented by the Nova Cluster CRD NodeGroupConfig, i.e., by MaxSize, MinSize, DesiredCapacity, InstanceType, NodeGroupType, AccelType, AccelCount. If your clusters contain more complicated node group configurations or if they depend on manually installed third-party software, please use "suspend/resume" standby mode.

With the "create" option (enabled via env var NOVA_CREATE_CLUSTER_IF_NEEDED set true), Nova creates a workload cluster via cloning an existing accessible (i.e., ready or can become ready via exiting standby) cluster to satisfy the needs of policy-based or capacity-based scheduling. The "create" option requires that "delete/recreate" standby mode be enabled, and created clusters can subsequently enter standby state. The number of clusters that Nova will create is limited to 10 (override via env var NOVA_MAX_CREATED_CLUSTERS). Automatic cluster creation depends on the Nova deployment containing a cluster appropriate for cloning, i.e., that there is an existing accessible cluster that satisfies the scheduling policy constraints and resource capacity needs of the placement, but mismatches either the policy's specified cluster name or the placement's needed resource availability.
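
The preconditions for cluster creation described above can be summarized in a short sketch. It is an illustration only: the function name may_create_cluster is hypothetical, and it reads the variables from the local shell, whereas Nova reads them from the control plane deployment.

```shell
# may_create_cluster CREATED_SO_FAR
# Returns 0 when the documented preconditions for cluster creation hold:
#   - NOVA_CREATE_CLUSTER_IF_NEEDED is "true"
#   - NOVA_DELETE_CLUSTER_IN_STANDBY is "true" (required by the "create" option)
#   - fewer than NOVA_MAX_CREATED_CLUSTERS (default 10) clusters were created
may_create_cluster() {
  local created="$1" max="${NOVA_MAX_CREATED_CLUSTERS:-10}"
  [ "${NOVA_CREATE_CLUSTER_IF_NEEDED:-false}" = "true" ] || return 1
  [ "${NOVA_DELETE_CLUSTER_IN_STANDBY:-false}" = "true" ] || return 1
  [ "$created" -lt "$max" ]
}
```

With both options set to true and the default limit, may_create_cluster 9 succeeds and may_create_cluster 10 fails.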

Note that Nova with the "create" option enabled will not choose to create a cluster to satisfy resource availability if it detects any existing accessible candidate target clusters have cluster autoscaling enabled; instead it will choose an accessible autoscaled cluster. Nova's cluster autoscaling detection works for installations of Elotl Luna and of the Kubernetes Cluster Autoscaler.

You can enable "delete/recreate" standby mode, together with the "create" option, using this patch file:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nova-scheduler
  namespace: elotl # make sure this is the namespace used during installation
spec:
  template:
    spec:
      containers:
        - name: scheduler
          env:
            - name: NOVA_IDLE_ENTER_STANDBY_ENABLE
              value: "true"
            - name: NOVA_IDLE_ENTER_STANDBY_SECS
              value: "360"
            - name: NOVA_DELETE_CLUSTER_IN_STANDBY
              value: "true"
            - name: NOVA_CREATE_CLUSTER_IF_NEEDED
              value: "true"
# uncomment lines below if you're using kind clusters + nova-jit-helper (see "Kind operations" section below)
#           - name: KIND_HOST_ADDRESS
#             value: "172.17.0.1"


# For workload cluster running in Public Cloud Providers, check Cloud Operations section in this doc
# for workload clusters running in EKS (Amazon Web Services), you need to fill following values:
#           - name: AWS_ACCESS_KEY_ID
#             value: "" #-- Set to access key id for AWS account for AWS workload cluster standby
#           - name: AWS_SECRET_ACCESS_KEY
#             value: "" # -- Set to secret access key for AWS account for AWS workload cluster standby

# for workload clusters running in GKE (Google Cloud), you need to pass following credentials:
#           - name: GCE_PROJECT_ID
#             value: "" #-- Set to project id of GCE account for GCE workload cluster standby
#           - name: GCE_ACCESS_KEY
#             value: "" # -- Set to base64 encoding of GCE service account json file for GCE workload cluster standby

# for workload clusters running in AKS (Azure), you need to pass following credentials:
#           - name: AZURE_TENANT_ID
#             value: "" # -- Set to tenant id of Azure account for AZURE workload cluster standby
#           - name: AZURE_CLIENT_ID
#             value: "" # -- Set to client id of Azure app registration w/permission to scale, create, and delete clusters
#           - name: AZURE_CLIENT_SECRET
#             value: "" # -- Set to client secret associated with AZURE_CLIENT_ID

Save this YAML to scheduler-patch-delete-recreate.yaml. Then make sure your kube context is set to the hosting cluster (the cluster where the Nova control plane components are running) and run:

kubectl --context=${K8S_HOSTING_CLUSTER_CONTEXT} patch -n ${NOVA_NAMESPACE} deployment/nova-scheduler --patch-file ./scheduler-patch-delete-recreate.yaml

Wait until the Nova scheduler has restarted.
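
After the restart, it can be useful to confirm that the patch actually landed on the scheduler container. The sketch below lists the environment variable names on the container named scheduler (the name used in the patch above) with a kubectl JSONPath query. The function names and the overridable KUBECTL variable are assumptions made for this example.

```shell
# scheduler_env_names
# Prints the names of the env vars set on the "scheduler" container.
scheduler_env_names() {
  ${KUBECTL:-kubectl} --context="${K8S_HOSTING_CLUSTER_CONTEXT}" \
    -n "${NOVA_NAMESPACE:-elotl}" get deployment nova-scheduler \
    -o jsonpath='{.spec.template.spec.containers[?(@.name=="scheduler")].env[*].name}'
}

# scheduler_has_env NAME
# Returns 0 when NAME is among the env vars of the scheduler container.
scheduler_has_env() {
  scheduler_env_names | tr ' ' '\n' | grep -qx "$1"
}
```

For example, scheduler_has_env NOVA_DELETE_CLUSTER_IN_STANDBY should succeed once the patch has been applied.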

Kind Operations

Nova JIT cluster delete/recreate standby and cluster clone/create can be run locally on kind clusters. In this section, we walk through examples of each.

Please clone our try-nova repository and download the appropriate release tarball from https://www.elotl.co/free-trial.html.

Start the Nova JIT helper tool, located in the top-level directory of the downloaded tarball, in a separate terminal. This tool executes "cloud" operations on your local kind clusters, including cluster deletion and creation. When the tool starts up, it is in silent listening mode; it will begin printing messages to the terminal when it starts receiving requests related to Nova JIT operations.

./tarball-extracted-directory/nova-jit-helper # use version for your OS and processor

You need to enable Nova JIT at deployment time. Tear down any existing non-JIT deployment of the Nova trial sandbox:

./scripts/teardown_kind_cluster.sh

You can double-check that there are no running kind clusters:

kind get clusters

    No kind clusters found.

Set the following environment variables to enable standby in delete/recreate mode with enter-standby set to 120 secs and to enable cluster create/clone:

export NOVA_IDLE_ENTER_STANDBY_ENABLE="true"
export NOVA_DELETE_CLUSTER_IN_STANDBY="true"
export NOVA_IDLE_ENTER_STANDBY_SECS="120"
export NOVA_CREATE_CLUSTER_IF_NEEDED="true"

After ensuring the appropriate novactl binary is installed, set up the environment variables and deploy Nova:

export NOVA_NAMESPACE=elotl
export NOVA_CONTROLPLANE_CONTEXT=nova
export K8S_CLUSTER_CONTEXT_1=k8s-cluster-1
export K8S_CLUSTER_CONTEXT_2=k8s-cluster-2
export K8S_HOSTING_CLUSTER_CONTEXT=k8s-cluster-hosting-cp
export NOVA_WORKLOAD_CLUSTER_1=wlc-1
export NOVA_WORKLOAD_CLUSTER_2=wlc-2
export NOVA_WORKLOAD_CLUSTER_3=wlc-3
export K8S_HOSTING_CLUSTER=hosting-cluster

./scripts/setup_trial_env_on_kind.sh

After the Nova deployment is fully initialized, you will see that both workload clusters are ready, idle, and not in standby:

kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get clusters

    NAME    K8S-VERSION   K8S-CLUSTER   REGION   ZONE   READY   IDLE   STANDBY
    wlc-1   1.28          wlc-1                         True    True   False
    wlc-2   1.28          wlc-2                         True    True   False

And you can see the kind clusters backing the Nova control plane and the workload clusters:

kind get clusters

    hosting-cluster
    wlc-1
    wlc-2

After NOVA_IDLE_ENTER_STANDBY_SECS seconds have elapsed, the workload clusters enter standby, and after an additional short period they are no longer reported as ready:

kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get clusters

    NAME    K8S-VERSION   K8S-CLUSTER   REGION   ZONE   READY   IDLE   STANDBY
    wlc-1   1.28          wlc-1                         False   True   True
    wlc-2   1.28          wlc-2                         False   True   True
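
If you want to check these flags from a script, the table can be parsed with awk. The sketch below is an illustration under one assumption: READY, IDLE, and STANDBY are always the last three populated columns, as in the output shown here. Because REGION and ZONE can be blank, the fields are counted from the end of each row. The function name cluster_flag is hypothetical.

```shell
# cluster_flag CLUSTER COLUMN   (COLUMN is READY, IDLE, or STANDBY)
# Reads `kubectl get clusters` output on stdin and prints the flag value.
# Returns non-zero when the cluster (or column) is not found.
cluster_flag() {
  awk -v name="$1" -v col="$2" '
    BEGIN { off["READY"] = 2; off["IDLE"] = 1; off["STANDBY"] = 0 }
    $1 == name && (col in off) { print $(NF - off[col]); found = 1 }
    END { exit !found }
  '
}
```

Usage: kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get clusters | cluster_flag wlc-1 STANDBY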

And the kind clusters backing the workload clusters are deleted:

kind get clusters

    hosting-cluster

At this point, you can bring the clusters out of standby by, e.g., scheduling spread workloads that need both workload clusters:

envsubst < "examples/sample-spread-scheduling/busybox.yaml" > "./busybox.yaml"
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} apply -f ./busybox.yaml
rm -f ./busybox.yaml

    deployment.apps/busybox created

You can almost immediately see that the workload clusters are no longer idle and no longer considered to be in standby, but they are not yet ready:

kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get clusters

    NAME    K8S-VERSION   K8S-CLUSTER   REGION   ZONE   READY   IDLE   STANDBY
    wlc-1   1.28          wlc-1                         False   False  False
    wlc-2   1.28          wlc-2                         False   False  False

After the workload clusters are recreated and the nova agent software is reinstalled, they become ready:

kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get clusters

    NAME    K8S-VERSION   K8S-CLUSTER   REGION   ZONE   READY   IDLE   STANDBY
    wlc-1   1.28          wlc-1                         True    False  False
    wlc-2   1.28          wlc-2                         True    False  False
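
Instead of re-running the command by hand, you can poll until a cluster reaches the expected state. This is a minimal sketch: the function name wait_for_cluster, the overridable KUBECTL variable, and the default timeout and interval are assumptions, and the parsing assumes READY, IDLE, and STANDBY are the last three columns of the output.

```shell
# wait_for_cluster CLUSTER COLUMN VALUE [TIMEOUT_SECS] [INTERVAL_SECS]
# Polls `kubectl get clusters` until COLUMN (READY, IDLE, or STANDBY) of
# CLUSTER equals VALUE. Returns 1 if TIMEOUT_SECS elapses first.
wait_for_cluster() {
  local name="$1" col="$2" want="$3" timeout="${4:-900}" interval="${5:-10}"
  local waited=0 got
  while :; do
    got=$(${KUBECTL:-kubectl} --context="${NOVA_CONTROLPLANE_CONTEXT}" get clusters |
      awk -v name="$name" -v col="$col" '
        BEGIN { off["READY"] = 2; off["IDLE"] = 1; off["STANDBY"] = 0 }
        $1 == name && (col in off) { print $(NF - off[col]) }')
    [ "$got" = "$want" ] && return 0
    [ "$waited" -ge "$timeout" ] && return 1
    sleep "$interval"
    waited=$(( waited + interval ))
  done
}
```

For example, wait_for_cluster wlc-1 READY True 600 returns once wlc-1 is reported ready, or fails after roughly ten minutes.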

And the busybox spread workloads are successfully scheduled:

kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get all --all-namespaces

    NAMESPACE   NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
    default     service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   6m34s

    NAMESPACE   NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
    default     deployment.apps/busybox   4/2     4            4           2m51s

And you can see the kind workload clusters have been recreated:

kind get clusters

    hosting-cluster
    wlc-1
    wlc-2

If you want to access the recreated workload clusters directly, you can generate kubeconfigs for them:

kind get kubeconfig --name=wlc-1 > workload-1.config
kind get kubeconfig --name=wlc-2 > workload-2.config

And then you can check them directly:

KUBECONFIG=./workload-1.config kubectl get all

    NAME                        READY   STATUS    RESTARTS   AGE
    pod/busybox-66f46bc-qfsq8   1/1     Running   0          3m27s
    pod/busybox-66f46bc-vbm44   1/1     Running   0          3m27s

    NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
    service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   3m56s

    NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/busybox   2/2     2            2           3m27s

    NAME                              DESIRED   CURRENT   READY   AGE
    replicaset.apps/busybox-66f46bc   2         2         2       3m27s

KUBECONFIG=./workload-2.config kubectl get all

    NAME                        READY   STATUS    RESTARTS   AGE
    pod/busybox-66f46bc-m5n82   1/1     Running   0          5m23s
    pod/busybox-66f46bc-tq6rg   1/1     Running   0          5m23s

    NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
    service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   5m53s

    NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
    deployment.apps/busybox   2/2     2            2           5m23s

    NAME                              DESIRED   CURRENT   READY   AGE
    replicaset.apps/busybox-66f46bc   2         2         2       5m23s

So we've just seen an example of Nova JIT cluster delete/recreate standby. Next let's look at an example of Nova JIT cluster create/clone. Nova JIT cluster create/clone creates a new cluster if needed to satisfy a Nova policy. Such policies apply to a number of use cases, including resource availability, K8s upgrade, and resource isolation.

In our example, we define a policy created to allocate a cluster to isolate a new customer's workloads. The policy indicates that workloads matching namespace "namespace-customer3" should be placed on cluster wlc-3, which does not currently exist:

envsubst < "examples/sample-policy/schedule_policy_namespace.yaml" > "./schedule_policy_namespace.yaml"
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} apply -f ./schedule_policy_namespace.yaml
rm -f ./schedule_policy_namespace.yaml

    schedulepolicy.policy.elotl.co/trial-policy-customer1 created

We then deploy the namespace and a workload pod pod-customer3 in that namespace.

envsubst < "examples/sample-workloads/pod_namespace.yaml" > "./pod_namespace.yaml"
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} apply -f ./pod_namespace.yaml
rm -f ./pod_namespace.yaml

    namespace/namespace-customer3 created
    pod/pod-customer3 created

Nova JIT begins the process of creating wlc-3 to accommodate the placement:

kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get clusters

    NAME    K8S-VERSION   K8S-CLUSTER   REGION   ZONE   READY   IDLE   STANDBY
    wlc-1   1.28          wlc-1                         True    False  False
    wlc-2   1.28          wlc-2                         True    False  False
    wlc-3   1.28          wlc-3                         False   False  False

And after the cluster is created and the Nova agent is installed on it, it becomes ready for use:

kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get clusters

    NAME    K8S-VERSION   K8S-CLUSTER   REGION   ZONE   READY   IDLE   STANDBY
    wlc-1   1.28          wlc-1                         True    False  False
    wlc-2   1.28          wlc-2                         True    False  False
    wlc-3   1.28          wlc-3                         True    False  False

You can see the new kind cluster that is backing the new K8s cluster:

kind get clusters

    hosting-cluster
    wlc-1
    wlc-2
    wlc-3

And you can see the new pod running via Nova control plane:

kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get all --all-namespaces

    NAMESPACE             NAME                READY   STATUS    RESTARTS   AGE
    namespace-customer3   pod/pod-customer3   1/1     Running   0          88s

    NAMESPACE   NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
    default     service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   13m

    NAMESPACE   NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
    default     deployment.apps/busybox   4/2     4            4           9m27s

You can generate a kubeconfig for the newly created cluster:

kind get kubeconfig --name=wlc-3 > workload-3.config

And you can use it to check the target workload cluster directly:

KUBECONFIG=./workload-3.config kubectl get all -n namespace-customer3

    NAME                READY   STATUS    RESTARTS   AGE
    pod/pod-customer3   1/1     Running   0          3m11s

At this point, you've completed the JIT examples. Please run the following to remove your Nova control plane and workload clusters:

./scripts/teardown_kind_cluster.sh  # removes the initially-deployed kind clusters hosting-cluster, wlc-1, wlc-2
kind delete cluster --name wlc-3    # removes the additional wlc-3 kind cluster created by Nova JIT

And you can stop the nova-jit-helper by pressing Ctrl-C in its terminal window.

Cloud Operations

Cloud Account Information

For Nova JIT to perform cloud operations, including getting/setting node group/pool configurations and deleting/recreating/creating clusters and node groups/pools, it requires the information needed to use a cloud account with the appropriate permissions.

For EKS, eksctl is used, which supports access to both managed and unmanaged node groups. The eksctl credentials are passed in the following environment variables, which should be set when the Nova control plane is deployed:

AWS_ACCESS_KEY_ID -- Set to access key id for AWS account for AWS workload cluster standby
AWS_SECRET_ACCESS_KEY -- Set to secret access key for AWS account for AWS workload cluster standby

For GKE, gcloud is used; the following environment variables should be set when the Nova control plane is deployed:

GCE_PROJECT_ID -- Set to project id of GCE account for GCE workload cluster standby
GCE_ACCESS_KEY -- Set to base64 encoding of GCE service account json file for GCE workload cluster standby
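
One way to produce the base64 value for GCE_ACCESS_KEY is sketched below. The function name encode_sa_key and the file name in the usage comment are placeholders. Stripping newlines so the value fits on one line is a choice made for this example, since some base64 implementations wrap long output.

```shell
# encode_sa_key FILE
# Prints FILE as base64 on a single line (wrapping newlines removed).
encode_sa_key() {
  base64 < "$1" | tr -d '\n'
}

# Example (the file name is a placeholder):
#   export GCE_ACCESS_KEY="$(encode_sa_key ./service-account.json)"
```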

For AKS, az is used; the following environment variables should be set when the Nova control plane is deployed:

AZURE_TENANT_ID -- Set to tenant id of Azure account for AZURE workload cluster standby
AZURE_CLIENT_ID -- Set to client id of Azure app registration w/permission to scale, create, and delete clusters
AZURE_CLIENT_SECRET -- Set to client secret associated with AZURE_CLIENT_ID
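
Before deploying, a quick preflight check can catch missing credentials. The sketch below checks the variables listed above for each provider in the current shell. The function name missing_cloud_env and the provider keywords eks, gke, and aks are assumptions made for this example.

```shell
# missing_cloud_env PROVIDER   (PROVIDER is eks, gke, or aks)
# Prints the names of required variables that are unset or empty.
# Returns 0 when none are missing, 1 otherwise, 2 for an unknown provider.
missing_cloud_env() {
  local vars v missing=0
  case "$1" in
    eks) vars="AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY" ;;
    gke) vars="GCE_PROJECT_ID GCE_ACCESS_KEY" ;;
    aks) vars="AZURE_TENANT_ID AZURE_CLIENT_ID AZURE_CLIENT_SECRET" ;;
    *) echo "unknown provider: $1" >&2; return 2 ;;
  esac
  for v in $vars; do
    # Indirect lookup of the variable named by $v (names come from the fixed lists above).
    eval "[ -n \"\${$v:-}\" ]" || { echo "$v"; missing=1; }
  done
  return "$missing"
}
```

For example, missing_cloud_env aks prints any of the three Azure variables that still need to be set.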

Accessing Recreated or Clone-created Clusters

To externally access clusters recreated or clone-created by Nova, a new context config must be created.

For GKE, obtaining the config for the recreated cluster can be done via:
- gcloud container clusters get-credentials k8s-cluster-name --zone zone-name --project gce-project-name
For EKS, obtaining the config for the recreated cluster can be done via:
- eksctl utils write-kubeconfig --cluster=k8s-cluster-name --region region-name
For AKS, obtaining the config for the recreated cluster can be done via:
- az aks get-credentials --resource-group resource_group_name --name cluster_name --overwrite-existing
For KIND, obtaining the config for the recreated cluster can be done via:
- kind get kubeconfig --name=k8s-cluster-name >k8s-cluster-name.config

Troubleshooting

Logs and Commands

The Nova control plane logs report various information on JIT cluster operations.
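
A sketch for pulling recent control plane logs follows. It assumes the JIT messages appear in the scheduler container of the nova-scheduler deployment patched earlier in this document, which this page does not state explicitly. The function name nova_jit_logs and the overridable KUBECTL variable are also assumptions.

```shell
# nova_jit_logs [SINCE]
# Prints recent logs from the scheduler container (default: last 15 minutes).
nova_jit_logs() {
  ${KUBECTL:-kubectl} --context="${K8S_HOSTING_CLUSTER_CONTEXT}" \
    -n "${NOVA_NAMESPACE:-elotl}" logs deployment/nova-scheduler \
    -c scheduler --since="${1:-15m}"
}
```

Piping the output through grep -i standby is one way to narrow it down, although the exact log wording is not documented here.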

For long-running cloud operations, it can be useful to obtain detailed information directly from cloud APIs.

For EKS, useful commands include:
- eksctl get cluster --name k8s-cluster-name --region region-name
- eksctl get nodegroup --cluster k8s-cluster-name --region region-name
For GKE, useful commands include:
- gcloud container clusters describe k8s-cluster-name --zone zone-name

Known issues

EKS cluster deletion can sometimes fail; please see https://aws.amazon.com/premiumsupport/knowledge-center/eks-delete-cluster-issues/ for more information.

AKS system nodepools cannot be scaled down to 0; "suspend/resume" standby mode scales them down to 1.

AKS cluster and resource group names should not include underscores, other than that resourceGroupName can be clusterName_group.
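
The naming rule above can be checked before creating AKS clusters. This is a small sketch of the rule as stated, with a hypothetical function name aks_names_ok.

```shell
# aks_names_ok CLUSTER_NAME RESOURCE_GROUP
# Returns 0 when neither name contains an underscore, or when the only
# underscore is in a resource group named exactly "<CLUSTER_NAME>_group".
aks_names_ok() {
  local cluster="$1" group="$2"
  case "$cluster" in *_*) return 1 ;; esac
  [ "$group" = "${cluster}_group" ] && return 0
  case "$group" in *_*) return 1 ;; esac
  return 0
}
```

For example, aks_names_ok wlc-1 wlc-1_group succeeds, while aks_names_ok wlc-1 my_rg fails.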
