Just-in-time Standby Workload Cluster
Functional Overview
Nova optionally supports putting an idle workload cluster into standby state, to reduce resource costs in the cloud. When a standby workload cluster is needed to satisfy a Nova scheduling operation, the cluster is brought out of standby state. Nova can also optionally create additional cloud clusters, cloned from existing workload clusters, to satisfy the needs of policy-based or capacity-based scheduling.
Operational Description
If the environment variable NOVA_IDLE_ENTER_STANDBY_ENABLE is set to true when the Nova control plane is deployed, the Nova-JIT Workload Cluster Standby feature is enabled. When the standby feature is enabled, a workload cluster that has been idle for 3600 seconds (override via env var NOVA_IDLE_ENTER_STANDBY_SECS) is placed in standby state. An idle workload cluster is one on which no Nova-scheduled object that consumes resources is running, or on which all such objects belong to a namespace in the list specified via the environment variable NOVA_NAMESPACES_EXCLUDE_ACTIVE during Nova agent installation. When Nova schedules an item to a workload cluster that is in standby state, the cluster is brought out of standby state.
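For illustration, assuming the exclusion list is given as a comma-separated set of namespace names (the exact format and mechanism depend on your Nova agent installation method, so check it before relying on this), the variable might be supplied like this before installing the agent:
export NOVA_NAMESPACES_EXCLUDE_ACTIVE="kube-system,monitoring" # hypothetical namespaces; adjust to your environment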
We will first export these environment variables so that subsequent steps in this tutorial can be easily followed.
export NOVA_NAMESPACE=elotl
export NOVA_CONTROLPLANE_CONTEXT=nova
export NOVA_WORKLOAD_CLUSTER_1=wlc-1
export NOVA_WORKLOAD_CLUSTER_2=wlc-2
Export these additional environment variables if you installed Nova using the tarball.
export K8S_HOSTING_CLUSTER_CONTEXT=k8s-cluster-hosting-cp
export NOVA_WORKLOAD_CLUSTER_1=wlc-1
export NOVA_WORKLOAD_CLUSTER_2=wlc-2
Alternatively, export these environment variables if you installed Nova using the setup scripts provided in the try-nova repository.
export K8S_HOSTING_CLUSTER_CONTEXT=kind-hosting-cluster
export K8S_CLUSTER_CONTEXT_1=kind-wlc-1
export K8S_CLUSTER_CONTEXT_2=kind-wlc-2
You can configure the Nova Control Plane to enable this feature by using the following patch manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nova-scheduler
  namespace: elotl # make sure this is the namespace used during installation
spec:
  template:
    spec:
      containers:
        - name: scheduler
          env:
            - name: NOVA_IDLE_ENTER_STANDBY_ENABLE
              value: "true"
            - name: NOVA_IDLE_ENTER_STANDBY_SECS
              value: "360"
Save this yaml to scheduler-patch.yaml. Then, make sure your kube context is set to the hosting cluster (the cluster where the Nova Control Plane components are running).
Then run
kubectl --context=${K8S_HOSTING_CLUSTER_CONTEXT} patch -n ${NOVA_NAMESPACE} deployment/nova-scheduler --patch-file ./scheduler-patch.yaml
Wait until the Nova scheduler has restarted.
Suspend/Resume Standby Mode
In "suspend/resume" standby mode (the default), all node groups/pools in a cluster in standby state are set to a node count of 0. This change removes all cluster resources, except the hidden cloud provider control plane, in ~2 minutes. While the cluster is in standby, the status of all non-Nova-scheduled items deployed in the cluster (including the Nova agent) switches to pending. EKS, GKE, and standard-tier AKS clusters in standby state cost $0.10/hour. When the cluster exits standby, the node group/pool node counts are set back to their original values, which Nova recorded in the cluster's custom resource object. This change restores the cluster resources in ~2 minutes, allowing its pending items (including the Nova agent) to resume running and allowing Nova-scheduled items to be placed successfully.
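To confirm the scale-down from the cloud provider's side, the per-cloud commands listed under Troubleshooting below can be used; for example, for an EKS cluster (illustrative, using the placeholder names from that section), the node group's desired node count should read 0 while the cluster is in standby:
eksctl get nodegroup --cluster k8s-cluster-name --region region-name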
Delete/Recreate Standby Mode
In "delete/recreate" standby mode (enabled via env var NOVA_DELETE_CLUSTER_IN_STANDBY set to true), a workload cluster in standby state is completely deleted from the cloud, taking ~3-10 minutes. When the cluster exits standby, the cluster is recreated in the cloud, taking ~3-15 minutes, and the Nova agent objects are redeployed. The "delete/recreate" standby mode yields greater cost savings than "suspend/resume", but the latencies to enter and exit standby state are significantly higher. Also note that "delete/recreate" standby mode is only supported on clusters consisting of simple node group configurations that can be represented by the Nova Cluster CRD NodeGroupConfig, i.e., by MaxSize, MinSize, DesiredCapacity, InstanceType, NodeGroupType, AccelType, and AccelCount. If your clusters contain more complicated node group configurations, or if they depend on manually installed third-party software, please use "suspend/resume" standby mode.
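As a point of reference, a node group simple enough for "delete/recreate" mode is one describable purely by the fields listed above. The sketch below is hypothetical (field casing and nesting within the Cluster object may differ in your Nova version) and is only meant to illustrate such a "simple" configuration:
nodeGroups:
  - maxSize: 3
    minSize: 1
    desiredCapacity: 2
    instanceType: m5.large   # example instance type
    nodeGroupType: managed   # example node group type
    accelType: ""            # no accelerators in this example
    accelCount: 0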
With the "create" option (enabled via env var NOVA_CREATE_CLUSTER_IF_NEEDED set to true), Nova creates a workload cluster by cloning an existing accessible cluster (i.e., one that is ready or can become ready by exiting standby) to satisfy the needs of policy-based or capacity-based scheduling. The "create" option requires that "delete/recreate" standby mode be enabled, and created clusters can subsequently enter standby state. The number of clusters that Nova will create is limited to 10 (override via env var NOVA_MAX_CREATED_CLUSTERS). Automatic cluster creation depends on the Nova deployment containing a cluster appropriate for cloning, i.e., an existing accessible cluster that satisfies the scheduling policy constraints and resource capacity needs of the placement but mismatches either the policy's specified cluster name or the placement's needed resource availability.
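If you need a different created-cluster limit, the override can ride along in the same kind of scheduler patch shown below; for example, an additional env entry such as the following (the value here is illustrative):
            - name: NOVA_MAX_CREATED_CLUSTERS
              value: "5"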
Note that when the "create" option is enabled, Nova will not choose to create a cluster to satisfy resource availability if it detects that any existing accessible candidate target cluster has cluster autoscaling enabled; instead, it will choose an accessible autoscaled cluster. Nova's cluster autoscaling detection works for installations of Elotl Luna and of the Kubernetes Cluster Autoscaler.
You can enable the Delete/Recreate standby mode feature using this patch file:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nova-scheduler
  namespace: elotl # make sure this is the namespace used during installation
spec:
  template:
    spec:
      containers:
        - name: scheduler
          env:
            - name: NOVA_IDLE_ENTER_STANDBY_ENABLE
              value: "true"
            - name: NOVA_IDLE_ENTER_STANDBY_SECS
              value: "360"
            - name: NOVA_DELETE_CLUSTER_IN_STANDBY
              value: "true"
            - name: NOVA_CREATE_CLUSTER_IF_NEEDED
              value: "true"
            # uncomment the lines below if you're using kind clusters + nova-jit-helper (see the "Kind Operations" section below)
            # - name: KIND_HOST_ADDRESS
            #   value: "172.17.0.1"
            # For workload clusters running in public cloud providers, check the Cloud Operations section in this doc.
            # For workload clusters running in EKS (Amazon Web Services), you need to fill in the following values:
            # - name: AWS_ACCESS_KEY_ID
            #   value: "" # -- Set to access key id for AWS account for AWS workload cluster standby
            # - name: AWS_SECRET_ACCESS_KEY
            #   value: "" # -- Set to secret access key for AWS account for AWS workload cluster standby
            # For workload clusters running in GKE (Google Cloud), you need to pass the following credentials:
            # - name: GCE_PROJECT_ID
            #   value: "" # -- Set to project id of GCE account for GCE workload cluster standby
            # - name: GCE_ACCESS_KEY
            #   value: "" # -- Set to base64 encoding of GCE service account json file for GCE workload cluster standby
            # For workload clusters running in AKS (Azure), you need to pass the following credentials:
            # - name: AZURE_TENANT_ID
            #   value: "" # -- Set to tenant id of Azure account for AZURE workload cluster standby
            # - name: AZURE_CLIENT_ID
            #   value: "" # -- Set to client id of Azure app registration w/permission to scale, create, and delete clusters
            # - name: AZURE_CLIENT_SECRET
            #   value: "" # -- Set to client secret associated with AZURE_CLIENT_ID
Save this yaml to scheduler-patch-delete-recreate.yaml. Then, make sure your kube context is set to the hosting cluster (the cluster where the Nova Control Plane components are running).
Then run
kubectl --context=${K8S_HOSTING_CLUSTER_CONTEXT} patch -n ${NOVA_NAMESPACE} deployment/nova-scheduler --patch-file ./scheduler-patch-delete-recreate.yaml
Wait until the Nova scheduler has restarted.
Kind Operations
Nova JIT cluster delete/recreate standby and cluster clone/create can be run locally on kind clusters. In this section, we walk through examples of each.
Please clone our try-nova repository and download the appropriate release tarball from https://www.elotl.co/free-trial.html.
Start the Nova JIT helper tool, located in the top-level directory of the downloaded tarball, in a separate terminal. This tool executes "cloud" operations on your local kind clusters, including cluster deletion and creation. When the tool starts up, it is in silent listening mode; it will begin printing messages to the terminal when it starts receiving requests related to Nova JIT operations.
./tarball-extracted-directory/nova-jit-helper # use version for your OS and processor
You need to enable Nova JIT at deployment time. Tear down any existing non-JIT deployment of the Nova trial sandbox:
./scripts/teardown_kind_cluster.sh
You can double-check that there are no running kind clusters:
kind get clusters
No kind clusters found.
Set the following environment variables to enable standby in delete/recreate mode with enter-standby set to 120 secs and to enable cluster create/clone:
export NOVA_IDLE_ENTER_STANDBY_ENABLE="true"
export NOVA_DELETE_CLUSTER_IN_STANDBY="true"
export NOVA_IDLE_ENTER_STANDBY_SECS="120"
export NOVA_CREATE_CLUSTER_IF_NEEDED="true"
After ensuring the appropriate novactl binary is installed, set up the env variables and deploy Nova:
export NOVA_NAMESPACE=elotl
export NOVA_CONTROLPLANE_CONTEXT=nova
export K8S_CLUSTER_CONTEXT_1=k8s-cluster-1
export K8S_CLUSTER_CONTEXT_2=k8s-cluster-2
export K8S_HOSTING_CLUSTER_CONTEXT=k8s-cluster-hosting-cp
export NOVA_WORKLOAD_CLUSTER_1=wlc-1
export NOVA_WORKLOAD_CLUSTER_2=wlc-2
export NOVA_WORKLOAD_CLUSTER_3=wlc-3
export K8S_HOSTING_CLUSTER=hosting-cluster
./scripts/setup_trial_env_on_kind.sh
After the Nova deployment is fully initialized, you will see that both workload clusters are ready, idle, and not in standby:
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get clusters
NAME    K8S-VERSION   K8S-CLUSTER   REGION   ZONE   READY   IDLE   STANDBY
wlc-1   1.28          wlc-1                         True    True   False
wlc-2   1.28          wlc-2                         True    True   False
And you can see the kind clusters backing the Nova control plane and the workload clusters:
kind get clusters
hosting-cluster
wlc-1
wlc-2
After NOVA_IDLE_ENTER_STANDBY_SECS seconds have elapsed, the workload clusters enter standby, and after an additional short period they are no longer reported as ready:
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get clusters
NAME    K8S-VERSION   K8S-CLUSTER   REGION   ZONE   READY   IDLE   STANDBY
wlc-1   1.28          wlc-1                         False   True   True
wlc-2   1.28          wlc-2                         False   True   True
And the kind clusters backing the workload clusters are deleted:
kind get clusters
hosting-cluster
At this point, you can bring the clusters out of standby by, e.g., scheduling spread workloads that need both workload clusters:
envsubst < "examples/sample-spread-scheduling/busybox.yaml" > "./busybox.yaml"
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} apply -f ./busybox.yaml
rm -f ./busybox.yaml
deployment.apps/busybox created
You can almost immediately see that the workload clusters are no longer idle and no longer considered to be in standby, but they are not yet ready:
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get clusters
NAME    K8S-VERSION   K8S-CLUSTER   REGION   ZONE   READY   IDLE    STANDBY
wlc-1   1.28          wlc-1                         False   False   False
wlc-2   1.28          wlc-2                         False   False   False
After the workload clusters are recreated and the Nova agent software is reinstalled, they become ready:
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get clusters
NAME    K8S-VERSION   K8S-CLUSTER   REGION   ZONE   READY   IDLE    STANDBY
wlc-1   1.28          wlc-1                         True    False   False
wlc-2   1.28          wlc-2                         True    False   False
And the busybox spread workloads are successfully scheduled:
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get all --all-namespaces
NAMESPACE   NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
default     service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   6m34s

NAMESPACE   NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
default     deployment.apps/busybox   4/2     4            4           2m51s
And you can see the kind workload clusters have been recreated:
kind get clusters
hosting-cluster
wlc-1
wlc-2
If you want to access the recreated workload clusters directly, you can generate kubeconfigs for them:
kind get kubeconfig --name=wlc-1 > workload-1.config
kind get kubeconfig --name=wlc-2 > workload-2.config
And then you can check them directly:
KUBECONFIG=./workload-1.config kubectl get all
NAME                        READY   STATUS    RESTARTS   AGE
pod/busybox-66f46bc-qfsq8   1/1     Running   0          3m27s
pod/busybox-66f46bc-vbm44   1/1     Running   0          3m27s

NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   3m56s

NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/busybox   2/2     2            2           3m27s

NAME                              DESIRED   CURRENT   READY   AGE
replicaset.apps/busybox-66f46bc   2         2         2       3m27s
KUBECONFIG=./workload-2.config kubectl get all
NAME                        READY   STATUS    RESTARTS   AGE
pod/busybox-66f46bc-m5n82   1/1     Running   0          5m23s
pod/busybox-66f46bc-tq6rg   1/1     Running   0          5m23s

NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   5m53s

NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/busybox   2/2     2            2           5m23s

NAME                              DESIRED   CURRENT   READY   AGE
replicaset.apps/busybox-66f46bc   2         2         2       5m23s
So we've just seen an example of Nova JIT cluster delete/recreate standby. Next let's look at an example of Nova JIT cluster create/clone. Nova JIT cluster create/clone creates a new cluster if needed to satisfy a Nova policy. Such policies apply to a number of use cases, including resource availability, K8s upgrade, and resource isolation.
In our example, we define a policy that allocates a cluster to isolate a new customer's workloads. The policy indicates that workloads matching namespace "namespace-customer3" should be placed on cluster wlc-3, which does not currently exist:
envsubst < "examples/sample-policy/schedule_policy_namespace.yaml" > "./schedule_policy_namespace.yaml"
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} apply -f ./schedule_policy_namespace.yaml
rm -f ./schedule_policy_namespace.yaml
schedulepolicy.policy.elotl.co/trial-policy-customer1 created
We then deploy the namespace and a workload pod pod-customer3 in that namespace.
envsubst < "examples/sample-workloads/pod_namespace.yaml" > "./pod_namespace.yaml"
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} apply -f ./pod_namespace.yaml
rm -f ./pod_namespace.yaml
namespace/namespace-customer3 created
pod/pod-customer3 created
Nova JIT begins the process of creating wlc-3 to accommodate the placement:
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get clusters
NAME    K8S-VERSION   K8S-CLUSTER   REGION   ZONE   READY   IDLE    STANDBY
wlc-1   1.28          wlc-1                         True    False   False
wlc-2   1.28          wlc-2                         True    False   False
wlc-3   1.28          wlc-3                         False   False   False
And after the cluster is created and the Nova agent is installed on it, it becomes ready for use:
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get clusters
NAME    K8S-VERSION   K8S-CLUSTER   REGION   ZONE   READY   IDLE    STANDBY
wlc-1   1.28          wlc-1                         True    False   False
wlc-2   1.28          wlc-2                         True    False   False
wlc-3   1.28          wlc-3                         True    False   False
You can see the new kind cluster that is backing the new K8s cluster:
kind get clusters
hosting-cluster
wlc-1
wlc-2
wlc-3
And you can see the new pod running via the Nova control plane:
kubectl --context=${NOVA_CONTROLPLANE_CONTEXT} get all --all-namespaces
NAMESPACE             NAME                READY   STATUS    RESTARTS   AGE
namespace-customer3   pod/pod-customer3   1/1     Running   0          88s

NAMESPACE   NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
default     service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   13m

NAMESPACE   NAME                      READY   UP-TO-DATE   AVAILABLE   AGE
default     deployment.apps/busybox   4/2     4            4           9m27s
You can generate a kubeconfig for the newly created cluster:
kind get kubeconfig --name=wlc-3 > workload-3.config
And you can use it to check the target workload cluster directly:
KUBECONFIG=./workload-3.config kubectl get all -n namespace-customer3
NAME                READY   STATUS    RESTARTS   AGE
pod/pod-customer3   1/1     Running   0          3m11s
At this point, you've completed the JIT examples. Please run the following to remove your Nova control plane and workload clusters:
./scripts/teardown_kind_cluster.sh # removes the initially-deployed kind clusters hosting-cluster, wlc-1, wlc-2
kind delete cluster --name wlc-3 # removes the additional wlc-3 kind cluster created by Nova JIT
And you can stop the nova-jit-helper by entering Ctrl-C in its terminal window.
Cloud Operations
Cloud Account Information
To perform cloud operations, including getting/setting node group/pool configurations and deleting/recreating/creating clusters and node groups/pools, Nova JIT requires credentials for a cloud account with the appropriate permissions.
For EKS, eksctl is used, which supports access to both managed and unmanaged node groups. The eksctl credentials are passed in the following environment variables, which should be set when the Nova control plane is deployed:
- AWS_ACCESS_KEY_ID -- Set to access key id for AWS account for AWS workload cluster standby
- AWS_SECRET_ACCESS_KEY -- Set to secret access key for AWS account for AWS workload cluster standby
For GKE, gcloud is used; the following environment variables should be set when the Nova control plane is deployed:
- GCE_PROJECT_ID -- Set to project id of GCE account for GCE workload cluster standby
- GCE_ACCESS_KEY -- Set to base64 encoding of GCE service account json file for GCE workload cluster standby
For AKS, az is used; the following environment variables should be set when the Nova control plane is deployed:
- AZURE_TENANT_ID -- Set to tenant id of Azure account for AZURE workload cluster standby
- AZURE_CLIENT_ID -- Set to client id of Azure app registration w/permission to scale, create, and delete clusters
- AZURE_CLIENT_SECRET -- Set to client secret associated with AZURE_CLIENT_ID
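As an alternative to embedding these values directly in the patch manifests above, you could keep them in a Kubernetes Secret and inject them into the scheduler container with envFrom. This is a generic Kubernetes pattern rather than a documented Nova requirement, so verify it against your installation; the secret name below is hypothetical:
kubectl --context=${K8S_HOSTING_CLUSTER_CONTEXT} -n ${NOVA_NAMESPACE} create secret generic cloud-standby-credentials \
  --from-literal=AWS_ACCESS_KEY_ID=<access-key-id> \
  --from-literal=AWS_SECRET_ACCESS_KEY=<secret-access-key>
Then, in the scheduler patch, reference the secret from the scheduler container entry:
      containers:
        - name: scheduler
          envFrom:
            - secretRef:
                name: cloud-standby-credentials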
Accessing Recreated or Clone-created Clusters
To directly access clusters recreated or clone-created by Nova (i.e., not via the Nova control plane), a new kubeconfig context must be created.
- For GKE, obtaining the config for the recreated cluster can be done via:
- gcloud container clusters get-credentials k8s-cluster-name --zone zone-name --project gce-project-name
- For EKS, obtaining the config for the recreated cluster can be done via:
- eksctl utils write-kubeconfig --cluster=k8s-cluster-name --region region-name
- For AKS, obtaining the config for the recreated cluster can be done via:
- az aks get-credentials --resource-group resource_group_name --name cluster_name --overwrite-existing
- For KIND, obtaining the config for the recreated cluster can be done via:
- kind get kubeconfig --name=k8s-cluster-name >k8s-cluster-name.config
Troubleshooting
Logs and Commands
The Nova control plane logs report various information on JIT cluster operations.
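For example, a reasonable place to start is the nova-scheduler deployment patched earlier in this tutorial (other Nova control plane components may also log relevant messages):
kubectl --context=${K8S_HOSTING_CLUSTER_CONTEXT} -n ${NOVA_NAMESPACE} logs deployment/nova-scheduler -f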
For long-running cloud operations, it can be useful to obtain detailed information directly from cloud APIs.
- For EKS, useful commands include:
- eksctl get cluster --name k8s-cluster-name --region region-name
- eksctl get nodegroup --cluster k8s-cluster-name --region region-name
- For GKE, useful commands include:
- gcloud container clusters describe k8s-cluster-name --zone zone-name
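- For AKS, Nova-independent details are available through the standard Azure CLI; for example (illustrative):
- az aks show --resource-group resource_group_name --name cluster_name
- az aks nodepool list --resource-group resource_group_name --cluster-name cluster_name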
Known issues
EKS cluster deletion can sometimes fail; please see https://aws.amazon.com/premiumsupport/knowledge-center/eks-delete-cluster-issues/ for more information.
AKS system nodepools cannot be scaled down to 0; "suspend/resume" standby mode scales them down to 1.
AKS cluster and resource group names should not include underscores, with the exception that resourceGroupName may take the form clusterName_group.