Version: v0.7.1

Install Nova

Overview

Purpose

This guide provides step-by-step instructions for installing Nova, a control plane and agent system designed to manage multiple Kubernetes clusters. By following this guide, you will set up the Nova Control Plane on a hosting Kubernetes cluster and deploy Nova Agents to workload clusters.

Scope

This guide covers:

  • Prerequisites: Requirements before installing Nova.
  • Installing novactl: How to download and set up the Nova CLI.
  • Deploying Nova: Instructions for deploying the Nova Control Plane and Agents.
  • Post-Installation Checks: Verifying the installation.
  • Uninstalling Nova: Steps to remove Nova if needed.

Key Concepts

  • Nova Control Plane: The central management unit running on a hosting Kubernetes cluster.
  • Nova Agent: The component deployed to each workload cluster for management.
  • novactl: The command-line interface (CLI) for installing, uninstalling and checking the status of a Nova deployment.
  • Workload Cluster: A Kubernetes cluster managed by the Nova Control Plane.
  • Hosting Cluster: A Kubernetes cluster where the Nova Control Plane runs.

Prerequisites

  1. At least two Kubernetes clusters up and running. One cluster will be the hosting cluster where the Nova Control Plane runs; the other clusters will be workload clusters managed by the Nova Control Plane.
  2. kubectl installed and configured.
  3. Nova cannot be deployed to an Autopilot GKE cluster. Please validate that you are deploying to a non-Autopilot cluster.
  4. The cluster hosting the Nova Control Plane MUST have a storage provisioner and a default StorageClass configured. The Nova Control Plane uses etcd as a backing store, which runs as a StatefulSet and requires a PersistentVolume to work (see the check after this list).
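
To confirm the storage prerequisite on the hosting cluster, a quick check (using the example hosting cluster context from this guide) is to list StorageClasses and make sure one is marked as default:

# One StorageClass should be annotated "(default)"; if none is, configure a default StorageClass first
kubectl --context=nova-example-hosting-cluster get storageclass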

Kubernetes compatibility

Nova Version   Kubernetes Versions Supported
v0.7           v1.25, v1.26, v1.27, v1.28
v0.6           v1.24, v1.25
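
If you want to confirm that your clusters fall in a supported range, you can print the server version of each cluster, for example:

# The Server Version line shows the Kubernetes version of the cluster behind this context
kubectl --context=nova-example-hosting-cluster version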

Download novactl

novactl is our CLI that allows you to easily create new Nova Control Planes, register new Nova Workload Clusters, check the health of your Nova cluster, and more!

If you don't have the release tarball, download the latest novactl version for your OS by running:

curl -s https://api.github.com/repos/elotl/novactl/releases/latest | \
jq -r '.assets[].browser_download_url' | \
grep "$(uname -s | tr '[:upper:]' '[:lower:]')-$(uname -m | sed 's/x86_64/amd64/;s/i386/386/;s/aarch64/arm64/')" | \
xargs -I {} curl -L {} -o novactl
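
The snippet above assumes curl and jq are installed. Once it finishes, a quick check that the downloaded file matches your platform could look like this:

# Should report an executable built for your OS and architecture
file novactl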

Install novactl

Make the binary executable

Once you have the binary, run:

chmod +x novactl*

Place the binary in your PATH

This step depends on your local setup, but most likely you simply want to run:

sudo mv novactl* /usr/local/bin/novactl

If you accidentally downloaded more than one novactl binary, please move only the binary that corresponds to the OS and architecture of your machine to the /usr/local/bin location.

Install it as kubectl plugin

novactl is ready to work as a kubectl plugin. Our docs assume you're using novactl as a kubectl plugin. To make this work, simply run:

sudo novactl kubectl-install

And test if it works:

kubectl nova --help
Usage:
  kubectl nova [command]

Available Commands:
  get              get resources (clusters, schedule groups or deployments) with additional context
  help             Help about any command
  install          Install new Nova Control Plane or connect a new workload cluster to Nova Control Plane
  kubectl-install  kubectl-install installs this binary as kubectl plugin
  status           Check status of Nova Control Plane installation
  uninstall        Uninstall Nova Control Plane or disconnect a workload cluster from Nova Control Plane

Flags:
  -h, --help      help for kubectl-nova
  -v, --version   version for kubectl-nova

Use "kubectl-nova [command] --help" for more information about a command.

Deploy

To deploy Nova Control Plane to one cluster and use another as a workload cluster, make sure you have at least two contexts in your kubeconfig.

kubectl config get-contexts
CURRENT   NAME                           CLUSTER                        AUTHINFO                       NAMESPACE
*         nova-example-agent-1           nova-example-agent-1           nova-example-agent-1
          nova-example-agent-2           nova-example-agent-2           nova-example-agent-2
          nova-example-hosting-cluster   nova-example-hosting-cluster   nova-example-hosting-cluster

Install Nova Control Plane

To install the Nova Control Plane, use the kubectl nova install control-plane command:

kubectl nova install control-plane --help
Install new Nova Control Plane.

Installs Nova Control Plane components in current cluster, outputs configuration to Nova home directory at ~/.nova

Usage:
kubectl nova install control-plane --context=[hosting cluster context] [CLUSTER NAME] [flags]

Aliases:
control-plane, cp

Examples:

Start Nova Control Plane with default configuration:

kubectl nova install control-plane nova-cp


Flags:
      --dry-run                   If passed, objects are printed out instead of being installed
      --gcp-access-key string     GCP access key JSON file path. Set for Nova Control Plane running in GKE.
      --gcp-project-id string     GCP Project ID. Set for Nova Control Plane running in GKE.
  -h, --help                      help for control-plane
      --image-repository string   (default "elotl/nova-scheduler")
      --image-tag string          (default "v0.6.1")
      --version string            (default "v0.6.1")

Global Flags:
      --context string
      --namespace string   (default "elotl")
      --rollback           Rollback changes made during installation in case it fails.

The Nova Control Plane will be deployed by default to the elotl namespace, which will be created if it does not already exist.

To deploy Nova Control Plane to nova-example-hosting-cluster, run:

export INSTALL_NAMESPACE=elotl
kubectl nova install control-plane --context nova-example-hosting-cluster --namespace=$INSTALL_NAMESPACE nova

Here, nova is the name of your Nova Control Plane as well as the context name used to interact with the Nova Control Plane. If you choose a different name, remember to replace all uses of --context=nova in other commands in the documentation with the custom name you chose.

Installing Nova Control Plane... 🪄
Nova Control Plane components installed! 🚀

Nova kubeconfig is stored at /Users/example-user/.nova/nova/nova-kubeconfig

To interact with Nova, run:
export KUBECONFIG=~/.kube/config:/Users/example-user/.nova/nova/nova-kubeconfig
kubectl get cluster --context nova
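
As a quick sanity check, you can list the control plane components in the install namespace (assuming the default elotl namespace used in this guide) and wait for them to become Running:

# apiserver, etcd and kube-controller-manager pods should all reach Running state
kubectl get pods --context nova-example-hosting-cluster -n $INSTALL_NAMESPACE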

Troubleshooting Tip

timed out waiting for the condition when Installing Kube API Server

If you get this output while installing Nova Control Plane:

➜ kubectl nova create control-plane my-control-plane
Installing Nova Control Plane... 🪄
Cluster name - my-control-plane
Creating namespace elotl in control plane
Creating certificates
Generating certificates
Certificates successfully generated.
Creating kubeconfig...
kubeconfig created.
Installing Kube API Server...
timed out waiting for the condition

This means that the API server of the Nova Control Plane and/or its dependencies didn't start properly. The most likely cause is etcd not starting because there is no storage provisioner running in your cluster.

Run:

kubectl get pods -n elotl
NAME                                      READY   STATUS             RESTARTS        AGE
apiserver-6bf98bb5d5-vv7wc                0/1     CrashLoopBackOff   6 (110s ago)    9m42s
etcd-0                                    0/1     Pending            0               9m42s
kube-controller-manager-76d5d96df-ntl6g   0/1     CrashLoopBackOff   6 (3m42s ago)   9m42s

As you can see, apiserver and kube-controller-manager are starting and failing, while etcd is still in Pending state.
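
To confirm that missing storage is the culprit, you can check whether etcd's PersistentVolumeClaim is stuck and whether a default StorageClass exists (the exact PVC name may differ in your installation):

# A Pending PVC and no "(default)" StorageClass indicate a missing storage provisioner
kubectl get pvc -n elotl
kubectl get storageclass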

You should follow your cloud provider's documentation and set up a storage provisioner on your cluster. After you're done, run:

kubectl nova uninstall my-control-plane

And install your Nova Control Plane again.

Install Nova Agent into workload cluster

Each workload cluster needs a Nova Agent. The Nova Agent will be deployed by default to the elotl namespace. Before deploying the Nova Agent, you need to ensure that Nova's init-kubeconfig is present in the elotl namespace. Nova's init-kubeconfig provides a kubeconfig for the Nova Control Plane; this kubeconfig is used by the Nova Agent in the workload cluster to connect to the Nova Control Plane and register itself as a workload cluster.

Let's create the namespace first:

kubectl --context=nova-example-agent-1 create namespace $INSTALL_NAMESPACE

and copy init-kubeconfig from Nova Control Plane to workload cluster:

kubectl --context=nova get secret -n $INSTALL_NAMESPACE nova-cluster-init-kubeconfig -o yaml | kubectl --context=nova-example-agent-1 apply -f -
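
Before installing the agent, it is worth confirming that the secret was copied into the workload cluster:

# The init-kubeconfig secret should now exist in the workload cluster's install namespace
kubectl --context=nova-example-agent-1 get secret -n $INSTALL_NAMESPACE nova-cluster-init-kubeconfig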

To connect a workload cluster to Nova, we will use the kubectl nova install agent command.

kubectl nova install agent --help
Install new Nova Agent.

Installs Nova Agent components in current cluster making it part of Nova Workload Fleet in current Nova Control Plane.

Usage:
novactl install agent [CLUSTER NAME] [flags]

Examples:

Start Nova Agent in current cluster with default configuration:

kubectl nova install agent nova-agent-1

Start Nova Agent in some other context using specific version:

kubectl nova --context some-other-context install agent --image-tag v0.5.0 nova-agent-2


Flags:
      --dry-run                   If passed, objects are printed out instead of being installed
  -h, --help                      help for agent
      --image-repository string   --image-repository elotl/nova-agent (default "elotl/nova-agent")
      --image-tag string          --image-tag v0.0.0 (default "v0.6.1")

Global Flags:
      --context string
      --namespace string   (default "elotl")
      --rollback           Rollback changes made during installation in case it fails.

CLUSTER NAME is a UNIQUE name for your workload cluster. To deploy the Nova Agent to nova-example-agent-1 with the name "my-workload-cluster-1", simply run:

kubectl nova install agent --context nova-example-agent-1 --namespace=$INSTALL_NAMESPACE my-workload-cluster-1
Installing Nova Agent... 🪄
Nova Agent components installed! 🚀

Now let's check if that worked! Simply run:

kubectl get --context=nova clusters

Remember that your KUBECONFIG must include the path to your Nova Control Plane kubeconfig.

NAME                    K8S-VERSION   K8S-CLUSTER   REGION   ZONE   READY   IDLE   STANDBY
my-workload-cluster-1   1.25          workload-1                    True    True   False

What if I don't see my workload cluster listed?

If the agent install finished without issues and your cluster is not showing up in the Nova Control Plane, something went wrong during the agent registration process. Run the following command to get the agent logs:

kubectl get pods --context nova-example-agent-1 -n elotl -o name -l "app.kubernetes.io/name"="nova-agent" | xargs -I {} kubectl logs --context nova-example-agent-1 -n elotl {}

And start debugging from there!

Install other workload clusters

If you have a second cluster, run the same commands with a different context and cluster name, e.g.,

kubectl --context=nova-example-agent-2 create namespace $INSTALL_NAMESPACE
kubectl --context=nova get secret -n $INSTALL_NAMESPACE nova-cluster-init-kubeconfig -o yaml | kubectl --context=nova-example-agent-2 apply -f -
kubectl nova install agent --context nova-example-agent-2 --namespace=$INSTALL_NAMESPACE my-workload-cluster-2
Installing Nova Agent... 🪄
Nova Agent components installed! 🚀

Post installation check

We should now see the newly added workload cluster registered in Nova Control Plane:

kubectl --context=nova get clusters
NAME                    K8S-VERSION   K8S-CLUSTER   REGION        ZONE   READY   IDLE   STANDBY
my-workload-cluster-1   1.25                        us-central1          True    True   False
my-workload-cluster-2   1.25                        us-central1          True    True   False

To get more insight into the clusters' available resources:

kubectl nova --context=nova get clusters
  | CLUSTER NAME                  | K8S VERSION | CLOUD PROVIDER | REGION        | STATUS        |
|----------------------------------------------------------------------------------------------|
| my-workload-cluster-1 | 1.22 | gce | us-central1 | ClusterReady |
|----------------------------------------------------------------------------------------------|
| NODES |
|----------------------------------------------------------------------------------------------|
| NAME | AVAILABLE | AVAILABLE | AVAILABLE |
| | CPU | MEMORY | GPU |
| |
| gke-nova-example-agent-1-default-pool-25df6493-263w | 399m | 2332068Ki | 0 |
| gke-nova-example-agent-1-default-pool-25df6493-f9f8 | 427m | 2498615680 | 0 |
| |
| NODES' TAINTS |
| |
|----------------------------------------------------------------------------------------------|



| CLUSTER NAME | K8S VERSION | CLOUD PROVIDER | REGION | STATUS |
|----------------------------------------------------------------------------------------------|
| my-workload-cluster-2 | 1.22 | gce | us-central1 | ClusterReady |
|----------------------------------------------------------------------------------------------|
| NODES |
|----------------------------------------------------------------------------------------------|
| NAME | AVAILABLE | AVAILABLE | AVAILABLE |
| | CPU | MEMORY | GPU |
| |
| gke-nova-example-agent-2-default-pool-55fcf389-74zh | 457m | 2460060Ki | 0 |
| gke-nova-example-agent-2-default-pool-55fcf389-n77s | 359m | 2336086400 | 0 |
| gke-nova-example-agent-2-gpu-pool-950c3823-mlqq | 677m | 2354840Ki | 0 |
| |
| NODES' TAINTS |
| |
| gke-nova-example-agent-2-gpu-pool-950c3823-mlqq |
| - nvidia.com/gpu:present:NoSchedule |
|----------------------------------------------------------------------------------------------|

Uninstall Nova

Uninstall Nova Agent from Workload cluster

To uninstall the Nova Agent from a workload cluster, run:

kubectl nova uninstall agent --context nova-example-agent-1 my-workload-cluster-1
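
After the uninstall completes, the cluster should no longer be listed by the Nova Control Plane; you can confirm with:

# my-workload-cluster-1 should disappear from the output
kubectl --context=nova get clusters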

Uninstall Nova Control Plane

To uninstall Nova Control Plane components, run:

kubectl nova uninstall control-plane --context nova-example-hosting-cluster nova

Here, nova, specified as the last argument, is the name of your Nova Control Plane (chosen at the install step). The default namespace is elotl.
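
As a final check (assuming the default elotl namespace), you can verify that no Nova Control Plane components remain on the hosting cluster:

# The install namespace should contain no Nova components after uninstalling
kubectl get pods --context nova-example-hosting-cluster -n $INSTALL_NAMESPACE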