Install Nova
Overview
Purpose
This guide provides step-by-step instructions for installing Nova, a control plane and agent system designed to manage multiple Kubernetes clusters. By following this guide, you will set up the Nova Control Plane on a hosting Kubernetes cluster and deploy Nova Agents to workload clusters.
Scope
This guide covers:
- Prerequisites: Requirements before installing Nova.
- Installing novactl: How to download and set up the Nova CLI.
- Deploying Nova: Instructions for deploying the Nova Control Plane and Agents.
- Post-Installation Checks: Verifying the installation.
- Uninstalling Nova: Steps to remove Nova if needed.
Key Concepts
- Nova Control Plane: The central management unit running on a hosting Kubernetes cluster.
- Nova Agent: The component deployed to each workload cluster for management.
- novactl: The command-line interface (CLI) for installing, uninstalling and checking the status of a Nova deployment.
- Workload Cluster: A Kubernetes cluster managed by the Nova Control Plane.
- Hosting Cluster: A Kubernetes cluster where the Nova Control Plane runs.
Prerequisites
- At least two Kubernetes clusters up and running. One cluster will be the hosting cluster where the Nova Control Plane runs; the other clusters will be workload clusters managed by the Nova Control Plane.
- kubectl installed and configured.
- Nova cannot be deployed to an Autopilot GKE cluster. Please validate that you are deploying to a non-Autopilot cluster.
- The cluster hosting the Nova Control Plane MUST have a storage provisioner and a default StorageClass configured. The Nova Control Plane uses etcd as a backing store, which runs as a StatefulSet and requires a PersistentVolume to work. See the check below this list.
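You can verify this on the cluster that will host the Nova Control Plane; at least one StorageClass should be marked (default):
kubectl get storageclass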
Kubernetes compatibility
Nova Version | Kubernetes Versions Supported |
---|---|
v0.7 | v1.25, v1.26, v1.27, v1.28 |
v0.6 | v1.24, v1.25 |
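To confirm each cluster falls in the supported range, you can check its server version (the exact output format depends on your kubectl version):
kubectl --context <cluster-context> version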
Download novactl
novactl is our CLI that allows you to easily create new Nova Control Planes, register new Nova Workload Clusters, check the health of your Nova cluster, and more!
If you don't have the release tarball, download the latest novactl release for your OS and architecture by running:
curl -s https://api.github.com/repos/elotl/novactl/releases/latest | \
jq -r '.assets[].browser_download_url' | \
grep "$(uname -s | tr '[:upper:]' '[:lower:]')-$(uname -m | sed 's/x86_64/amd64/;s/i386/386/;s/aarch64/arm64/')" | \
xargs -I {} curl -L {} -o novactl
Install novactl
Make the binary executable
Once you have the binary, run:
chmod +x novactl*
Place the binary in your PATH
This step depends on your local setup, but most likely you simply want to run:
sudo mv novactl* /usr/local/bin/novactl
If you accidentally downloaded more than one novactl binary, move only the binary that matches your machine's OS and architecture to /usr/local/bin.
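To confirm the binary is installed and on your PATH, you can print its version (output format may vary between releases):
novactl --version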
Install it as kubectl plugin
novactl is ready to work as a kubectl plugin. Our docs assume you're using novactl as a kubectl plugin. To make this work, simply run:
sudo novactl kubectl-install
And test if it works:
kubectl nova --help
Usage:
kubectl nova [command]
Available Commands:
get get resources (clusters, schedule groups or deployments) with additional context
help Help about any command
install Install new Nova Control Plane or connect a new workload cluster to Nova Control Plane
kubectl-install kubectl-install installs this binary as kubectl plugin
status Check status of Nova Control Plane installation
uninstall Uninstall Nova Control Plane or disconnect a workload cluster from Nova Control Plane
Flags:
-h, --help help for kubectl-nova
-v, --version version for kubectl-nova
Use "kubectl-nova [command] --help" for more information about a command.
Deploy
To deploy Nova Control Plane to one cluster and use another as a workload cluster, make sure you have at least two contexts in your kubeconfig.
kubectl config get-contexts
CURRENT NAME CLUSTER AUTHINFO NAMESPACE
* nova-example-agent-1 nova-example-agent-1 nova-example-agent-1
nova-example-agent-2 nova-example-agent-2 nova-example-agent-2
nova-example-hosting-cluster nova-example-hosting-cluster nova-example-hosting-cluster
Install Nova Control Plane
To install the Nova Control Plane, use the kubectl nova install control-plane command:
kubectl nova install control-plane --help
Install new Nova Control Plane.
Installs Nova Control Plane components in current cluster, outputs configuration to Nova home directory at ~/.nova
Usage:
kubectl nova install control-plane --context=[hosting cluster context] [CLUSTER NAME] [flags]
Aliases:
control-plane, cp
Examples:
Start Nova Control Plane with default configuration:
kubectl nova install control-plane nova-cp
Flags:
--dry-run If passed, objects are printed out instead of being installed
--gcp-access-key string GCP access key JSON file path. Set for Nova Control Plane running in GKE.
--gcp-project-id string GCP Project ID. Set for Nova Control Plane running in GKE.
-h, --help help for control-plane
--image-repository string (default "elotl/nova-scheduler")
--image-tag string (default "v0.6.1")
--version string (default "v0.6.1")
Global Flags:
--context string
--namespace string (default "elotl")
--rollback Rollback changes made during installation in case it fails.
The Nova Control Plane will be deployed by default to the elotl namespace, which will be created if it does not already exist.
To deploy the Nova Control Plane to nova-example-hosting-cluster, run:
export INSTALL_NAMESPACE=elotl
kubectl nova install control-plane --context nova-example-hosting-cluster --namespace=$INSTALL_NAMESPACE nova
Here, nova is the name of your Nova Control Plane as well as the context name used to interact with it. If you choose a different name, remember to replace all uses of --context=nova in other commands in this documentation with the name you chose.
Installing Nova Control Plane... 🪄
Nova Control Plane components installed! 🚀
Nova kubeconfig is stored at /Users/example-user/.nova/nova/nova-kubeconfig
To interact with Nova, run:
export KUBECONFIG=~/.kube/config:/Users/example-user/.nova/nova/nova-kubeconfig
kubectl get cluster --context nova
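Before moving on, you can optionally confirm that the control plane components are running in the install namespace. Based on the troubleshooting output below, you should see pods such as apiserver, etcd-0, and kube-controller-manager, all Running:
kubectl --context nova-example-hosting-cluster get pods -n $INSTALL_NAMESPACE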
Troubleshooting Tip
"timed out waiting for the condition" when Installing Kube API Server
If you get this output while installing Nova Control Plane:
➜ kubectl nova install control-plane my-control-plane
Installing Nova Control Plane... 🪄
Cluster name - my-control-plane
Creating namespace elotl in control plane
Creating certificates
Generating certificates
Certificates successfully generated.
Creating kubeconfig...
kubeconfig created.
Installing Kube API Server...
timed out waiting for the condition
This means that the API server of the Nova Control Plane and/or its dependencies didn't start properly. The most likely cause is etcd not starting because no storage provisioner is running in your cluster.
Run:
kubectl get pods -n elotl
NAME READY STATUS RESTARTS AGE
apiserver-6bf98bb5d5-vv7wc 0/1 CrashLoopBackOff 6 (110s ago) 9m42s
etcd-0 0/1 Pending 0 9m42s
kube-controller-manager-76d5d96df-ntl6g 0/1 CrashLoopBackOff 6 (3m42s ago) 9m42s
As you can see, apiserver and kube-controller-manager are starting and failing, while etcd is still in the Pending state.
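To confirm that missing storage is the cause, you can inspect the etcd pod and list the cluster's storage classes. The exact event message depends on your provider; look for scheduling events about unbound PersistentVolumeClaims, and check whether any StorageClass is marked (default):
kubectl describe pod etcd-0 -n elotl
kubectl get storageclass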
You should follow your cloud provider's documentation and set up a storage provisioner on your cluster. After you're done, run:
kubectl nova uninstall control-plane my-control-plane
And install your Nova Control Plane again.
Install nova agent into workload cluster
Each workload cluster needs a Nova Agent. The Nova Agent is deployed by default to the elotl namespace. Before deploying the Nova Agent, ensure that Nova's init-kubeconfig is present in the elotl namespace. The init-kubeconfig provides a kubeconfig for the Nova Control Plane; the Nova Agent in the workload cluster uses it to connect to the Nova Control Plane and register itself as a workload cluster.
Let's create the namespace first:
kubectl --context=nova-example-agent-1 create namespace $INSTALL_NAMESPACE
and copy init-kubeconfig from Nova Control Plane to workload cluster:
kubectl --context=nova get secret -n $INSTALL_NAMESPACE nova-cluster-init-kubeconfig -o yaml | kubectl --context=nova-example-agent-1 apply -f -
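You can verify that the secret was copied into the workload cluster before installing the agent:
kubectl --context=nova-example-agent-1 get secret -n $INSTALL_NAMESPACE nova-cluster-init-kubeconfig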
To connect a workload cluster to Nova, we will use the kubectl nova install agent command.
kubectl nova install agent --help
Install new Nova Agent.
Installs Nova Agent components in current cluster making it part of Nova Workload Fleet in current Nova Control Plane.
Usage:
novactl install agent [CLUSTER NAME] [flags]
Examples:
Start Nova Agent in current cluster with default configuration:
kubectl nova install agent nova-agent-1
Start Nova Agent in some other context using specific version:
kubectl nova --context some-other-context install agent --image-tag v0.5.0 nova-agent-2
Flags:
--dry-run If passed, objects are printed out instead of being installed
-h, --help help for agent
--image-repository string --image-repository elotl/nova-agent (default "elotl/nova-agent")
--image-tag string --image-tag v0.0.0 (default "v0.6.1")
Global Flags:
--context string
--namespace string (default "elotl")
--rollback Rollback changes made during installation in case it fails.
CLUSTER NAME is a UNIQUE name for your workload cluster. So to deploy the Nova Agent to nova-example-agent-1 with the name "my-workload-cluster-1", simply run:
kubectl nova install agent --context nova-example-agent-1 --namespace=$INSTALL_NAMESPACE my-workload-cluster-1
Installing Nova Agent... 🪄
Nova Agent components installed! 🚀
Now let's check if that worked! Simply run:
kubectl get --context=nova clusters
Remember to set KUBECONFIG to include the path to your Nova Control Plane kubeconfig (see the export command printed during installation).
NAME K8S-VERSION K8S-CLUSTER REGION ZONE READY IDLE STANDBY
my-workload-cluster-1 1.25 workload-1 True True False
What if I don't see my workload cluster listed?
If the agent install finished without issues and your cluster is not showing up in the Nova Control Plane, something went wrong during the agent registration process. Run the following command to get the agent logs:
kubectl get pods --context nova-example-agent-1 -n elotl -o name -l "app.kubernetes.io/name"="nova-agent" | xargs -I {} kubectl logs --context nova-example-agent-1 -n elotl {}
And start debugging from there!
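It can also help to check the agent pod status and recent events in the install namespace (this assumes the default elotl namespace):
kubectl get pods --context nova-example-agent-1 -n elotl
kubectl get events --context nova-example-agent-1 -n elotl --sort-by=.lastTimestamp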
Install other workload clusters
If you have a second cluster, run the same commands with a different context and cluster name, e.g.,
kubectl --context=nova-example-agent-2 create namespace $INSTALL_NAMESPACE
kubectl --context=nova get secret -n $INSTALL_NAMESPACE nova-cluster-init-kubeconfig -o yaml | kubectl --context=nova-example-agent-2 apply -f -
kubectl nova install agent --context nova-example-agent-2 --namespace=$INSTALL_NAMESPACE my-workload-cluster-2
Installing Nova Agent... 🪄
Nova Agent components installed! 🚀
Post installation check
We should now see the newly added workload cluster registered in Nova Control Plane:
kubectl --context=nova get clusters
NAME K8S-VERSION K8S-CLUSTER REGION ZONE READY IDLE STANDBY
my-workload-cluster-1 1.25 us-central1 True True False
my-workload-cluster-2 1.25 us-central1 True True False
To get more insight into the clusters' available resources:
kubectl nova --context=nova get clusters
| CLUSTER NAME | K8S VERSION | CLOUD PROVIDER | REGION | STATUS |
|----------------------------------------------------------------------------------------------|
| my-workload-cluster-1 | 1.22 | gce | us-central1 | ClusterReady |
|----------------------------------------------------------------------------------------------|
| NODES |
|----------------------------------------------------------------------------------------------|
| NAME | AVAILABLE | AVAILABLE | AVAILABLE |
| | CPU | MEMORY | GPU |
| |
| gke-nova-example-agent-1-default-pool-25df6493-263w | 399m | 2332068Ki | 0 |
| gke-nova-example-agent-1-default-pool-25df6493-f9f8 | 427m | 2498615680 | 0 |
| |
| NODES' TAINTS |
| |
|----------------------------------------------------------------------------------------------|
| CLUSTER NAME | K8S VERSION | CLOUD PROVIDER | REGION | STATUS |
|----------------------------------------------------------------------------------------------|
| my-workload-cluster-2 | 1.22 | gce | us-central1 | ClusterReady |
|----------------------------------------------------------------------------------------------|
| NODES |
|----------------------------------------------------------------------------------------------|
| NAME | AVAILABLE | AVAILABLE | AVAILABLE |
| | CPU | MEMORY | GPU |
| |
| gke-nova-example-agent-2-default-pool-55fcf389-74zh | 457m | 2460060Ki | 0 |
| gke-nova-example-agent-2-default-pool-55fcf389-n77s | 359m | 2336086400 | 0 |
| gke-nova-example-agent-2-gpu-pool-950c3823-mlqq | 677m | 2354840Ki | 0 |
| |
| NODES' TAINTS |
| |
| gke-nova-example-agent-2-gpu-pool-950c3823-mlqq |
| - nvidia.com/gpu:present:NoSchedule |
|----------------------------------------------------------------------------------------------|
Uninstall Nova
Uninstall Nova Agent from Workload cluster
kubectl nova uninstall agent --context nova-example-agent-1 my-workload-cluster-1
Uninstall Nova Control Plane
To uninstall Nova Control Plane components, run:
kubectl nova uninstall control-plane --context nova-example-hosting-cluster nova
Here, nova, specified as the last argument, is the name of your Nova Control Plane (chosen at the install step). The default namespace is elotl.
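To confirm the Nova components were removed from the hosting cluster, you can list what remains in the install namespace (this assumes the default elotl namespace; whether the namespace itself is removed may depend on your setup):
kubectl --context nova-example-hosting-cluster get all -n elotl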