# GPU Workload Placement
Nova can place GPU-accelerated workloads across one or more Kubernetes clusters, allowing them to run wherever suitable GPU resources are available.
## When to Use This
Use this pattern when:
- GPU capacity is distributed across multiple clusters
- Different clusters provide different GPU types or configurations
- AI/ML workloads need to run where GPU resources are available
- GPU workloads may need to move as capacity changes
- Related application components should be co-located with GPU-backed services
## How Nova Helps
Nova evaluates placement policies and selects a workload cluster with sufficient GPU resources.
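As an illustration, a placement policy for GPU workloads might look like the following sketch. The `apiVersion`, field names, and labels here are assumptions for illustration only; check them against the SchedulePolicy CRD reference for your Nova installation.

```yaml
# Hypothetical SchedulePolicy sketch -- apiVersion, fields, and labels are
# assumptions; verify against the CRDs installed in your Nova control plane.
apiVersion: policy.elotl.co/v1alpha1
kind: SchedulePolicy
metadata:
  name: gpu-workloads
spec:
  # Select the workloads this policy applies to (assumed label).
  resourceSelectors:
    labelSelectors:
      - matchLabels:
          workload-class: gpu
  # Restrict candidate clusters (assumed cluster label).
  clusterSelector:
    matchLabels:
      gpu-pool: "true"
```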
GPU-aware placement works with standard Kubernetes resource requests, including:
- `nvidia.com/gpu`
- `amd.com/gpu`
- `nvidia.com/mig-*` (NVIDIA MIG mixed strategy)
Nova also respects workload constraints such as `nodeSelector`, which can be used to target specific GPU characteristics.
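For example, a Pod that requests a single NVIDIA GPU and narrows placement to a specific GPU model might look like this. The `nvidia.com/gpu.product` label is an assumption: it is published by NVIDIA GPU Feature Discovery, so this selector only works where that component runs on the workload clusters; the image name is a placeholder.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-inference
spec:
  containers:
    - name: inference
      image: example.com/inference:latest   # placeholder image
      resources:
        limits:
          # Extended resource; for GPUs the limit alone is sufficient,
          # since requests default to the limit for extended resources.
          nvidia.com/gpu: 1
  nodeSelector:
    # Assumes NVIDIA GPU Feature Discovery is installed and publishing
    # this label on GPU nodes; value shown is an example GPU model.
    nvidia.com/gpu.product: NVIDIA-A100-SXM4-40GB
```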
## Considerations
GPU placement depends on the workload clusters being prepared to run GPU workloads, which includes:
- GPU-enabled nodes
- Appropriate GPU drivers
- GPU operators, such as the NVIDIA GPU Operator, where applicable
- Accurate resource requests in workload manifests
Available GPU resources can be viewed through the Nova cluster inventory, for example by using:
```shell
kubectl --context=nova get clusters -o wide
```
This displays the available CPU, memory, and GPU resources for each workload cluster:
```
NAME    K8S-VERSION   K8S-CLUSTER    NOVA-CREATED   PROVIDER   REGION   ZONE       AVAIL-CPU   AVAIL-MEM     AVAIL-NVIDIAGPU   AVAIL-AMDGPU   READY   IDLE    STANDBY
wlc-1   1.35          worklc-12232   false          azure      eastus   eastus-2   16019m      102957284Ki   3                 0              True    False   False
wlc-2   1.35          worklc-30337   false          azure      eastus   eastus-2   12516m      91274704Ki    3                 0              True    False   False
```
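Because AVAIL-NVIDIAGPU is the tenth whitespace-separated column of the wide output, the inventory can be filtered to clusters with free NVIDIA GPUs using `awk`. A minimal sketch, using sample data in the format shown above; the `wlc-3` row is hypothetical, added only to show a cluster being filtered out. In practice the live `kubectl --context=nova get clusters -o wide` output would be piped into the same `awk` expression.

```shell
# Sample inventory in the format of `kubectl --context=nova get clusters -o wide`;
# the wlc-3 row is hypothetical, included to demonstrate the filter.
inventory='NAME K8S-VERSION K8S-CLUSTER NOVA-CREATED PROVIDER REGION ZONE AVAIL-CPU AVAIL-MEM AVAIL-NVIDIAGPU AVAIL-AMDGPU READY IDLE STANDBY
wlc-1 1.35 worklc-12232 false azure eastus eastus-2 16019m 102957284Ki 3 0 True False False
wlc-3 1.35 worklc-55555 false azure eastus eastus-2 8016m 16257284Ki 0 0 True False False'

# Keep the header plus any cluster whose AVAIL-NVIDIAGPU (column 10) is nonzero.
printf '%s\n' "$inventory" | awk 'NR == 1 || $10 > 0'
```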