Version: v1.4

GPU Workload Placement

Nova can place GPU-accelerated workloads across one or more Kubernetes clusters, allowing workloads to run where suitable GPU resources are available.

When to Use This

Use this pattern when:

  • GPU capacity is distributed across multiple clusters
  • Different clusters provide different GPU types or configurations
  • AI/ML workloads need to run where GPU resources are available
  • GPU workloads may need to move as capacity changes
  • Related application components should be co-located with GPU-backed services

How Nova Helps

Nova evaluates available-resource placement policies and selects a workload cluster with sufficient GPU resources.

GPU-aware placement works with standard Kubernetes resource requests, including:

  • nvidia.com/gpu
  • amd.com/gpu
  • nvidia.com/mig-* (NVIDIA MIG mixed strategy)
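
As a minimal sketch of a standard resource request, the Pod below asks for one NVIDIA GPU through the nvidia.com/gpu extended resource. The Pod name, container name, and image are hypothetical placeholders and do not come from the Nova documentation; any image that ships nvidia-smi would serve.

```yaml
# Hypothetical example: names and image are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-smoke-test
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:12.4.1-base-ubuntu22.04   # assumed image tag
    command: ["nvidia-smi"]
    resources:
      limits:
        nvidia.com/gpu: 1   # extended resources go under limits; the request defaults to the same value
```

Swapping the resource name for amd.com/gpu would target AMD GPUs in the same way.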

Alternatively, GPU-aware placement works with Dynamic Resource Allocation (DRA) for NVIDIA GPUs. DRA is a Kubernetes feature for requesting, configuring, and sharing specialized devices such as GPUs by allocating ResourceClaims against matching available ResourceSlices. DRA is generally available starting with Kubernetes 1.34. NVIDIA's DRA Driver for GPUs works alongside its GPU Operator to support a range of use cases. The following DRA examples are available in the elotl/try-nova repo, in the examples/dra-gpu directory:

  • dra-gpu-test.count1.yaml: Pod with ResourceClaimTemplate requesting an NVIDIA GPU
  • dra-gpu-test.count2.yaml: Pod with ResourceClaimTemplate requesting 2 NVIDIA GPUs
  • dra-gpu-test.t4.yaml: Pod with ResourceClaimTemplate requesting an NVIDIA T4 GPU
  • dra-gpu-test.firstavail.yaml: Pod with ResourceClaimTemplate requesting the first available of 2 alternate NVIDIA GPU requests
  • dra-gpu-test.cap10g.yaml: Pod with ResourceClaimTemplate requesting an NVIDIA GPU with memory capacity greater than 10Gi
  • dra-gpu-test.mig1.yaml: Pod with ResourceClaimTemplate requesting an NVIDIA MIG GPU
  • dra-gpu-test.2podsshare.yaml: Two pods sharing a ResourceClaim for an NVIDIA GPU
  • dra-gpu-test.semver.yaml: Pod with ResourceClaimTemplate requesting an NVIDIA GPU with a driver version greater than 550.127.8
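
For orientation, a request of the first kind listed above (one NVIDIA GPU through a ResourceClaimTemplate) might look roughly like the sketch below. This is not a copy of the file in the try-nova repo. It assumes the resource.k8s.io/v1 API from Kubernetes 1.34 and the gpu.nvidia.com DeviceClass installed by NVIDIA's DRA driver; the object names and the image are hypothetical.

```yaml
# Sketch only; see examples/dra-gpu in elotl/try-nova for the maintained manifests.
apiVersion: resource.k8s.io/v1
kind: ResourceClaimTemplate
metadata:
  name: single-gpu              # hypothetical name
spec:
  spec:
    devices:
      requests:
      - name: gpu
        exactly:
          deviceClassName: gpu.nvidia.com   # assumed: DeviceClass from the NVIDIA DRA driver
          count: 1
---
apiVersion: v1
kind: Pod
metadata:
  name: dra-gpu-test            # hypothetical name
spec:
  restartPolicy: Never
  containers:
  - name: cuda
    image: nvidia/cuda:12.4.1-base-ubuntu22.04   # assumed image tag
    command: ["nvidia-smi", "-L"]
    resources:
      claims:
      - name: gpu               # refers to the entry in spec.resourceClaims below
  resourceClaims:
  - name: gpu
    resourceClaimTemplateName: single-gpu
```

Each Pod created from this manifest gets its own ResourceClaim generated from the template, which is what distinguishes it from the shared-claim case, where two Pods reference one pre-created ResourceClaim.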

Note that Nova also respects workload constraints such as nodeSelector, which can be used to target specific GPU characteristics.
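
A nodeSelector constraint of this kind could look like the following fragment. The label key nvidia.com/gpu.product is an assumption here: it is published by NVIDIA GPU Feature Discovery when the GPU Operator is installed, and the value depends on the hardware present in the workload cluster.

```yaml
# Pod spec fragment with hypothetical values.
spec:
  nodeSelector:
    nvidia.com/gpu.product: Tesla-T4   # assumed label value; verify with: kubectl get nodes --show-labels
  containers:
  - name: cuda
    image: nvidia/cuda:12.4.1-base-ubuntu22.04   # assumed image tag
    resources:
      limits:
        nvidia.com/gpu: 1
```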

Considerations

GPU placement depends on the workload clusters being prepared to run GPU workloads. This includes:

  • GPU-enabled nodes
  • Appropriate GPU drivers
  • GPU operators, such as the NVIDIA GPU Operator, where applicable
  • Accurate resource requests in workload manifests

Available GPU resources can be viewed through the Nova cluster inventory, for example by using:

kubectl --context=nova get clusters -o wide

The output lists the available CPU, memory, and GPU resources for each workload cluster:

NAME    K8S-VERSION   K8S-CLUSTER    NOVA-CREATED   PROVIDER   REGION   ZONE       AVAIL-CPU   AVAIL-MEM     AVAIL-NVIDIAGPU   AVAIL-AMDGPU   READY   IDLE    STANDBY
wlc-1   1.35          worklc-12232   false          azure      eastus   eastus-2   16019m      102957284Ki   3                 0              True    False   False
wlc-2   1.35          worklc-30337   false          azure      eastus   eastus-2   12516m      91274704Ki    3                 0              True    False   False