Kubernetes GPU Autoscaling: How To Scale GPU Workloads With CAST AI

What's this blog post about?

The managed Kubernetes solutions from major cloud providers like AWS, Google Cloud Platform, and Azure usually include capabilities to autoscale GPU node pools. However, Kubernetes GPU autoscaling quickly gets tricky: GPU node pools must be configured manually, and idle nodes can linger, driving up cluster costs. CAST AI's autoscaling and bin packing engine provisions GPU instances on demand and downscales them when they're no longer needed, taking advantage of spot instances and their price benefits to drive costs down further. Currently, CAST AI supports GPU workloads on Amazon Elastic Kubernetes Service (EKS) and Google Kubernetes Engine (GKE), with support for Azure Kubernetes Service (AKS) coming soon.
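For context on what triggers GPU autoscaling: a Kubernetes workload requests GPUs through the standard `nvidia.com/gpu` resource limit, and when no existing node can satisfy the request, an autoscaler (such as CAST AI's) provisions a matching GPU node. A minimal illustrative pod spec follows; the pod name and image are placeholders, not taken from the post:

```yaml
# Illustrative only: pod and image names are hypothetical.
# The nvidia.com/gpu limit is the standard way to request GPUs in
# Kubernetes; a pending pod with this limit is what an autoscaler
# reacts to by provisioning a GPU node.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-workload            # hypothetical name
spec:
  restartPolicy: Never
  containers:
    - name: trainer
      image: nvcr.io/nvidia/cuda:12.2.0-base-ubuntu22.04
      command: ["nvidia-smi"]   # prints visible GPUs, then exits
      resources:
        limits:
          nvidia.com/gpu: 1     # GPUs are requested via limits and
                                # cannot be overcommitted
```

Note that GPU resources can only be specified in `limits`; Kubernetes treats the limit as the request for extended resources like GPUs.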

Company
Cast AI

Date published
Aug. 31, 2023

Author(s)
Valdas Rakutis

Word count
1665

Language
English

Hacker News points
None found.


By Matt Makai. 2021-2024.