Breakouts
BRK2-131 • Kubernetes, Cloud Runtimes • Technical
Build beyond the CPU: Native AI and agentic autoscaling on GKE
Location: Jasmine A
Time: 1:45 PM - 2:30 PM
Standard autoscaling based on CPU and memory metrics often forces a choice: either over-provision resources or risk latency spikes during demand surges. Join this session for an introduction to intent-based autoscaling in Google Kubernetes Engine (GKE), a paradigm that adds automated rightsizing and custom workload metrics as native capabilities. Learn to architect native scaling based on application metrics such as queue depths, agent tool use, or latency histograms, eliminating the need for external adapters or sidecars. Discover how to move beyond generic resource limits to create a fluid infrastructure that reacts to real-time AI demand.