Breakouts
BRK2-120 • Kubernetes, Cloud Runtimes • Technical
How OpenAI builds Kubernetes GPU clusters
Mandalay Bay H
9:45 AM - 10:30 AM
AI model producers are pushing Kubernetes to unprecedented scales. Join us to learn how OpenAI uses Google Cloud’s accelerator infrastructure for complex, multi-node inference. We’ll dive into building and maintaining massive clusters using the latest NVIDIA GB200 and GB300 GPUs, and cover critical concepts like NVLink domains, RDMA over Converged Ethernet (RoCE) networking, and topology-aware scheduling. Get battle-tested tactics for handling node failures and maximizing uptime directly from the teams operating the world’s largest AI workloads.