Cancelled
CXL-DEVLT-219 • App Dev, Cost Optimization, Gemini Enterprise Agent Platform • Technical
GPU virtualization and orchestration for heterogeneous LLM workloads in Kubernetes
LLM pipelines on Kubernetes often waste GPU capacity or contend for it. This session explores GPU virtualization techniques that raise utilization: NVIDIA Multi-Instance GPU (MIG), software-based GPU sharing, and orchestration strategies built on the NVIDIA GPU Operator and custom schedulers. We demonstrate dynamic MIG provisioning, pod optimization, and QoS enforcement. Learn to design scalable, multi-tenant clusters that maximize throughput and cut costs for diverse production LLM workloads.