Breakouts
BRK1-069 • Compute, Open Models • Introductory
Scaling open LLMs: Inside Spotify’s LLM backend on Google Cloud TPUs and GPUs
Location: Mandalay Bay F
Time: 2:45 PM - 3:30 PM
Deploy open models like the new Gemma 4 on Tensor Processing Units (TPUs) using vLLM, SGLang, or your custom inference stack. Join us to learn how Spotify evaluates models for both quality and performance across a range of GPUs and TPUs to deliver amazing experiences to hundreds of millions of users daily. In this session, we’ll focus on how TPUs are becoming easier than ever to add to your existing hardware portfolio, enabling seamless deployment of PyTorch and JAX models on Trillium and Ironwood TPUs.
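As a taste of the deployment workflow the session covers, here is a minimal sketch of querying a model served with vLLM's OpenAI-compatible server (started separately, e.g. with `vllm serve <model>`). The endpoint, port, and the `google/gemma-2-2b-it` model id are illustrative assumptions, not details from the session:

```python
import json


def chat_request(model: str, prompt: str, max_tokens: int = 64) -> bytes:
    """Build an OpenAI-compatible /v1/chat/completions request body,
    the API that vLLM's server exposes regardless of whether the
    backend hardware is a GPU or a TPU."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }).encode("utf-8")


if __name__ == "__main__":
    import urllib.request
    # Assumes a server was started locally first, e.g.:
    #   vllm serve google/gemma-2-2b-it   # model id illustrative
    req = urllib.request.Request(
        "http://localhost:8000/v1/chat/completions",
        data=chat_request("google/gemma-2-2b-it", "Name three TPU generations."),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

Because the serving API stays the same across backends, swapping a GPU deployment for a Trillium or Ironwood TPU deployment is a change on the server side, not in client code like this.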