Breakouts
BRK1-069 • Compute, Open Models • Introductory
Scaling open LLMs: Inside Spotify’s LLM backend on Google Cloud TPUs and GPUs
Location: Mandalay Bay F
Time: 2:45 PM - 3:30 PM
Deploy open models like the new Gemma 4 on Tensor Processing Units (TPUs) using vLLM, SGLang, or your custom inference stack. Join us to learn how Spotify evaluates models for both quality and performance across a range of GPUs and TPUs to deliver amazing experiences to hundreds of millions of users daily. In this session, we’ll focus on how TPUs are becoming easier than ever to add to your existing hardware portfolio, enabling seamless deployment of PyTorch and JAX models on Trillium and Ironwood TPUs.
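As a taste of the deployment workflow the session covers, here is a minimal sketch of querying a model served with vLLM's OpenAI-compatible server (started separately, e.g. with `vllm serve <model>`). The endpoint, port, and the `google/gemma-2-2b-it` model id are illustrative assumptions, not details from the session:

```python
import json


def chat_request(model: str, prompt: str, max_tokens: int = 64) -> bytes:
    """Build an OpenAI-compatible /v1/chat/completions request body,
    the API that vLLM's server exposes regardless of whether the
    backend hardware is a GPU or a TPU."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }).encode("utf-8")


if __name__ == "__main__":
    import urllib.request
    # Assumes a server was started locally first, e.g.:
    #   vllm serve google/gemma-2-2b-it   # model id illustrative
    req = urllib.request.Request(
        "http://localhost:8000/v1/chat/completions",
        data=chat_request("google/gemma-2-2b-it", "Name three TPU generations."),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

Because the serving API stays the same across backends, swapping a GPU deployment for a Trillium or Ironwood TPU deployment is a change on the server side, not in client code like this.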