Session Details: Google Cloud Next 2026


Day 1 – April 22, 2026

Lightning Talks

DEVLT-224 • Technical

Building an Agentic Platform using Ray Serve LLM and vLLM on GKE

Developer Theater

4:00 PM - 4:25 PM

Discover how to deploy a Qwen model on Google Kubernetes Engine (GKE) using Ray Serve and vLLM for high-throughput, low-latency inference. This session walks through integrating an ADK agent for chat and tool use, running on TPU-enabled nodes for intensive workloads. Explore Ray-native features for autoscaling and fault tolerance, and leave with a blueprint for turning LLMs into dynamic "agentic" systems, a key requirement for enterprises building next-generation AI applications.

