Meetup · Jaipur, India

Stop the GPU Madness! Making LLM Inference Actually Efficient on K8s

AWS User Group Jaipur

LLM · Kubernetes · GPU · Inference · AWS

Abstract

A meetup talk for AWS User Group Jaipur, held in the main auditorium at RIC Jaipur, on running LLM inference workloads on Kubernetes without burning through GPU budgets.
