Poutine: Vision-Language-Trajectory Pre-Training and Reinforcement Learning Post-Training Enable Robust End-to-End Autonomous Driving
Luke Rowe, Rodrigue de Schaetzen, Roger Girgis, Christopher Pal, Liam Paull

TL;DR
This paper introduces Poutine, a scalable recipe that combines vision-language-trajectory (VLT) pre-training with reinforcement learning post-training to enable robust end-to-end autonomous driving without additional model components. It placed 1st in the Waymo Challenge with an RFS of 7.99.
Contribution
The work demonstrates that large vision-language models can be effectively adapted for autonomous driving through simple pretraining and lightweight RL fine-tuning, eliminating the need for handcrafted tokenizers or complex architectures.
Findings
Achieved 1st place in the Waymo Challenge with an RFS of 7.99
Scalable vision-language-trajectory (VLT) pre-training improves driving robustness
Lightweight RL fine-tuning enhances performance in long-tail scenarios
Abstract
Maintaining good driving behavior in out-of-distribution scenarios remains a critical challenge in autonomous driving. A promising direction is to leverage the generalist knowledge and reasoning capabilities of large language models by treating unusual driving scenarios as a logical reasoning task. In this work, we present Poutine, a method that uses an off-the-shelf 3B-parameter vision-language model (VLM) - without any additional components - to achieve robust end-to-end autonomous driving via a simple and scalable training recipe. To learn strong base driving capabilities, we first train Poutine-Base using self-supervised next-token prediction over vision, language, and trajectory (VLT) tokens, leveraging both nominal and long-tail driving data. In the second stage, we fine-tune Poutine-Base using Group Relative Policy Optimization (GRPO) with a small set of human preference-labeled…
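The second stage above names GRPO. As a rough illustration of the group-relative idea only, the sketch below normalizes the rewards of several sampled outputs against their own group's mean and standard deviation, following the commonly published GRPO formulation. It is not Poutine's implementation: the function name, the use of population standard deviation, the eps value, and the example reward values are all assumptions made for this sketch.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own group's mean and std.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not rewards:
        raise ValueError("rewards must be non-empty")
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; an assumption of this sketch
    # eps keeps the division finite when every reward in the group is equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical scores for four trajectories sampled for the same scene.
example_rewards = [2.0, 4.0, 4.0, 6.0]
example_advantages = group_relative_advantages(example_rewards)
```

Outputs that score above their group's average receive a positive advantage and those below receive a negative one, so no separate value network is needed to form a baseline.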
Taxonomy
Topics: Robotic Path Planning Algorithms · Advanced Neural Network Applications · Vehicle License Plate Recognition
Methods: Balanced Selection
