AutoVLA: A Vision-Language-Action Model for End-to-End Autonomous Driving with Adaptive Reasoning and Reinforcement Fine-Tuning
Zewei Zhou, Tianhui Cai, Seth Z. Zhao, Yun Zhang, Zhiyu Huang, Bolei Zhou, Jiaqi Ma

TL;DR
AutoVLA is an end-to-end autonomous driving model that unifies reasoning and action generation in a single autoregressive model, using adaptive reasoning and reinforcement fine-tuning to improve performance.
Contribution
The paper introduces AutoVLA, a vision-language-action model that unifies reasoning and action generation in one autoregressive model and applies reinforcement fine-tuning so that the model reasons adaptively and plans efficiently.
Findings
AutoVLA achieves competitive results on the nuPlan, nuScenes, and Waymo datasets and in the CARLA simulator.
The model demonstrates effective adaptive reasoning in diverse driving scenarios.
Reinforcement fine-tuning reduces unnecessary reasoning in simple cases.
Abstract
Recent advancements in Vision-Language-Action (VLA) models have shown promise for end-to-end autonomous driving by leveraging world knowledge and reasoning capabilities. However, current VLA models often struggle with physically infeasible action outputs, complex model structures, or unnecessarily long reasoning. In this paper, we propose AutoVLA, a novel VLA model that unifies reasoning and action generation within a single autoregressive generation model for end-to-end autonomous driving. AutoVLA performs semantic reasoning and trajectory planning directly from raw visual inputs and language instructions. We tokenize continuous trajectories into discrete, feasible actions, enabling direct integration into the language model. For training, we employ supervised fine-tuning to equip the model with dual thinking modes: fast thinking (trajectory-only) and slow thinking (enhanced with…
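The abstract states that continuous trajectories are tokenized into discrete, feasible actions but does not give the details here. The following is a minimal sketch of one way such a tokenizer could look, assuming a small hand-written codebook of motion primitives and nearest-neighbor matching. The names `CODEBOOK`, `tokenize`, `detokenize`, and `step`, and all numeric values, are hypothetical and not taken from the paper.

```python
import math

# Hypothetical codebook: each token is one motion primitive for a fixed time
# step, given as (forward distance in m, heading change in rad).
# The values are illustrative only, not the paper's action vocabulary.
CODEBOOK = [
    (d, w)
    for d in (0.0, 1.0, 2.0, 3.0)
    for w in (-0.2, -0.1, 0.0, 0.1, 0.2)
]


def wrap(angle):
    """Wrap an angle to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def step(pose, token):
    """Apply one motion primitive to an (x, y, heading) pose."""
    x, y, h = pose
    d, w = CODEBOOK[token]
    h = h + w
    return (x + d * math.cos(h), y + d * math.sin(h), h)


def tokenize(waypoints):
    """Map a list of (x, y, heading) poses to nearest-codebook token ids."""
    tokens = []
    pose = waypoints[0]
    for tx, ty, th in waypoints[1:]:
        # Measure the motion from the reconstructed pose rather than the
        # previous ground-truth pose, so quantization error does not pile up.
        dist = math.hypot(tx - pose[0], ty - pose[1])
        dh = wrap(th - pose[2])
        # Unweighted squared error over metres and radians: a simplification.
        token = min(
            range(len(CODEBOOK)),
            key=lambda i: (CODEBOOK[i][0] - dist) ** 2 + (CODEBOOK[i][1] - dh) ** 2,
        )
        tokens.append(token)
        pose = step(pose, token)
    return tokens


def detokenize(start, tokens):
    """Roll a token sequence back out into a list of poses."""
    poses = [start]
    for token in tokens:
        poses.append(step(poses[-1], token))
    return poses
```

Because every token decodes to a fixed primitive, any token sequence produced by a language model decodes to a trajectory built only from those primitives, which is the sense in which discrete actions can keep outputs feasible. In this sketch, a straight path such as `[(0, 0, 0), (2, 0, 0), (4, 0, 0)]` maps to the same token twice, and `detokenize` recovers the original poses.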
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Robotic Path Planning Algorithms · Reinforcement Learning in Robotics
Methods: Proximal Policy Optimization · CARLA: An Open Urban Driving Simulator
