X-Driver: Explainable Autonomous Driving with Vision-Language Models
Wei Liu, Jiyuan Zhang, Binxiong Zheng, Yufeng Hu, Yingzhan Lin, Zengfeng Zeng

TL;DR
X-Driver is a multi-modal large language model framework for autonomous driving that improves closed-loop performance and interpretability. On simulation benchmarks, it surpasses current state-of-the-art methods.
Contribution
The paper presents X-Driver, a multi-modal large language model framework that uses Chain-of-Thought reasoning and autoregressive modeling to enhance perception and decision-making in closed-loop autonomous driving.
Findings
X-Driver outperforms existing methods in closed-loop driving tasks.
X-Driver improves interpretability of driving decisions.
X-Driver is validated on public benchmarks in the CARLA simulator, including Bench2Drive, where it surpasses prior state-of-the-art methods.
Abstract
End-to-end autonomous driving has advanced significantly, offering benefits such as system simplicity and stronger driving performance than conventional pipelines in both open-loop and closed-loop settings. However, existing frameworks still suffer from low success rates in closed-loop evaluations, highlighting their limitations in real-world deployment. In this paper, we introduce X-Driver, a unified multi-modal large language model (MLLM) framework designed for closed-loop autonomous driving, leveraging Chain-of-Thought (CoT) and autoregressive modeling to enhance perception and decision-making. We validate X-Driver across multiple autonomous driving tasks using public benchmarks in the CARLA simulation environment, including Bench2Drive [6]. Our experimental results demonstrate superior closed-loop performance, surpassing the current state-of-the-art (SOTA) while improving the…
Taxonomy
Topics: Multimodal Machine Learning Applications · Autonomous Vehicle Technology and Safety · Explainable Artificial Intelligence (XAI)
Methods: Entropy Regularization · Proximal Policy Optimization · CARLA: An Open Urban Driving Simulator
