MindDrive: An All-in-One Framework Bridging World Models and Vision-Language Model for End-to-End Autonomous Driving
Bin Sun, Yaoguang Cao, Yan Wang, Rui Wang, Jiachen Shang, Xiejie Feng, Jiayi Lu, Jia Shi, Shichun Yang, Xiaoyu Yan, Ziying Song

TL;DR
MindDrive is a comprehensive autonomous driving framework that combines trajectory generation with decision reasoning, utilizing world models and vision-language models to improve safety, efficiency, and interpretability in complex driving scenarios.
Contribution
The paper introduces MindDrive, integrating a future-aware trajectory generator with a vision-language evaluator for end-to-end autonomous driving, enabling multi-objective decision making and improved performance.
Findings
Achieves state-of-the-art results on NAVSIM benchmarks.
Enhances safety, compliance, and generalization in autonomous driving.
Provides an interpretable, cognitively guided driving framework.
Abstract
End-to-End autonomous driving (E2E-AD) has emerged as a new paradigm, where trajectory planning plays a crucial role. Existing studies mainly follow two directions: trajectory generation oriented, which focuses on producing high-quality trajectories with simple decision mechanisms, and trajectory selection oriented, which performs multi-dimensional evaluation to select the best trajectory yet lacks sufficient generative capability. In this work, we propose MindDrive, a harmonized framework that integrates high-quality trajectory generation with comprehensive decision reasoning. It establishes a structured reasoning paradigm of "context simulation - candidate generation - multi-objective trade-off". In particular, the proposed Future-aware Trajectory Generator (FaTG), based on a World Action Model (WaM), performs ego-conditioned "what-if" simulations to predict potential future scenes…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAutonomous Vehicle Technology and Safety · Robotic Path Planning Algorithms · Human-Automation Interaction and Safety
