ImagiDrive: A Unified Imagination-and-Planning Framework for Autonomous Driving
Jingyu Li, Bozhou Zhang, Xin Jin, Jiankang Deng, Xiatian Zhu, Li Zhang

TL;DR
ImagiDrive is an autonomous driving framework that combines vision-language models and driving world models in an integrated imagination-and-planning loop to improve scene understanding, prediction, and planning, with the goal of safer and more efficient driving.
Contribution
This paper introduces ImagiDrive, the first end-to-end system integrating Vision-Language Models (VLMs) and Driving World Models (DWMs) for unified scene imagination and planning in autonomous driving.
Findings
Outperforms previous methods on the nuScenes and NAVSIM datasets.
Demonstrates robustness in both open-loop and closed-loop evaluations.
Generates realistic future driving scenarios.
Abstract
Autonomous driving requires rich contextual comprehension and precise predictive reasoning to navigate dynamic and complex environments safely. Vision-Language Models (VLMs) and Driving World Models (DWMs) have independently emerged as powerful recipes addressing different aspects of this challenge. VLMs provide interpretability and robust action prediction through their ability to understand multi-modal context, while DWMs excel in generating detailed and plausible future driving scenarios essential for proactive planning. Integrating VLMs with DWMs is an intuitive, promising, yet understudied strategy to exploit the complementary strengths of accurate behavioral prediction and realistic scene generation. Nevertheless, this integration presents notable challenges, particularly in effectively connecting action-level decisions with high-fidelity pixel-level predictions and maintaining…
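The imagination-and-planning loop mentioned above can be pictured with a toy sketch. Everything in it is an assumption made for illustration, not the authors' implementation: `toy_planner` stands in for the VLM-based planner, `toy_world_model` stands in for the DWM, frames are reduced to short lists of numbers, and the stopping rule (stop once successive trajectories differ by less than `tol`) is a hypothetical choice that this summary does not describe.

```python
from typing import List, Tuple

Waypoint = Tuple[float, float]   # (x, y) in metres, ego frame
Trajectory = List[Waypoint]
Frame = List[float]              # hypothetical stand-in for an image


def toy_planner(frames: List[Frame], n_steps: int = 4) -> Trajectory:
    """Hypothetical stand-in for a VLM planner: frames in, waypoints out."""
    values = [v for frame in frames for v in frame]
    speed = sum(values) / len(values)  # toy rule: mean frame value = speed
    return [(speed * (k + 1), 0.0) for k in range(n_steps)]


def toy_world_model(observed: List[Frame], trajectory: Trajectory) -> List[Frame]:
    """Hypothetical stand-in for a DWM: one imagined frame per waypoint,
    conditioned on the last observed frame and the planned step length."""
    last = observed[-1]
    imagined = []
    prev_x = 0.0
    for x, _ in trajectory:
        step = x - prev_x
        imagined.append([v + 0.1 * step for v in last])
        prev_x = x
    return imagined


def imagine_and_plan(observed: List[Frame], max_rounds: int = 5,
                     tol: float = 1e-2) -> Tuple[Trajectory, int]:
    """Alternate imagining and re-planning until the plan stabilises."""
    trajectory = toy_planner(observed)
    rounds = 0
    for _ in range(max_rounds):
        imagined = toy_world_model(observed, trajectory)    # act -> imagine
        new_trajectory = toy_planner(observed + imagined)   # imagine -> re-plan
        rounds += 1
        change = max(abs(a[0] - b[0]) + abs(a[1] - b[1])
                     for a, b in zip(trajectory, new_trajectory))
        trajectory = new_trajectory
        if change < tol:  # assumed stopping rule, not taken from the paper
            break
    return trajectory, rounds


trajectory, rounds = imagine_and_plan([[2.0]])
print(rounds, trajectory)
```

The point of the sketch is only the control flow: an action-level plan conditions the imagined future, and the imagined future is fed back to the planner as extra context. In a real system the two stubs would be large learned models operating on camera images.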
Taxonomy
Topics: Multimodal Machine Learning Applications · Autonomous Vehicle Technology and Safety · Robotic Path Planning Algorithms
