EponaV2: Driving World Model with Comprehensive Future Reasoning
Jiawei Xu, Zhizhou Zhong, Zhijian Shu, Mingkai Jia, Mingxiao Li, Jia-Wang Bian, Qian Zhang, Kaicheng Zhang, Jin Xie, Jian Yang, Wei Yin

TL;DR
EponaV2 is a novel driving world model that forecasts comprehensive future scene representations, enhancing real-world reasoning and trajectory planning without relying on manual annotations.
Contribution
EponaV2 introduces a paradigm that predicts future 3D geometry and semantics, inspired by how human drivers anticipate the scene ahead, and employs flow matching policy optimization to improve planning.
Findings
Achieves state-of-the-art performance on NAVSIM benchmarks.
Improves trajectory planning by forecasting future geometry and semantics rather than next-frame images alone.
Demonstrates effective perception-free driving world modeling.
Abstract
Data scaling plays a pivotal role in the pursuit of general intelligence. However, the prevailing perception-planning paradigm in autonomous driving relies heavily on expensive manual annotations to supervise trajectory planning, which severely limits its scalability. Conversely, although existing perception-free driving world models achieve impressive driving performance, their real-world reasoning ability for planning is built solely on next-frame image forecasting. Lacking sufficient supervision, these models often struggle with comprehensive scene understanding, resulting in unsatisfactory trajectory planning. In this paper, we propose EponaV2, a novel paradigm of driving world models, which achieves high-quality planning with comprehensive future reasoning. Inspired by how human drivers anticipate 3D geometry and semantics, we train our model to forecast more comprehensive…
