EponaV2: Driving World Model with Comprehensive Future Reasoning

Jiawei Xu; Zhizhou Zhong; Zhijian Shu; Mingkai Jia; Mingxiao Li; Jia-Wang Bian; Qian Zhang; Kaicheng Zhang; Jin Xie; Jian Yang; Wei Yin

arXiv:2605.14696 · cs.CV · May 15, 2026

TL;DR

EponaV2 is a novel driving world model that forecasts comprehensive future scene representations, enhancing real-world reasoning and trajectory planning without relying on manual annotations.

Contribution

EponaV2 introduces a paradigm that predicts detailed future geometry and semantics, inspired by how human drivers anticipate a scene, and employs flow matching policy optimization to improve planning.

Findings

01. Achieves state-of-the-art performance on NAVSIM benchmarks.

02. Improves trajectory planning accuracy by forecasting comprehensive future scene data.

03. Demonstrates effective perception-free driving world modeling.

Abstract

Data scaling plays a pivotal role in the pursuit of general intelligence. However, the prevailing perception-planning paradigm in autonomous driving relies heavily on expensive manual annotations to supervise trajectory planning, which severely limits its scalability. Conversely, although existing perception-free driving world models achieve impressive driving performance, their real-world reasoning ability for planning is built solely on next-frame image forecasting. Lacking sufficient supervision, these models often struggle with comprehensive scene understanding, resulting in unsatisfactory trajectory planning. In this paper, we propose EponaV2, a novel paradigm of driving world models, which achieves high-quality planning with comprehensive future reasoning. Inspired by how human drivers anticipate 3D geometry and semantics, we train our model to forecast more comprehensive…
