ExploreVLA: Dense World Modeling and Exploration for End-to-End Autonomous Driving

Zihao Sheng; Xin Ye; Jingru Luo; Sikai Chen; Liu Ren

arXiv:2604.02714·cs.CV·April 6, 2026

TL;DR

ExploreVLA is a unified framework that combines world modeling and exploration for end-to-end autonomous driving, using dense supervision and intrinsic rewards to take policy learning beyond imitation.

Contribution

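The paper proposes a unified understanding-and-generation framework in which a learned world model both provides dense supervision and supplies the exploration signal, so that the driving policy can improve beyond the expert demonstrations it was cloned from.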

Findings

01 Achieved state-of-the-art scores on the NAVSIM benchmark.

02 Demonstrated effective exploration driven by image prediction uncertainty.

03 Enhanced visual and geometric representations for planning.

Abstract

End-to-end autonomous driving models based on Vision-Language-Action (VLA) architectures have shown promising results by learning driving policies through behavior cloning on expert demonstrations. However, imitation learning inherently limits the model to replicating observed behaviors without exploring diverse driving strategies, leaving it brittle in novel or out-of-distribution scenarios. Reinforcement learning (RL) offers a natural remedy by enabling policy exploration beyond the expert distribution. Yet VLA models, typically trained on offline datasets, lack directly observable state transitions, necessitating a learned world model to anticipate action consequences. In this work, we propose a unified understanding-and-generation framework that leverages world modeling to simultaneously enable meaningful exploration and provide dense supervision. Specifically, we augment trajectory…
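To make the idea of uncertainty-driven exploration concrete, here is a minimal sketch of an intrinsic reward computed from a world model's predicted image tokens. It is an illustration under stated assumptions, not the paper's method: the abstract is truncated before the details, so the choice of Shannon entropy as the uncertainty measure, the function names (`token_entropy`, `intrinsic_reward`, `total_reward`), and the weighting parameter `beta` are all hypothetical.

```python
import math
from typing import Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one categorical distribution."""
    # Zero-probability entries contribute nothing and are skipped to avoid log(0).
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def intrinsic_reward(pred_dists: Sequence[Sequence[float]]) -> float:
    """Mean entropy over predicted image-token distributions.

    Assumption: higher entropy means the world model is less sure about
    the future frame, which is treated here as a proxy for novelty.
    """
    if not pred_dists:
        raise ValueError("need at least one predicted token distribution")
    return sum(token_entropy(d) for d in pred_dists) / len(pred_dists)


def total_reward(extrinsic: float,
                 pred_dists: Sequence[Sequence[float]],
                 beta: float = 0.1) -> float:
    """Combine a task reward with the uncertainty bonus (beta is hypothetical)."""
    return extrinsic + beta * intrinsic_reward(pred_dists)


# A confident prediction earns no bonus; a maximally uncertain one earns the most.
confident = [[1.0, 0.0], [1.0, 0.0]]
uncertain = [[0.5, 0.5], [0.5, 0.5]]
bonus_confident = intrinsic_reward(confident)
bonus_uncertain = intrinsic_reward(uncertain)
```

In this sketch a trajectory whose predicted future is ambiguous receives a larger bonus, which nudges the policy toward states the world model has not yet learned to predict. How the actual system measures uncertainty and balances it against driving-quality rewards is not recoverable from the text above.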

Code & Models

Repositories

https://zihaosheng.github.io/ExploreVLA
github
