Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability
Shenyuan Gao, Jiazhi Yang, Li Chen, Kashyap Chitta, Yihang Qiu, Andreas Geiger, Jun Zhang, Hongyang Li

TL;DR
Vista is a novel driving world model that achieves high fidelity, generalizes well to unseen environments, and offers versatile controllability for autonomous driving applications.
Contribution
The paper introduces Vista, a driving world model that combines two new losses targeting moving instances and structural information, a latent replacement approach that injects historical frames as priors, and a versatile set of action controls, improving prediction fidelity and action controllability.
Findings
Vista outperforms state-of-the-art general-purpose video generators in over 70% of comparisons.
Vista surpasses the best-performing driving world models by 55% in FID and 27% in FVD.
Demonstrates effective generalization across multiple datasets.
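FID and FVD are distances where lower is better, so a figure such as "55% in FID" is most naturally read as a relative reduction against the baseline score. The sketch below shows that arithmetic under this assumption; the function name and the example scores are hypothetical and are not taken from the paper.

```python
def relative_reduction(baseline: float, ours: float) -> float:
    """Percentage by which `ours` is lower than `baseline`.

    Assumes a lower-is-better metric such as FID or FVD.
    """
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (baseline - ours) / baseline


# Hypothetical scores chosen only to illustrate the calculation.
print(relative_reduction(20.0, 9.0))  # 55.0
```

Under this reading, a 55% improvement means the new score is a little under half of the baseline score.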
Abstract
World models can foresee the outcomes of different actions, which is of paramount importance for autonomous driving. Nevertheless, existing driving world models still have limitations in generalization to unseen environments, prediction fidelity of critical details, and action controllability for flexible application. In this paper, we present Vista, a generalizable driving world model with high fidelity and versatile controllability. Based on a systematic diagnosis of existing methods, we introduce several key ingredients to address these limitations. To accurately predict real-world dynamics at high resolution, we propose two novel losses to promote the learning of moving instances and structural information. We also devise an effective latent replacement approach to inject historical frames as priors for coherent long-horizon rollouts. For action controllability, we incorporate a…
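The latent replacement idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each video clip is represented as a stack of per-frame latents and that the latents of known historical frames simply overwrite the corresponding noisy latents before denoising. The function name, array shapes, and returned mask are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def replace_history_latents(noisy_latents, history_latents):
    """Overwrite the first k noisy frame latents with clean history latents.

    noisy_latents:   array of shape (T, C, H, W), one latent per frame.
    history_latents: array of shape (k, C, H, W) with k <= T.
    Returns the combined latents and a boolean mask of length T that
    marks which frames are conditioning (history) frames.
    """
    num_frames = noisy_latents.shape[0]
    k = history_latents.shape[0]
    if k > num_frames:
        raise ValueError("more history frames than frames in the clip")
    if history_latents.shape[1:] != noisy_latents.shape[1:]:
        raise ValueError("per-frame latent shapes must match")

    combined = noisy_latents.copy()   # leave the input untouched
    combined[:k] = history_latents    # inject known frames as priors

    is_history = np.zeros(num_frames, dtype=bool)
    is_history[:k] = True
    return combined, is_history


# Hypothetical sizes: 5 frames, 4 latent channels, 2x2 spatial grid.
rng = np.random.default_rng(0)
noisy = rng.standard_normal((5, 4, 2, 2))
history = np.zeros((2, 4, 2, 2))
latents, mask = replace_history_latents(noisy, history)
print(mask)
```

In a long-horizon rollout under this assumption, the last frames of one predicted clip would be re-encoded and passed as `history_latents` for the next clip, which is one way the priors could keep consecutive clips coherent.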
Taxonomy
Topics: Autonomous Vehicle Technology and Safety · Simulation Techniques and Applications
Methods: Sparse Evolutionary Training
