DriveLaW: Unifying Planning and Video Generation in a Latent Driving World

Tianze Xia; Yongkang Li; Lijun Zhou; Jingfeng Yao; Kaixin Xiong; Haiyang Sun; Bing Wang; Kun Ma; Guang Chen; Hangjun Ye; Wenyu Liu; Xinggang Wang

arXiv:2512.23421 · cs.CV · April 20, 2026

TL;DR

DriveLaW introduces a unified framework combining high-fidelity video prediction and reliable motion planning for autonomous driving, improving both tasks through a shared latent representation.

Contribution

It proposes DriveLaW, a novel paradigm that unifies video generation and motion planning using a shared latent space, with a three-stage training strategy.

Findings

01  Surpasses prior state-of-the-art methods on the video prediction metrics FID and FVD.

02  Achieves a new record on the NAVSIM planning benchmark.

03  Demonstrates consistent and reliable trajectory planning from the latent representations of the generated video.

Abstract

World models have become crucial for autonomous driving, as they learn how scenarios evolve over time to address the long-tail challenges of the real world. However, current approaches relegate world models to limited roles: they operate within ostensibly unified architectures that still keep world prediction and motion planning as decoupled processes. To bridge this gap, we propose DriveLaW, a novel paradigm that unifies video generation and motion planning. By directly injecting the latent representation from its video generator into the planner, DriveLaW ensures inherent consistency between high-fidelity future generation and reliable trajectory planning. Specifically, DriveLaW consists of two core components: DriveLaW-Video, our powerful world model that generates high-fidelity forecasting with expressive latent representations, and DriveLaW-Act, a diffusion planner that generates…
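The coupling described in the abstract, where the planner consumes the video generator's latent directly, can be illustrated with a toy sketch. Everything below is hypothetical and is not the authors' implementation: `toy_video_latent` and `toy_planner` are made-up names, the "latent" is just a per-frame mean, and the refinement loop only mimics the shape of an iterative denoising process rather than being a trained diffusion planner.

```python
import random


def toy_video_latent(frames):
    """Hypothetical stand-in for a video generator's latent.

    Each frame is a list of pixel values; the "latent" is the per-frame mean.
    """
    return [sum(frame) / len(frame) for frame in frames]


def toy_planner(latent, horizon=4, steps=10, seed=0):
    """Toy iterative-refinement planner conditioned on the shared latent.

    Starts from random noise and repeatedly moves each waypoint halfway
    toward a target derived from the latent. This is an assumption made
    for illustration, not a real diffusion model.
    """
    rng = random.Random(seed)
    # Conditioning signal (assumption): a target speed taken from the latent.
    speed = sum(latent) / len(latent)
    target = [speed * (t + 1) for t in range(horizon)]
    # Initial noisy 1-D trajectory, one value per future step.
    trajectory = [rng.gauss(0.0, 1.0) for _ in range(horizon)]
    for _ in range(steps):
        trajectory = [x + 0.5 * (g - x) for x, g in zip(trajectory, target)]
    return trajectory


frames = [[0.2, 0.4], [0.4, 0.6], [0.6, 0.8]]
latent = toy_video_latent(frames)  # the same latent would also drive frame decoding
plan = toy_planner(latent)
```

The point of the sketch is the data flow: the planner never sees raw frames, only the latent that the generator produced, so the plan and the predicted future are derived from one shared representation.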

Code & Models

Models

🤗 tz2026/DriveLaW (model)
