RAD-2: Scaling Reinforcement Learning in a Generator-Discriminator Framework

Hao Gao; Shaoyu Chen; Yifan Zhu; Yuehao Song; Wenyu Liu; Qian Zhang; Xinggang Wang

arXiv:2604.15308·cs.CV·April 17, 2026

TL;DR

RAD-2 introduces a generator-discriminator framework combining diffusion models and reinforcement learning to improve autonomous driving planning, achieving higher safety and smoother driving in urban environments.

Contribution

The paper presents a novel unified generator-discriminator approach with new RL techniques and a high-throughput simulation environment for scalable, stable closed-loop autonomous driving planning.

Findings

01. Reduced collision rate by 56% compared to diffusion-based planners

02. Improved perceived safety and driving smoothness in real-world urban traffic

03. Enhanced training stability through decoupled trajectory generation and evaluation

Abstract

High-level autonomous driving requires motion planners capable of modeling multimodal future uncertainties while remaining robust in closed-loop interactions. Although diffusion-based planners are effective at modeling complex trajectory distributions, they often suffer from stochastic instabilities and the lack of corrective negative feedback when trained purely with imitation learning. To address these issues, we propose RAD-2, a unified generator-discriminator framework for closed-loop planning. Specifically, a diffusion-based generator is used to produce diverse trajectory candidates, while an RL-optimized discriminator reranks these candidates according to their long-term driving quality. This decoupled design avoids directly applying sparse scalar rewards to the full high-dimensional trajectory space, thereby improving optimization stability. To further enhance reinforcement…
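
The generate-then-rerank loop described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `sample_candidates` is a hypothetical random-walk stand-in for the diffusion-based generator, and `score_trajectory` is a hypothetical hand-written score (progress, smoothness, collision penalty) standing in for the RL-optimized discriminator. All names, weights, and the obstacle setup are invented for illustration.

```python
import math
import random


def sample_candidates(rng, num_candidates, horizon):
    """Hypothetical stand-in for the diffusion generator: random-walk 2D paths."""
    candidates = []
    for _ in range(num_candidates):
        x, y, path = 0.0, 0.0, []
        for _ in range(horizon):
            x += rng.gauss(1.0, 0.1)  # forward progress per step
            y += rng.gauss(0.0, 0.3)  # lateral drift per step
            path.append((x, y))
        candidates.append(path)
    return candidates


def score_trajectory(path, obstacle, safe_radius=1.0):
    """Hypothetical stand-in for the learned discriminator: a hand-written score."""
    # Penalize any path that comes within safe_radius of the obstacle.
    min_dist = min(math.dist(p, obstacle) for p in path)
    collision_penalty = -10.0 if min_dist < safe_radius else 0.0
    # Reward forward progress at the final waypoint.
    progress = path[-1][0]
    # Sum of changes between consecutive step vectors (a rough acceleration proxy).
    jerkiness = sum(
        math.dist(
            (path[i + 1][0] - path[i][0], path[i + 1][1] - path[i][1]),
            (path[i][0] - path[i - 1][0], path[i][1] - path[i - 1][1]),
        )
        for i in range(1, len(path) - 1)
    )
    return progress - 0.5 * jerkiness + collision_penalty


def plan(rng, obstacle, num_candidates=32, horizon=8):
    """Generate candidates, score each one, and return the top-ranked path."""
    candidates = sample_candidates(rng, num_candidates, horizon)
    scores = [score_trajectory(c, obstacle) for c in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return candidates[best], scores


rng = random.Random(0)
best_path, scores = plan(rng, obstacle=(4.0, 0.0))
```

The point of the sketch is the separation of roles: the generator only proposes trajectories, and the scalar score is used only to rank whole candidates rather than to shape each waypoint directly.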

Code & Models

Repositories

hustvl/RAD (GitHub)