TL;DR
RAD-2 is a generator-discriminator framework that combines a diffusion model with reinforcement learning for autonomous driving planning. It reports higher safety and smoother driving in urban environments.
Contribution
The paper presents a novel unified generator-discriminator approach with new RL techniques and a high-throughput simulation environment for scalable, stable closed-loop autonomous driving planning.
Findings
Reduced collision rate by 56% compared to diffusion-based planners
Improved perceived safety and driving smoothness in real-world urban traffic
Enhanced training stability through decoupled trajectory generation and evaluation
Abstract
High-level autonomous driving requires motion planners capable of modeling multimodal future uncertainties while remaining robust in closed-loop interactions. Although diffusion-based planners are effective at modeling complex trajectory distributions, they often suffer from stochastic instabilities and the lack of corrective negative feedback when trained purely with imitation learning. To address these issues, we propose RAD-2, a unified generator-discriminator framework for closed-loop planning. Specifically, a diffusion-based generator is used to produce diverse trajectory candidates, while an RL-optimized discriminator reranks these candidates according to their long-term driving quality. This decoupled design avoids directly applying sparse scalar rewards to the full high-dimensional trajectory space, thereby improving optimization stability. To further enhance reinforcement…
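The generate-then-rerank pattern described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: `sample_candidates` is a hypothetical stand-in for the diffusion-based generator (here just a random walk), and `toy_score` is a hand-written stand-in for the RL-optimized discriminator. All names, the waypoint format, and the scoring heuristic are assumptions made for illustration only.

```python
import random
from typing import Callable, List, Tuple

# Assumption: a trajectory is a list of (x, y) waypoints in the ego frame.
Trajectory = List[Tuple[float, float]]


def sample_candidates(num_candidates: int, horizon: int,
                      rng: random.Random) -> List[Trajectory]:
    """Hypothetical stand-in for the generator: random lateral drift
    around a straight path. A real generator would be a diffusion model."""
    candidates = []
    for _ in range(num_candidates):
        y = 0.0
        trajectory = []
        for step in range(1, horizon + 1):
            y += rng.gauss(0.0, 0.2)  # accumulate small lateral noise
            trajectory.append((float(step), y))
        candidates.append(trajectory)
    return candidates


def toy_score(trajectory: Trajectory) -> float:
    """Hypothetical stand-in for the discriminator: higher is better.
    Penalizes lateral deviation and step-to-step roughness."""
    deviation = sum(abs(y) for _, y in trajectory)
    roughness = sum(abs(trajectory[i + 1][1] - trajectory[i][1])
                    for i in range(len(trajectory) - 1))
    return -(deviation + roughness)


def rerank(candidates: List[Trajectory],
           score_fn: Callable[[Trajectory], float]) -> List[Trajectory]:
    """Order candidates from best to worst according to score_fn."""
    return sorted(candidates, key=score_fn, reverse=True)


def plan(num_candidates: int, horizon: int, rng: random.Random,
         score_fn: Callable[[Trajectory], float] = toy_score) -> Trajectory:
    """Generate candidates, rerank them, and return the top-scoring one."""
    candidates = sample_candidates(num_candidates, horizon, rng)
    return rerank(candidates, score_fn)[0]


if __name__ == "__main__":
    best = plan(num_candidates=8, horizon=5, rng=random.Random(0))
    print(best)
```

The point of the sketch is the separation of concerns: the scorer only ever sees a small set of complete candidates and emits one scalar per candidate, so the generator is never asked to absorb a sparse scalar reward across the full trajectory space.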
