DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving

Bencheng Liao; Shaoyu Chen; Haoran Yin; Bo Jiang; Cheng Wang; Sixu Yan; Xinbang Zhang; Xiangyu Li; Ying Zhang; Qian Zhang; Xinggang Wang

arXiv:2411.15139·cs.CV·April 11, 2025

TL;DR

DiffusionDrive introduces a truncated diffusion model with multi-mode anchors and an efficient decoder, enabling real-time, diverse, and high-quality autonomous driving actions with fewer denoising steps.

Contribution

The paper proposes a novel truncated diffusion policy with multi-mode anchors and an efficient cascade decoder for end-to-end autonomous driving.

Findings

01 Achieves a 10× reduction in denoising steps compared to vanilla diffusion policies.

02 Sets a new record of 88.1 PDMS on the NAVSIM dataset.

03 Runs at 45 FPS on an NVIDIA 4090, demonstrating real-time performance.

Abstract

Recently, the diffusion model has emerged as a powerful generative technique for robotic policy learning, capable of modeling multi-mode action distributions. Leveraging its capability for end-to-end autonomous driving is a promising direction. However, the numerous denoising steps in the robotic diffusion policy and the more dynamic, open-world nature of traffic scenes pose substantial challenges for generating diverse driving actions at real-time speed. To address these challenges, we propose a novel truncated diffusion policy that incorporates prior multi-mode anchors and truncates the diffusion schedule, enabling the model to learn denoising from an anchored Gaussian distribution to the multi-mode driving action distribution. Additionally, we design an efficient cascade diffusion decoder for enhanced interaction with conditional scene context. The proposed model, DiffusionDrive,…
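The idea of starting from an anchored Gaussian distribution rather than pure noise can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `sample_anchored_gaussian`, the three toy anchor trajectories, and the value `alpha_bar=0.9` are all assumptions chosen for illustration. It applies the standard forward-diffusion formula to each anchor, but only up to a truncated noise level, so the samples stay clustered around their anchors.

```python
import numpy as np


def sample_anchored_gaussian(anchors, alpha_bar, rng):
    """Noise each anchor trajectory only up to a truncated noise level.

    anchors:   (K, T, 2) array, K anchor trajectories of T (x, y) waypoints.
    alpha_bar: cumulative noise-schedule product at the truncation step,
               in (0, 1]. Values near 1 keep samples close to their anchors.
    """
    if not 0.0 < alpha_bar <= 1.0:
        raise ValueError("alpha_bar must be in (0, 1]")
    noise = rng.standard_normal(anchors.shape)
    # Standard forward-diffusion mix of signal (the anchor) and Gaussian noise.
    return np.sqrt(alpha_bar) * anchors + np.sqrt(1.0 - alpha_bar) * noise


# Hypothetical anchors: left turn, straight, right turn (4 waypoints each).
forward = np.linspace(0.0, 6.0, 4)
anchors = np.stack([
    np.stack([lateral * forward / 6.0, forward], axis=-1)
    for lateral in (-2.0, 0.0, 2.0)
])

rng = np.random.default_rng(0)
noisy = sample_anchored_gaussian(anchors, alpha_bar=0.9, rng=rng)
print(noisy.shape)  # (3, 4, 2)
```

A learned denoiser would then refine these noisy anchors over a small number of steps. Because the starting point is already near a plausible driving mode, fewer steps should be needed than when starting from a standard Gaussian, which is the intuition behind the truncated schedule.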

Code & Models

Repositories

hustvl/diffusiondrive · PyTorch · Official

Models

🤗 hustvl/DiffusionDrive · model · ♡ 5

Taxonomy

Topics: Transportation and Mobility Innovations · Traffic control and management

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Diffusion