Imagine-2-Drive: Leveraging High-Fidelity World Models via Multi-Modal Diffusion Policies

Anant Garg; K Madhava Krishna

arXiv:2411.10171·cs.RO·March 11, 2025

Open Access

TL;DR

Imagine-2-Drive introduces a high-fidelity world model combined with a multi-modal diffusion policy for autonomous driving, significantly enhancing policy robustness and performance with minimal online interactions.

Contribution

The paper presents a novel framework integrating a diffusion-based world model with a multi-modal diffusion policy for improved autonomous driving.

Findings

1. Outperforms prior world model baselines in CARLA benchmarks.
2. Improves Route Completion by 15% and Success Rate by 20%.
3. Mitigates error accumulation with a diffusion-based approach.
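The error-accumulation finding can be illustrated with a toy sketch. It is not the paper's DiffDreamer model: the scalar dynamics, the bias value, and the function names `autoregressive_rollout`, `joint_rollout`, and `true_rollout` are all hypothetical, chosen only to show why feeding one-step predictions back into a model lets small errors stack up, while predicting every future step directly from the starting state does not.

```python
# Toy illustration only: scalar "state" that truly advances by 1.0 per step.
# The bias value and all function names are assumptions, not from the paper.

def true_rollout(x0, horizon):
    """Ground-truth future states."""
    return [x0 + (t + 1) * 1.0 for t in range(horizon)]

def autoregressive_rollout(x0, horizon, step_bias):
    """One-step model applied repeatedly: each prediction becomes the next input,
    so the per-step bias is added again at every step."""
    preds, x = [], x0
    for _ in range(horizon):
        x = x + 1.0 + step_bias
        preds.append(x)
    return preds

def joint_rollout(x0, horizon, output_bias):
    """All future states predicted directly from x0: each output carries the
    same bias once, and nothing is fed back."""
    return [x0 + (t + 1) * 1.0 + output_bias for t in range(horizon)]

horizon, bias = 20, 0.05
truth = true_rollout(0.0, horizon)
ar_err = [abs(p - q) for p, q in zip(autoregressive_rollout(0.0, horizon, bias), truth)]
joint_err = [abs(p - q) for p, q in zip(joint_rollout(0.0, horizon, bias), truth)]

print(f"final-step error, autoregressive: {ar_err[-1]:.3f}")
print(f"final-step error, joint:          {joint_err[-1]:.3f}")
```

In this sketch the autoregressive error grows linearly with the horizon, while the joint predictor's error stays at the single-output bias. Real world models have nonlinear dynamics and learned errors, so the toy shows only the mechanism, not the size of the effect.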

Abstract

World Model-based Reinforcement Learning (WMRL) enables sample-efficient policy learning by reducing the need for online interactions, which can be costly and unsafe, especially in autonomous driving. However, existing world models often suffer from low prediction fidelity and compounding one-step errors, leading to policy degradation over long horizons. Additionally, traditional RL policies, often deterministic or single-Gaussian, fail to capture the multi-modal nature of decision-making in complex driving scenarios. To address these challenges, we propose Imagine-2-Drive, a novel WMRL framework that integrates a high-fidelity world model with a multi-modal diffusion-based policy actor. It consists of two key components: DiffDreamer, a diffusion-based world model that generates future observations simultaneously, mitigating error accumulation, and DPA (Diffusion…
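The abstract's point about single-Gaussian policies can be made concrete with a small sketch. The scenario is an assumption for illustration and does not come from the paper: hypothetical steering commands cluster around two equally valid choices (swerve left or swerve right around an obstacle), and a single Gaussian is fitted to them using only the Python standard library.

```python
import random
import statistics

random.seed(0)

# Hypothetical expert steering commands: half swerve left (about -0.5),
# half swerve right (about +0.5). These values are assumptions for illustration.
actions = [random.gauss(random.choice([-0.5, 0.5]), 0.05) for _ in range(2000)]

# Fit a single Gaussian: its mean and standard deviation.
mu = statistics.fmean(actions)
sigma = statistics.pstdev(actions)

# Fraction of expert actions that lie close to the fitted mean.
near_mean = sum(abs(a - mu) < 0.1 for a in actions) / len(actions)

print(f"fitted mean: {mu:.3f}, fitted std: {sigma:.3f}")
print(f"share of expert actions within 0.1 of the mean: {near_mean:.3f}")
```

The fitted mean lands near zero (steer straight), a command that almost none of the sampled actions resemble, and the fitted spread is wide enough to cover both modes along with the empty region between them. A policy class that can represent several modes avoids this averaging. The sketch does not model how DPA does so.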

Taxonomy

Topics: Autonomous Vehicle Technology and Safety · Computer Graphics and Visualization Techniques · Simulation Techniques and Applications

Methods: Entropy Regularization · Proximal Policy Optimization · CARLA: An Open Urban Driving Simulator · Diffusion