DiWA: Diffusion Policy Adaptation with World Models

Akshay L Chandra; Iman Nematollahi; Chenguang Huang; Tim Welschehold; Wolfram Burgard; Abhinav Valada

arXiv:2508.03645·cs.RO·August 6, 2025

TL;DR

DiWA is an offline reinforcement learning framework that uses a world model to fine-tune diffusion policies for robotic skills, substantially reducing the number of real-world interactions needed compared to model-free methods.

Contribution

DiWA is the first approach to fine-tune diffusion policies for robots entirely offline with a world model, improving sample efficiency and safety.

Findings

01

Achieves effective task performance on the CALVIN benchmark with offline adaptation.

02

Requires orders of magnitude fewer physical interactions than model-free baselines.

03

Demonstrates practical and safe robot skill fine-tuning using offline data.

Abstract

Fine-tuning diffusion policies with reinforcement learning (RL) presents significant challenges. The long denoising sequence for each action prediction impedes effective reward propagation. Moreover, standard RL methods require millions of real-world interactions, posing a major bottleneck for practical fine-tuning. Although prior work frames the denoising process in diffusion policies as a Markov Decision Process to enable RL-based updates, its strong dependence on environment interaction remains highly inefficient. To bridge this gap, we introduce DiWA, a novel framework that leverages a world model for fine-tuning diffusion-based robotic skills entirely offline with reinforcement learning. Unlike model-free approaches that require millions of environment interactions to fine-tune a repertoire of robot skills, DiWA achieves effective adaptation using a world model trained once on a…
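
The abstract's two ideas, treating each denoising step as a decision step and collecting experience from a world model instead of the real environment, can be illustrated with a toy sketch. This is not the authors' implementation: `toy_denoise_step`, `toy_world_model`, `imagined_rollout`, and the scalar state and action are all hypothetical stand-ins chosen only to show why reward arrives late in a long denoising chain.

```python
import random
from dataclasses import dataclass


@dataclass
class Transition:
    env_step: int      # t: index of the imagined environment step
    denoise_step: int  # k: position in the denoising chain (K-1 down to 0)
    reward: float      # nonzero only when the fully denoised action is applied


def toy_denoise_step(state: float, noisy_action: float, k: int) -> float:
    """Hypothetical stand-in for one reverse-diffusion step of a policy."""
    target = -0.5 * state  # assumed state-dependent action target
    # Move halfway toward the target; injected noise shrinks as k approaches 0.
    return noisy_action + 0.5 * (target - noisy_action) + random.gauss(0.0, 0.01 * k)


def toy_world_model(state: float, action: float) -> tuple[float, float]:
    """Hypothetical stand-in for learned dynamics plus a reward estimate."""
    next_state = 0.9 * state + action
    return next_state, -abs(next_state)


def imagined_rollout(state: float, horizon: int, num_denoise_steps: int,
                     seed: int = 0) -> list[Transition]:
    """Flatten (environment step, denoising step) pairs into one trajectory."""
    random.seed(seed)
    transitions = []
    for t in range(horizon):
        action = random.gauss(0.0, 1.0)  # each chain starts from pure noise
        for k in range(num_denoise_steps - 1, -1, -1):
            action = toy_denoise_step(state, action, k)
            reward = 0.0
            if k == 0:
                # Only the final denoised action is "executed", and only
                # inside the world model: no real environment is queried.
                state, reward = toy_world_model(state, action)
            transitions.append(Transition(t, k, reward))
    return transitions


if __name__ == "__main__":
    for tr in imagined_rollout(state=1.0, horizon=2, num_denoise_steps=4):
        print(tr)
```

With `num_denoise_steps` set to K, every environment step contributes K transitions but only one carries a reward, which is the reward-propagation difficulty the abstract describes. Because `toy_world_model` replaces the environment, the whole trajectory is produced without physical interaction.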
