D3P: Dynamic Denoising Diffusion Policy via Reinforcement Learning
Shu-Ang Yu, Feng Gao, Yi Wu, Chao Yu, Yu Wang

TL;DR
D3P introduces a reinforcement-learning-based adaptive denoising approach for diffusion policies, speeding up inference on robotic visuomotor tasks by allocating denoising steps dynamically according to action importance.
Contribution
The paper proposes D3P, a novel adaptive denoising diffusion policy that allocates denoising steps dynamically using reinforcement learning, improving inference speed without sacrificing success rate.
Findings
Achieves 2.2× inference speed-up in simulation.
Demonstrates 1.9× acceleration on real robot.
Maintains task success despite faster inference.
Abstract
Diffusion policies excel at learning complex action distributions for robotic visuomotor tasks, yet their iterative denoising process poses a major bottleneck for real-time deployment. Existing acceleration methods apply a fixed number of denoising steps per action, implicitly treating all actions as equally important. However, our experiments reveal that robotic tasks often contain a mix of crucial and routine actions, which differ in their impact on task success. Motivated by this finding, we propose Dynamic Denoising Diffusion Policy (D3P), a diffusion-based policy that adaptively allocates denoising steps across actions at test time. D3P uses a lightweight, state-aware adaptor to allocate the optimal number of denoising steps for each action. We jointly optimize the adaptor and base diffusion policy via reinforcement…
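To make the idea of per-action step allocation concrete, here is a minimal Python sketch of what test-time allocation could look like. It is not the authors' implementation: the candidate budgets in `STEP_CHOICES`, the `FIXED_STEPS` baseline, the linear softmax adaptor, and the placeholder `denoise` function are all assumptions made for illustration, and the toy uses the same dimension for state and action. The reinforcement learning training of the adaptor is not shown.

```python
import math
import random

# Hypothetical candidate step budgets and fixed-step baseline (not from the paper).
STEP_CHOICES = (1, 2, 4, 8)
FIXED_STEPS = 8


def adaptor_logits(state, weights):
    """Toy linear adaptor: one logit per candidate step budget."""
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def choose_steps(state, weights, rng):
    """Sample a step budget from a softmax over the adaptor's logits."""
    logits = adaptor_logits(state, weights)
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(STEP_CHOICES, weights=probs, k=1)[0]


def denoise(action, state, num_steps):
    """Placeholder denoiser: each step moves the action halfway to a target.

    A real diffusion policy would call a learned denoising network here.
    """
    target = [math.tanh(s) for s in state]  # stand-in for the clean action
    for _ in range(num_steps):
        action = [a + 0.5 * (t - a) for a, t in zip(action, target)]
    return action


def rollout(states, weights, seed=0):
    """Pick a step budget per state, then denoise a random initial action."""
    rng = random.Random(seed)
    actions, steps_used = [], []
    for state in states:
        k = choose_steps(state, weights, rng)
        noisy = [rng.gauss(0.0, 1.0) for _ in state]
        actions.append(denoise(noisy, state, k))
        steps_used.append(k)
    return actions, steps_used


def step_ratio(steps_used, fixed_steps=FIXED_STEPS):
    """Denoiser calls under a fixed budget divided by calls actually made."""
    return fixed_steps * len(steps_used) / sum(steps_used)


if __name__ == "__main__":
    # Hypothetical states and adaptor weights, chosen only for the demo.
    states = [[0.1, -0.3], [1.5, 0.7], [0.0, 0.2]]
    weights = [[0.0, 0.0], [0.1, 0.1], [0.5, 0.5], [1.0, 1.0]]
    actions, steps_used = rollout(states, weights)
    print("steps per action:", steps_used)
    print("call ratio vs fixed budget:", step_ratio(steps_used))
```

In this sketch, `step_ratio` only counts denoiser calls. It is a rough proxy for inference speed-up and ignores the cost of the adaptor itself, so it should not be read as a reproduction of the reported figures.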
