TL;DR
This paper tailors diffusion policy inference to robotic manipulation and introduces genetic denoising, a population-based sampling strategy that solves challenging tasks with as few as 2 neural function evaluations (NFE) and outperforms standard diffusion policies across multiple tasks.
Contribution
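The work adapts diffusion model inference to embodied AI by tailoring the denoising process to structured, low-dimensional action distributions and by adding genetic denoising, a population-based sampling strategy that improves efficiency and stability in robotic control tasks.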
Findings
Achieves up to 20% performance gains over standard diffusion policies.
Operates effectively with as few as 2 neural function evaluations.
Consistently outperforms existing methods across 14 manipulation tasks.
Abstract
Diffusion models, such as diffusion policy, have achieved state-of-the-art results in robotic manipulation by imitating expert demonstrations. While diffusion models were originally developed for vision tasks like image and video generation, many of their inference strategies have been directly transferred to control domains without adaptation. In this work, we show that by tailoring the denoising process to the specific characteristics of embodied AI tasks -- particularly the structured, low-dimensional nature of action distributions -- diffusion policies can operate effectively with as few as 5 neural function evaluations (NFE). Building on this insight, we propose a population-based sampling strategy, genetic denoising, which enhances both performance and stability by selecting denoising trajectories with low out-of-distribution risk. Our method solves challenging tasks with only 2 NFE…
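To make the idea of population-based denoising with selection more concrete, here is a minimal sketch in Python. It is not the paper's algorithm: the denoiser (`toy_denoiser`), the out-of-distribution proxy (`ood_score`, here simply the norm of the predicted noise), the Euler-style update, and every parameter name are hypothetical choices made for illustration. It only shows the general pattern: denoise a population of action candidates in one batched call per step, score them, and keep and clone the lowest-risk ones.

```python
import numpy as np


def toy_denoiser(x, t):
    """Hypothetical stand-in for a trained noise-prediction network.

    Assumes the clean action is the origin, so the predicted noise is x itself.
    One batched call counts as one neural function evaluation (NFE).
    """
    return x


def ood_score(x, eps_pred):
    """Hypothetical out-of-distribution proxy: norm of the predicted noise."""
    return np.linalg.norm(eps_pred, axis=-1)


def genetic_denoise(denoise_fn, score_fn, action_dim, pop_size=8,
                    n_steps=2, keep_frac=0.5, mutation_scale=0.1, seed=0):
    """Denoise a population of candidates, selecting low-score ones each step.

    Returns one action and the number of NFE used.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((pop_size, action_dim))  # initial noisy population
    n_keep = max(1, int(pop_size * keep_frac))
    nfe = 0
    for i in range(n_steps):
        t = 1.0 - i / n_steps
        eps = denoise_fn(x, t)  # one batched network call
        nfe += 1
        scores = score_fn(x, eps)
        # Selection: keep the candidates with the lowest risk score.
        survivors = np.argsort(scores)[:n_keep]
        # Reproduction: slot 0 inherits from the best candidate (elitism),
        # the remaining slots clone random survivors.
        parents = np.concatenate(
            [survivors[:1], rng.choice(survivors, size=pop_size - 1)]
        )
        # Euler-style update; the final step removes all predicted noise.
        step = 1.0 / (n_steps - i)
        x = x[parents] - step * eps[parents]
        # Mutation: small jitter on every step except the last.
        if i < n_steps - 1:
            t_next = 1.0 - (i + 1) / n_steps
            x = x + mutation_scale * t_next * rng.standard_normal(x.shape)
    return x[0], nfe
```

Calling `genetic_denoise(toy_denoiser, ood_score, action_dim=4)` runs two denoising steps and therefore uses 2 NFE regardless of population size, because the whole population goes through the network in a single batch. A real system would replace `toy_denoiser` with the trained policy network and `ood_score` with whatever risk measure the method actually defines.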