Adaptive Diffusion Policy Optimization for Robotic Manipulation
Huiyun Jiang, Zhuang Yang

TL;DR
This paper introduces ADPO, an adaptive gradient-based framework for efficiently fine-tuning diffusion policies in robotic control, and reports performance comparable to or better than existing methods.
Contribution
The paper presents ADPO, the first adaptive gradient descent approach tailored for diffusion policy optimization in robotic reinforcement learning tasks.
Findings
ADPO outperforms existing diffusion RL methods in effectiveness.
ADPO achieves comparable or better performance on standard robotic tasks.
Hyperparameter sensitivity analysis provides practical guidance for application.
Abstract
Recent studies have shown the great potential of diffusion models in improving reinforcement learning (RL) by modeling complex policies, expressing a high degree of multi-modality, and efficiently handling high-dimensional continuous control tasks. However, there is currently limited research on how to optimize diffusion-based policies (e.g., Diffusion Policy) quickly and stably. In this paper, we propose Adam-based Diffusion Policy Optimization (ADPO), a fast algorithmic framework containing best practices for fine-tuning diffusion-based policies in robotic control tasks using the adaptive gradient descent method in RL. Adaptive gradient methods are less studied in RL training, let alone for diffusion-based policies. We confirm that ADPO outperforms other diffusion-based RL methods in terms of overall effectiveness for fine-tuning on standard robotic tasks. Concretely, we conduct extensive…
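The abstract builds on the adaptive gradient method Adam but, as excerpted here, does not spell out the update. For orientation only, the following is a minimal sketch of the generic Adam update rule applied to a toy quadratic objective. It is not the ADPO algorithm or the authors' code: the function names `adam_step` and `minimize_quadratic`, the toy objective, and all hyperparameter values are assumptions made for illustration.

```python
import math


def adam_step(params, grads, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One generic Adam update over flat lists of floats (illustrative helper).

    m and v are running first- and second-moment estimates; t is the 1-based
    step count used for bias correction. Returns (new_params, new_m, new_v).
    """
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1.0 - beta1) * g        # moving average of gradients
        v_i = beta2 * v_i + (1.0 - beta2) * g * g    # moving average of squared gradients
        m_hat = m_i / (1.0 - beta1 ** t)             # bias-corrected first moment
        v_hat = v_i / (1.0 - beta2 ** t)             # bias-corrected second moment
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


def minimize_quadratic(steps=500, lr=0.05):
    """Toy stand-in for a training loop: minimize sum((x - target)^2) with Adam."""
    target = [1.0, -2.0]   # hypothetical optimum, not from the paper
    x = [0.0, 0.0]
    m = [0.0, 0.0]
    v = [0.0, 0.0]
    for t in range(1, steps + 1):
        grads = [2.0 * (xi - ti) for xi, ti in zip(x, target)]
        x, m, v = adam_step(x, grads, m, v, t, lr=lr)
    return x
```

The per-coordinate division by the square root of the second-moment estimate is what makes the step size adaptive: on the first step the update has magnitude close to `lr` regardless of the gradient's scale. In a fine-tuning setting the gradients would come from an RL objective over the diffusion policy's parameters rather than from this toy quadratic.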
Taxonomy
Methods: Diffusion
