A Review of Online Diffusion Policy RL Algorithms for Scalable Robotic Control

Wonhyeok Choi; Shutong Ding; Minwoo Choi; Jungwan Woo; Kyumin Hwang; Jaeyeul Kim; Ye Shi; Sunghoon Im

arXiv:2601.06133·cs.LG·February 10, 2026

Open Access

TL;DR

This paper reviews and empirically analyzes online diffusion policy reinforcement learning algorithms for robotic control, categorizing approaches, evaluating their performance across diverse tasks, and identifying key trade-offs and bottlenecks.

Contribution

It introduces a novel taxonomy of online DPRL algorithms, provides a comprehensive benchmark on 12 robotic tasks, and offers practical guidelines and future directions for scalable robotic learning.

Findings

01. Trade-offs between sample efficiency and scalability in different algorithm families

02. Identification of computational bottlenecks limiting real-world deployment

03. Insights into generalization and robustness across diverse robotic tasks

Abstract

Diffusion policies have emerged as a powerful approach for robotic control, demonstrating superior expressiveness in modeling multimodal action distributions compared to conventional policy networks. However, their integration with online reinforcement learning remains challenging due to fundamental incompatibilities between diffusion model training objectives and standard RL policy improvement mechanisms. This paper presents the first comprehensive review and empirical analysis of current Online Diffusion Policy Reinforcement Learning (Online DPRL) algorithms for scalable robotic control systems. We propose a novel taxonomy that categorizes existing approaches into four distinct families--Action-Gradient, Q-Weighting, Proximity-Based, and Backpropagation Through Time (BPTT) methods--based on their policy improvement mechanisms. Through extensive experiments on a unified NVIDIA Isaac…
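
The abstract names Q-Weighting as one of the four families but this page does not give its update rule. As a rough illustration only, the sketch below shows one generic way a critic could reweight a diffusion policy's noise-prediction loss so that higher-value actions contribute more. The function names, the softmax weighting, and the `temperature` parameter are assumptions made for this example; they are not taken from the paper or from any specific algorithm it reviews.

```python
import math


def q_weights(q_values, temperature=1.0):
    """Softmax over Q-values (hypothetical weighting scheme, numerically stabilized)."""
    max_q = max(q_values)
    exps = [math.exp((q - max_q) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def q_weighted_denoising_loss(eps_pred, eps_true, q_values, temperature=1.0):
    """Batch loss: per-sample mean squared noise error, weighted by softmax(Q).

    eps_pred, eps_true: lists of equal-length vectors (predicted and true noise).
    q_values: one critic estimate per sample (assumed to be given).
    """
    weights = q_weights(q_values, temperature)
    loss = 0.0
    for w, pred, true in zip(weights, eps_pred, eps_true):
        per_sample = sum((p - t) ** 2 for p, t in zip(pred, true)) / len(pred)
        loss += w * per_sample
    return loss


if __name__ == "__main__":
    # Toy batch: the second sample has the higher Q-value, so its error dominates.
    pred = [[1.0, 0.0], [0.0, 2.0]]
    true = [[0.0, 0.0], [0.0, 0.0]]
    print(q_weights([0.0, 1.0]))
    print(q_weighted_denoising_loss(pred, true, [0.0, 1.0]))
```

With equal Q-values the weights are uniform and the loss reduces to the ordinary batch-mean denoising loss, which is one way to see the weighting as a policy-improvement signal layered on top of the standard diffusion objective.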

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Robot Manipulation and Learning