Integrating Diffusion-based Multi-task Learning with Online Reinforcement Learning for Robust Quadruped Robot Control

Xinyao Qin; Xiaoteng Ma; Yang Qi; Qihan Liu; Chuanyi Xue; Ning Gui; Qinyu Dong; Jun Yang; Bin Liang

arXiv:2507.05674·cs.RO·September 15, 2025

TL;DR

This paper introduces DMLoco, a diffusion-based multi-task learning framework combined with online reinforcement learning for robust, language-conditioned quadruped robot control, enabling efficient, real-time adaptation and task transition.

Contribution

The paper integrates diffusion models with online RL for multi-task, language-guided quadruped control, targeting the instability caused by compounding errors and the difficulty of task transitions under limited data.

Findings

01 Achieved stable, language-guided locomotion in simulation and real-world tests.

02 Enabled onboard control at 50 Hz with optimized diffusion sampling.

03 Demonstrated robust task transitions and adaptability in diverse scenarios.
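To put the 50 Hz figure in perspective: a 50 Hz control loop leaves 1000 / 50 = 20 ms per action, and a diffusion policy has to fit all of its denoising passes inside that window. The sketch below only does this arithmetic. The function name, the number of denoising steps, and the overhead value are illustrative assumptions, not figures reported for DMLoco.

```python
def per_step_budget_ms(control_hz: float, denoise_steps: int,
                       overhead_ms: float = 0.0) -> float:
    """Upper bound on the time one denoising pass may take (in ms)
    so that a full action is produced within one control period.

    control_hz    -- control loop frequency (50 for the rate cited above)
    denoise_steps -- number of denoising passes per action (hypothetical)
    overhead_ms   -- time reserved for sensing, I/O, etc. (hypothetical)
    """
    if control_hz <= 0 or denoise_steps <= 0:
        raise ValueError("control_hz and denoise_steps must be positive")
    period_ms = 1000.0 / control_hz      # 20 ms at 50 Hz
    budget_ms = period_ms - overhead_ms  # time left for the policy itself
    if budget_ms <= 0:
        raise ValueError("overhead consumes the whole control period")
    return budget_ms / denoise_steps
```

For example, under the assumption of 10 denoising passes and no overhead, each pass would have to finish in 2 ms, which is why reducing the number of sampling steps matters for onboard deployment.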

Abstract

Recent research has highlighted the powerful capabilities of imitation learning in robotics. Leveraging generative models, particularly diffusion models, these approaches offer notable advantages such as strong multi-task generalization, effective language conditioning, and high sample efficiency. While they have been applied successfully to manipulation tasks, their use in legged locomotion remains relatively underexplored, mainly because compounding errors undermine stability and task transitions are difficult to learn from limited data. Online reinforcement learning (RL) has demonstrated promising results in legged robot control in recent years, providing valuable insights for addressing these challenges. In this work, we propose DMLoco, a diffusion-based framework for quadruped robots that integrates multi-task pretraining with online PPO finetuning to enable language-conditioned control…

Code & Models

Repositories

queenxy/dmloco (PyTorch, official)

Taxonomy

Topics: Iterative Learning Control Systems · Advanced Control Systems Optimization · Extremum Seeking Control Systems