Preference Aligned Diffusion Planner for Quadrupedal Locomotion Control

Xinyi Yuan; Zhiwei Shang; Zifan Wang; Chenkai Wang; Zhao Shan; Meixin Zhu; Chenjia Bai; Xuelong Li; Weiwei Wan; Kensuke Harada

arXiv:2410.13586·cs.RO·March 4, 2025

Open Access

TL;DR

This paper introduces a two-stage diffusion-based framework for quadrupedal locomotion control that enhances robustness and transferability with limited data, using a novel weak preference labeling method.

Contribution

It proposes a reward-agnostic, two-stage learning approach with a weak preference labeling technique to improve diffusion planner robustness and zero-shot transfer in quadrupedal robots.

Findings

01 Superior stability and velocity tracking in various gaits

02 Effective zero-shot transfer to real robots

03 Enhanced robustness with limited datasets

Abstract

Diffusion models demonstrate superior performance in capturing complex distributions from large-scale datasets, providing a promising solution for quadrupedal locomotion control. However, the robustness of the diffusion planner is inherently dependent on the diversity of the pre-collected datasets. To mitigate this issue, we propose a two-stage learning framework to enhance the capability of the diffusion planner when the dataset is limited (reward-agnostic). In the offline stage, the diffusion planner learns the joint distribution of state-action sequences from expert datasets without using reward labels. Subsequently, we perform online interaction in the simulation environment based on the trained offline planner, which significantly diversifies the original behavior and thus improves robustness. Specifically, we propose a novel weak preference labeling method without the…

Taxonomy

Topics: Control and Dynamics of Mobile Robots · Robotic Path Planning Algorithms · Human Motion and Animation