Preference Aligned Diffusion Planner for Quadrupedal Locomotion Control
Xinyi Yuan, Zhiwei Shang, Zifan Wang, Chenkai Wang, Zhao Shan, Meixin Zhu, Chenjia Bai, Xuelong Li, Weiwei Wan, Kensuke Harada

TL;DR
This paper introduces a two-stage diffusion-based framework for quadrupedal locomotion control that enhances robustness and transferability with limited data, using a novel weak preference labeling method.
Contribution
It proposes a reward-agnostic, two-stage learning approach with a weak preference labeling technique to improve diffusion planner robustness and zero-shot transfer in quadrupedal robots.
Findings
Superior stability and velocity tracking in various gaits
Effective zero-shot transfer to real robots
Enhanced robustness with limited datasets
Abstract
Diffusion models demonstrate superior performance in capturing complex distributions from large-scale datasets, providing a promising solution for quadrupedal locomotion control. However, the robustness of the diffusion planner is inherently dependent on the diversity of the pre-collected datasets. To mitigate this issue, we propose a two-stage learning framework to enhance the capability of the diffusion planner with limited, reward-agnostic datasets. In the offline stage, the diffusion planner learns the joint distribution of state-action sequences from expert datasets without using reward labels. Subsequently, we perform online interaction in the simulation environment based on the trained offline planner, which significantly diversifies the original behavior and thus improves robustness. Specifically, we propose a novel weak preference labeling method without the…
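The offline stage described in the abstract, learning the joint distribution of state-action sequences without reward labels, can be illustrated with a minimal sketch. It assumes the standard DDPM noise-prediction objective, which the abstract does not confirm as the authors' exact loss. All names (`make_alpha_bar`, `noise_trajectory`, `denoising_loss`), the dimensions, and the noise schedule are hypothetical, and the "model" is a placeholder that always predicts zero noise.

```python
import numpy as np

def make_alpha_bar(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Cumulative product of (1 - beta) for a linear noise schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def noise_trajectory(traj, k, alpha_bar, rng):
    """Forward diffusion: corrupt a clean trajectory to noise level k."""
    eps = rng.standard_normal(traj.shape)
    noisy = np.sqrt(alpha_bar[k]) * traj + np.sqrt(1.0 - alpha_bar[k]) * eps
    return noisy, eps

def denoising_loss(predict_noise, traj, k, alpha_bar, rng):
    """Mean squared error between predicted and true noise; no reward is used."""
    noisy, eps = noise_trajectory(traj, k, alpha_bar, rng)
    return float(np.mean((predict_noise(noisy, k) - eps) ** 2))

rng = np.random.default_rng(0)
horizon, state_dim, action_dim = 8, 4, 2  # hypothetical sizes

# An expert trajectory is a sequence of concatenated (state, action) rows,
# so the planner models states and actions jointly.
states = rng.standard_normal((horizon, state_dim))
actions = rng.standard_normal((horizon, action_dim))
traj = np.concatenate([states, actions], axis=-1)

alpha_bar = make_alpha_bar(100)

def zero_model(noisy, k):
    """Placeholder for a trained network: always predicts zero noise."""
    return np.zeros_like(noisy)

loss = denoising_loss(zero_model, traj, 50, alpha_bar, rng)
print(traj.shape, loss)
```

In a real planner the placeholder would be a trained neural network, and the loss would be averaged over many trajectories and noise levels. The sketch covers only the reward-free offline objective, not the online stage or the preference labeling.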
Taxonomy
Topics: Control and Dynamics of Mobile Robots · Robotic Path Planning Algorithms · Human Motion and Animation
