Diverse Skill Discovery for Quadruped Robots via Unsupervised Learning
Ruopeng Cui, Yifei Bi, Haojie Luo, Wei Li

TL;DR
This paper introduces an unsupervised skill discovery method for quadruped robots that improves the diversity and learning efficiency of discovered behaviors by preventing skill overlap and reward hacking through a novel architecture and a multi-discriminator design.
Contribution
The paper proposes an Orthogonal Mixture-of-Experts (OMoE) architecture and a multi-discriminator framework that improve skill diversity and training efficiency in unsupervised skill discovery for quadruped robots.
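To make the idea of an orthogonal mixture of experts concrete, here is a minimal sketch in plain Python. It assumes that each expert emits a feature vector, that the vectors are orthogonalized with Gram-Schmidt, and that a softmax gate mixes them. The function names (`gram_schmidt`, `omoe_features`) and the choice of Gram-Schmidt are illustrative assumptions; this summary does not specify how the paper enforces orthogonality.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors, eps=1e-8):
    """Orthonormalize vectors in order; near-dependent ones become zero vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            proj = dot(w, b)  # component of w along an earlier expert
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w] if norm > eps else [0.0] * len(w))
    return basis

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def omoe_features(expert_outputs, gate_logits):
    """Mix orthogonalized expert features with softmax gate weights (assumed design)."""
    ortho = gram_schmidt(expert_outputs)
    weights = softmax(gate_logits)
    dim = len(expert_outputs[0])
    return [sum(w * e[i] for w, e in zip(weights, ortho)) for i in range(dim)]

# Hypothetical outputs of three experts for one state.
experts = [[1.0, 1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
mixed = omoe_features(experts, gate_logits=[0.2, 0.1, -0.3])
```

Because the orthogonalized vectors have zero pairwise inner product, no expert's contribution can be reproduced by another, which is one simple way to keep behaviors from overlapping in feature space.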
Findings
Achieved an 18.3% increase in state-space coverage.
Demonstrated diverse locomotion skills on a 12-DOF quadruped.
Enhanced training efficiency compared to baseline methods.
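One common way to quantify state-space coverage is to discretize the state space into a grid and report the fraction of cells that trajectories visit. The sketch below follows that approach as an assumption; this summary does not state how the paper computes coverage, and the function name `state_coverage` and its parameters are hypothetical.

```python
def state_coverage(states, low, high, bins):
    """Fraction of grid cells visited at least once (assumed coverage metric)."""
    visited = set()
    for s in states:
        cell = []
        for x, lo, hi in zip(s, low, high):
            frac = (x - lo) / (hi - lo)
            # Clamp so out-of-range values fall into the boundary cells.
            idx = min(bins - 1, max(0, int(frac * bins)))
            cell.append(idx)
        visited.add(tuple(cell))
    return len(visited) / (bins ** len(low))

# Toy 2-D example: three states on a 2x2 grid over the unit square.
toy_states = [(0.1, 0.1), (0.9, 0.9), (0.15, 0.12)]
coverage = state_coverage(toy_states, low=(0.0, 0.0), high=(1.0, 1.0), bins=2)
```

In this toy example two of the four cells are visited, so `coverage` is 0.5. Comparing this value between a method and a baseline gives a coverage gain, either absolute or relative.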
Abstract
Reinforcement learning necessitates meticulous reward shaping by specialists to elicit target behaviors, while imitation learning relies on costly task-specific data. In contrast, unsupervised skill discovery can potentially reduce these burdens by learning a diverse repertoire of useful skills driven by intrinsic motivation. However, existing methods exhibit two key limitations: they typically rely on a single policy to master a versatile repertoire of behaviors without modeling the shared structure or distinctions among them, which results in low learning efficiency; moreover, they are susceptible to reward hacking, where the reward signal increases and converges rapidly while the learned skills display insufficient actual diversity. In this work, we introduce an Orthogonal Mixture-of-Experts (OMoE) architecture that prevents diverse behaviors from collapsing into overlapping…
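The reward-hacking failure described above can be illustrated with a discriminator-based intrinsic reward of the form log q(z | s) - log p(z), which a policy can maximize through easily distinguishable but nearly identical behaviors. The sketch below shows one hedged way several discriminators could be combined: take the most pessimistic log-probability, so the reward rises only when every discriminator can identify the skill. The aggregation rule and the name `skill_reward` are assumptions for illustration, not the paper's stated design.

```python
import math

def skill_reward(disc_probs, num_skills):
    """Conservative intrinsic reward from several discriminators (assumed rule).

    disc_probs: each discriminator's probability q_k(z | s) for the active skill z.
    num_skills: number of skills under a uniform prior p(z).
    """
    log_prior = math.log(1.0 / num_skills)
    # The minimum over discriminators is the most pessimistic estimate.
    worst_log_q = min(math.log(max(p, 1e-8)) for p in disc_probs)
    return worst_log_q - log_prior

# Two hypothetical discriminators disagree; the lower confidence sets the reward.
r = skill_reward([0.9, 0.5], num_skills=4)
```

Here the reward equals log(0.5) - log(0.25) = log 2, determined by the less confident discriminator rather than the more confident one.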
Taxonomy
Topics: Robotic Locomotion and Control · Reinforcement Learning in Robotics · Social Robot Interaction and HRI
