Learning to Walk Autonomously via Reset-Free Quality-Diversity
Bryan Lim, Alexander Reichenbach, Antoine Cully

TL;DR
This paper introduces Reset-Free Quality-Diversity (RF-QD), a method that lets robots learn behaviours such as walking without episodic manual resets, reducing the human supervision that existing QD algorithms require and improving sample efficiency in real-world environments.
Contribution
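The paper presents RF-QD, which builds on Dynamics-Aware Quality-Diversity (DA-QD) and adds a behaviour selection policy. The policy uses the diversity of the imagined repertoire together with environmental information to choose behaviours that act as automatic resets, as a step towards extending quality-diversity algorithms to real-world robotics.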
Findings
RF-QD learns locomotion controllers without manual resets.
RF-QD achieves high sample efficiency in real-world experiments.
Diverse solutions improve behavior selection and robustness.
Abstract
Quality-Diversity (QD) algorithms can discover large and complex behavioural repertoires consisting of both diverse and high-performing skills. However, the generation of behavioural repertoires has mainly been limited to simulation environments instead of real-world learning. This is because existing QD algorithms need large numbers of evaluations as well as episodic resets, which require manual human supervision and interventions. This paper proposes Reset-Free Quality-Diversity optimization (RF-QD) as a step towards autonomous learning for robotics in open-ended environments. We build on Dynamics-Aware Quality-Diversity (DA-QD) and introduce a behaviour selection policy that leverages the diversity of the imagined repertoire and environmental information to intelligently select behaviours that can act as automatic resets. We demonstrate this through a task of learning to walk…
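To make the idea of a behaviour selection policy more concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes a hypothetical setup in which each repertoire entry carries a predicted planar displacement and a predicted fitness, and the robot operates in a square arena centred on the origin. The function name `select_behaviour` and the parameters `arena_half_width` and `margin` are illustrative assumptions, not names from the paper.

```python
import numpy as np


def select_behaviour(position, displacements, fitnesses,
                     arena_half_width, margin=0.2):
    """Return the index of a repertoire entry to execute next.

    Hypothetical inputs (not taken from the paper):
      position         -- (2,) current x, y of the robot
      displacements    -- (N, 2) predicted x, y displacement per behaviour
      fitnesses        -- (N,) predicted quality per behaviour
      arena_half_width -- half the side length of a square arena at the origin
      margin           -- safety distance to keep from the arena walls
    """
    position = np.asarray(position, dtype=float)
    displacements = np.asarray(displacements, dtype=float)
    fitnesses = np.asarray(fitnesses, dtype=float)

    # Where each behaviour is predicted to leave the robot.
    predicted_end = position + displacements
    limit = arena_half_width - margin
    safe = np.all(np.abs(predicted_end) <= limit, axis=1)

    if safe.any():
        # Among behaviours that stay inside the safe zone,
        # prefer the one with the highest predicted fitness.
        candidates = np.flatnonzero(safe)
        return int(candidates[np.argmax(fitnesses[candidates])])

    # No behaviour is predicted to stay safe: fall back to the one that ends
    # closest to the arena centre, which plays the role of a recovery move.
    return int(np.argmin(np.linalg.norm(predicted_end, axis=1)))
```

In this sketch the diversity of the repertoire matters because a wider spread of predicted displacements makes it more likely that at least one behaviour moves the robot back towards a safe region, so no human has to carry it there.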
