TL;DR
This paper introduces a simple, scalable method for few-shot adaptation in Quality-Diversity optimization that leverages prior experience to significantly reduce the number of generations needed, without requiring backpropagation.
Contribution
The paper presents an approach for initializing QD methods with a prior population built from examples drawn from a task distribution. This enables few-shot adaptation without backpropagation and is applicable across models.
Findings
Reduces generations needed for QD in robotic tasks
Effective in both sparse and dense reward environments
No backpropagation required
Abstract
In the past few years, a considerable amount of research has been dedicated to the exploitation of previous learning experiences and the design of Few-shot and Meta Learning approaches, in problem domains ranging from Computer Vision to Reinforcement Learning based control. A notable exception, where to the best of our knowledge little to no effort has been made in this direction, is Quality-Diversity (QD) optimization. QD methods have been shown to be effective tools for dealing with deceptive minima and sparse rewards in Reinforcement Learning. However, they remain costly due to their reliance on inherently sample-inefficient evolutionary processes. We show that, given examples from a task distribution, information about the paths taken by optimization in parameter space can be leveraged to build a prior population, which, when used to initialize QD methods in unseen environments,…
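To make the idea of seeding a QD method with a prior population concrete, here is a minimal sketch. It is not the paper's algorithm. It uses a MAP-Elites-style grid archive as a stand-in for a generic QD method, a toy 2-D task in which the behavior descriptor is the parameter vector itself, and a prior built from the best elite of each training task rather than from full optimization paths. All names (evaluate, cell_of, run_qd, build_prior, best_fitness) and all hyperparameters are hypothetical.

```python
import random


def evaluate(params, goal):
    """Toy task (assumption): the behavior descriptor is the parameter vector,
    and fitness is the negative squared distance to the goal."""
    behavior = tuple(params)
    fitness = -sum((p - g) ** 2 for p, g in zip(params, goal))
    return behavior, fitness


def cell_of(behavior, bins=10, low=-1.0, high=1.0):
    """Map a behavior descriptor to a discrete archive cell."""
    idx = []
    for b in behavior:
        clipped = min(max(b, low), high)
        i = int((clipped - low) / (high - low) * bins)
        idx.append(min(i, bins - 1))
    return tuple(idx)


def run_qd(goal, init_population, generations, rng, sigma=0.1):
    """Grid-archive QD loop seeded with a non-empty init_population.
    No gradients are used: variation is Gaussian mutation only."""
    archive = {}  # cell -> (fitness, params)

    def try_insert(params):
        behavior, fitness = evaluate(params, goal)
        cell = cell_of(behavior)
        # Keep the newcomer only if its cell is empty or it is strictly better.
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, params)

    for params in init_population:
        try_insert(params)
    for _ in range(generations):
        _, parent = rng.choice(list(archive.values()))
        child = [p + rng.gauss(0.0, sigma) for p in parent]
        try_insert(child)
    return archive


def best_fitness(archive):
    return max(f for f, _ in archive.values())


def build_prior(train_goals, rng, generations=200, pop_size=8):
    """Simplified prior (assumption): the best elite found on each training task."""
    prior = []
    for goal in train_goals:
        init = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(pop_size)]
        archive = run_qd(goal, init, generations, rng)
        best = max(archive.values(), key=lambda fp: fp[0])
        prior.append(best[1])
    return prior


if __name__ == "__main__":
    rng = random.Random(0)
    prior = build_prior([(0.5, 0.5), (0.4, 0.6), (0.6, 0.4)], rng)
    unseen_goal = (0.45, 0.55)
    random_init = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(len(prior))]
    # Same small generation budget for both initializations.
    for name, init in (("prior", prior), ("random", random_init)):
        archive = run_qd(unseen_goal, init, 20, random.Random(1))
        print(name, best_fitness(archive))
```

Running the script prints the best fitness reached on an unseen goal under the same small generation budget, once from the prior population and once from a random one. In this toy setup the training goals lie close to the unseen goal, so the comparison only illustrates the mechanism and says nothing about the paper's results.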
