Procedural generation of meta-reinforcement learning tasks
Thomas Miconi

TL;DR
This paper introduces a flexible parametrized space for generating diverse meta-reinforcement learning tasks, enabling the creation of numerous novel environments that include many well-known benchmarks and complex topological tasks.
Contribution
The paper proposes an expressive parametrization of meta-RL tasks that supports random generation of diverse environments, expanding the scope of meta-learning research.
Findings
Generated a variety of meta-RL tasks including bandits, mazes, and topological spaces.
Demonstrated the ability to produce many well-known meta-RL benchmarks.
Discussed challenges and potential issues in random task generation.
Abstract
Open-endedness stands to benefit from the ability to generate an infinite variety of diverse, challenging environments. One particularly interesting type of challenge is meta-learning ("learning-to-learn"), a hallmark of intelligent behavior. However, the number of meta-learning environments in the literature is limited. Here we describe a parametrized space for simple meta-reinforcement learning (meta-RL) tasks with arbitrary stimuli. The parametrization allows us to randomly generate an arbitrary number of novel simple meta-learning tasks. The parametrization is expressive enough to include many well-known meta-RL tasks, such as bandit problems, the Harlow task, T-mazes, the Daw two-step task and others. Simple extensions allow it to capture tasks based on two-dimensional topological spaces, such as full mazes or find-the-spot domains. We describe a number of randomly generated…
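As a rough illustration of what randomly generating a simple meta-RL task can mean, the sketch below samples bandit problems, one of the task families the abstract names. It is not the paper's parametrization, which this summary does not specify. The names `BanditTask`, `sample_bandit_task`, and `run_random_agent` are hypothetical, and drawing each arm's reward probability uniformly is an assumption made only for this example.

```python
import random
from dataclasses import dataclass


@dataclass
class BanditTask:
    """One sampled task: a fixed reward probability per arm (hypothetical structure)."""
    arm_probs: list[float]

    def pull(self, arm: int, rng: random.Random) -> int:
        # Bernoulli reward: 1 with probability arm_probs[arm], otherwise 0.
        return 1 if rng.random() < self.arm_probs[arm] else 0


def sample_bandit_task(n_arms: int, rng: random.Random) -> BanditTask:
    # Assumption: arm probabilities are drawn uniformly from [0, 1).
    return BanditTask([rng.random() for _ in range(n_arms)])


def run_random_agent(task: BanditTask, n_trials: int, rng: random.Random) -> int:
    # Placeholder agent that picks arms uniformly at random.
    # A meta-learner would instead use the reward history within the
    # episode to discover which arm is best for this particular task.
    total = 0
    for _ in range(n_trials):
        arm = rng.randrange(len(task.arm_probs))
        total += task.pull(arm, rng)
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    # Each episode draws a fresh task, so the best arm changes between episodes.
    for episode in range(3):
        task = sample_bandit_task(n_arms=2, rng=rng)
        reward = run_random_agent(task, n_trials=20, rng=rng)
        print(f"episode {episode}: total reward {reward}")
```

The point of the sketch is the two-level structure: a task is sampled at the start of each episode, and the agent must adapt within the episode because the rewarding arm is not fixed across episodes.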
Taxonomy
Topics: Machine Learning and Data Classification
