Human-Timescale Adaptation in an Open-Ended Task Space
Adaptive Agent Team, Jakob Bauer, Kate Baumli, Satinder Baveja, Feryal Behbahani, Avishkar Bhoopchand, Nathalie Bradley-Schmieg, Michael Chang, Natalie Clay, Adrian Collister, Vibhavari Dasagi, Lucy Gonzalez, Karol Gregor, Edward Hughes, Sheleem Kashem, Maria Loks-Thompson

TL;DR
This paper shows that large-scale reinforcement learning, combined with meta-learning and attention-based memory, yields agents that adapt in context to novel, open-ended 3D environments about as quickly as humans do.
Contribution
It introduces AdA, an RL agent trained with meta-reinforcement learning, an attention-based memory architecture, and an automated curriculum, which adapts rapidly to open-ended 3D tasks.
Findings
Scaling laws relate network size, memory length, and task diversity to performance.
The adaptive agent demonstrates hypothesis-driven exploration and efficient knowledge use.
Prompting the agent with first-person demonstrations improves adaptation.
Abstract
Foundation models have shown impressive adaptation and scalability in supervised and self-supervised learning problems, but so far these successes have not fully translated to reinforcement learning (RL). In this work, we demonstrate that training an RL agent at scale leads to a general in-context learning algorithm that can adapt to open-ended novel embodied 3D problems as quickly as humans. In a vast space of held-out environment dynamics, our adaptive agent (AdA) displays on-the-fly hypothesis-driven exploration, efficient exploitation of acquired knowledge, and can successfully be prompted with first-person demonstrations. Adaptation emerges from three ingredients: (1) meta-reinforcement learning across a vast, smooth and diverse task distribution, (2) a policy parameterised as a large-scale attention-based memory architecture, and (3) an effective automated curriculum that…
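The idea of in-context adaptation described in the abstract can be illustrated with a toy sketch. This is not AdA's architecture or training procedure: the policy below is hand-written rather than learned, the task is a one-step guessing game rather than a 3D environment, and the names `fixed_policy` and `adapt_in_context` are hypothetical. It only shows the general pattern in which a policy that is never updated improves across trials because its memory of earlier trials persists.

```python
import random


def fixed_policy(memory, num_actions, rng):
    """Hand-written stand-in for a learned policy (an assumption for illustration).

    It reads the memory of past (action, reward) pairs but is never updated.
    """
    for action, reward in memory:
        if reward > 0:
            return action  # exploit what an earlier trial revealed
    tried = {action for action, _ in memory}
    untried = [a for a in range(num_actions) if a not in tried]
    return rng.choice(untried)  # explore by ruling out options that failed


def adapt_in_context(hidden_goal, num_actions=5, num_trials=8, seed=0):
    """Run several trials of one task; memory persists across trials.

    hidden_goal must be in range(num_actions), otherwise the policy runs
    out of untried actions.
    """
    rng = random.Random(seed)
    memory = []   # carried across trials of the same task
    rewards = []
    for _ in range(num_trials):
        action = fixed_policy(memory, num_actions, rng)
        reward = 1 if action == hidden_goal else 0
        memory.append((action, reward))
        rewards.append(reward)
    return rewards


if __name__ == "__main__":
    print(adapt_in_context(hidden_goal=3))
```

With five actions, the rewarded action is found within the first five trials and repeated afterwards, so later trials score at least as well as earlier ones without any change to the policy itself.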
Taxonomy
Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques
