Open-Ended Learning Leads to Generally Capable Agents

Open Ended Learning Team; Adam Stooke; Anuj Mahajan; Catarina Barros; Charlie Deck; Jakob Bauer; Jakub Sygnowski; Maja Trebacz; Max Jaderberg; Michael Mathieu; Nat McAleese; Nathalie Bradley-Schmieg; Nathaniel Wong; Nicolas Porcel; Roberta Raileanu; Steph Hughes-Fitt; Valentin Dalibard; Wojciech Marian Czarnecki

arXiv:2107.12808 · cs.LG · August 3, 2021 · 54 cites

Open Access

TL;DR

This paper presents a method for creating generally capable agents through open-ended learning in a diverse, multi-agent environment, leading to broad generalization and emergent behaviors across many tasks.

Contribution

It introduces an open-ended learning framework that dynamically adapts training tasks and objectives, enabling agents to continually learn and generalize across a vast space of challenges.

Findings

01. Agents generalize zero-shot to multiple complex tasks.

02. Emergent behaviors include tool use and cooperation.

03. Agents can be fine-tuned for larger-scale transfer.

Abstract

In this work we create agents that can perform well beyond a single, individual task, that exhibit much wider generalisation of behaviour to a massive, rich space of challenges. We define a universe of tasks within an environment domain and demonstrate the ability to train agents that are generally capable across this vast space and beyond. The environment is natively multi-agent, spanning the continuum of competitive, cooperative, and independent games, which are situated within procedurally generated physical 3D worlds. The resulting space is exceptionally diverse in terms of the challenges posed to agents, and as such, even measuring the learning progress of an agent is an open research problem. We propose an iterative notion of improvement between successive generations of agents, rather than seeking to maximise a singular objective, allowing us to quantify progress despite tasks…
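The abstract's idea of judging progress between successive generations, rather than by one scalar objective, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names, the chosen percentiles, and the rule that a new generation counts as an improvement only if it is no worse at every chosen percentile of its per-task scores and strictly better at one or more of them. It is not presented as the paper's actual procedure.

```python
import math
from typing import Sequence, Tuple


def percentile(scores: Sequence[float], q: float) -> float:
    """Nearest-rank percentile of `scores` for q in (0, 100]."""
    ordered = sorted(scores)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


def profile(scores: Sequence[float],
            qs: Sequence[float] = (10, 25, 50)) -> Tuple[float, ...]:
    """Summarise per-task scores by a few percentiles (hypothetical choice)."""
    return tuple(percentile(scores, q) for q in qs)


def is_improvement(new_scores: Sequence[float],
                   old_scores: Sequence[float],
                   qs: Sequence[float] = (10, 25, 50)) -> bool:
    """True if the new generation is no worse at every percentile
    and strictly better at one or more of them (illustrative rule)."""
    new_p = profile(new_scores, qs)
    old_p = profile(old_scores, qs)
    no_worse = all(n >= o for n, o in zip(new_p, old_p))
    strictly_better = any(n > o for n, o in zip(new_p, old_p))
    return no_worse and strictly_better


# Hypothetical per-task scores for two generations of agents.
old_generation = [0.0, 0.1, 0.2, 0.5, 0.9]
new_generation = [0.1, 0.2, 0.3, 0.5, 0.9]
specialist = [0.0, 0.0, 0.0, 1.0, 1.0]  # excels on a few tasks only

print(is_improvement(new_generation, old_generation))
print(is_improvement(specialist, old_generation))
```

Under this rule a specialist that excels on a few tasks but fails many others does not count as progress, because its lower percentiles fall below those of the previous generation.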

Taxonomy

Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Data Stream Mining Techniques