Emergent Tool Use From Multi-Agent Autocurricula

Bowen Baker; Ingmar Kanitscheider; Todor Markov; Yi Wu; Glenn Powell; Bob McGrew; Igor Mordatch

arXiv:1909.07528 · cs.LG · February 12, 2020 · 335 cites

TL;DR

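This paper shows that multi-agent competition on the simple objective of hide-and-seek, trained with standard reinforcement learning algorithms at scale, induces complex emergent behaviors, including tool use and coordination. It also gives evidence that this kind of competition may scale better with environment complexity and produce more human-relevant skills than other self-supervised reinforcement learning methods.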

Contribution

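The paper identifies six distinct emergent phases of agent strategy in multi-agent hide-and-seek. The phases are driven by a self-supervised autocurriculum in which each new strategy pressures the opposing team to adapt, and many of them require sophisticated tool use and coordination.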

Findings

01

Agents develop multi-object shelters using moveable boxes

02

Agents discover that they can overcome obstacles using ramps

03

Multi-agent competition may scale better with increasing environment complexity than other self-supervised reinforcement learning methods

Abstract

Through multi-agent competition, the simple objective of hide-and-seek, and standard reinforcement learning algorithms at scale, we find that agents create a self-supervised autocurriculum inducing multiple distinct rounds of emergent strategy, many of which require sophisticated tool use and coordination. We find clear evidence of six emergent phases in agent strategy in our environment, each of which creates a new pressure for the opposing team to adapt; for instance, agents learn to build multi-object shelters using moveable boxes, which in turn leads to agents discovering that they can overcome obstacles using ramps. We further provide evidence that multi-agent competition may scale better with increasing environment complexity and leads to behavior that centers around far more human-relevant skills than other self-supervised reinforcement learning methods such as intrinsic…

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Artificial Intelligence in Games