Learning to Teach in Cooperative Multiagent Reinforcement Learning

Shayegan Omidshafiei; Dong-Ki Kim; Miao Liu; Gerald Tesauro; Matthew Riemer; Christopher Amato; Murray Campbell; Jonathan P. How

arXiv:1805.07830 · cs.MA · September 5, 2018

TL;DR

This paper introduces LeCTR, a novel algorithm enabling cooperative multiagent reinforcement learning agents to learn when and what to teach each other, significantly improving learning speed and coordination without prior domain knowledge.

Contribution

The paper presents the first general framework and algorithm for agents to learn to teach and coordinate in multiagent reinforcement learning environments.

Findings

01. LeCTR agents learn significantly faster than state-of-the-art methods.

02. Agents effectively learn when to teach and when to be students.

03. LeCTR improves coordination in complex multiagent tasks.

Abstract

Collective human knowledge has clearly benefited from the fact that innovations by individuals are taught to others through communication. Similar to human social groups, agents in distributed learning systems would likely benefit from communication to share knowledge and teach skills. The problem of teaching to improve agent learning has been investigated by prior works, but these approaches make assumptions that prevent application of teaching to general multiagent problems, or require domain expertise for problems they can apply to. This learning to teach problem has inherent complexities related to measuring long-term impacts of teaching that compound the standard multiagent coordination challenges. In contrast to existing works, this paper presents the first general framework and algorithm for intelligent agents to learn to teach in a multiagent environment. Our algorithm, Learning…
