Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot

Joel Z. Leibo; Edgar Duéñez-Guzmán; Alexander Sasha Vezhnevets; John P. Agapiou; Peter Sunehag; Raphael Koster; Jayd Matyas; Charles Beattie; Igor Mordatch; Thore Graepel

arXiv:2107.06857·cs.MA·July 15, 2021·23 cites

TL;DR

Melting Pot is a scalable evaluation suite for multi-agent reinforcement learning that assesses generalization to new situations using RL-generated test scenarios, revealing weaknesses in algorithms beyond training performance.

Contribution

The paper introduces Melting Pot, a MARL evaluation suite that uses reinforcement learning to reduce the human labor needed to create diverse test scenarios for assessing generalization.

Findings

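01. Over 80 unique test scenarios were created, covering topics such as social dilemmas, reciprocity, resource sharing, and task partitioning.

02. The scenarios reveal weaknesses in standard MARL algorithms that are not apparent from training performance alone.

03. The results demonstrate the importance of evaluating beyond training metrics.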

Abstract

Existing evaluation suites for multi-agent reinforcement learning (MARL) do not assess generalization to novel situations as their primary objective (unlike supervised-learning benchmarks). Our contribution, Melting Pot, is a MARL evaluation suite that fills this gap, and uses reinforcement learning to reduce the human labor required to create novel test scenarios. This works because one agent's behavior constitutes (part of) another agent's environment. To demonstrate scalability, we have created over 80 unique test scenarios covering a broad range of research topics such as social dilemmas, reciprocity, resource sharing, and task partitioning. We apply these test scenarios to standard MARL training algorithms, and demonstrate how Melting Pot reveals weaknesses not apparent from training performance alone.
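The idea that one agent's behavior forms part of another agent's environment can be illustrated with a minimal sketch. This is not the Melting Pot API: the toy payoff table, the hand-written policies, and the names `evaluate` and `run_scenarios` are all assumptions made for illustration. In the paper the background agents are produced by reinforcement learning; here fixed rule-based policies stand in for them so the example stays self-contained.

```python
from typing import Callable, Dict, Optional

# Toy two-player prisoner's dilemma (illustrative values, not from the paper).
# Actions: 0 = cooperate, 1 = defect. Entries are (focal reward, background reward).
PAYOFFS = {
    (0, 0): (3, 3),
    (0, 1): (0, 5),
    (1, 0): (5, 0),
    (1, 1): (1, 1),
}

# A policy maps the co-player's previous action (None on the first step) to an action.
Policy = Callable[[Optional[int]], int]


def always_cooperate(prev: Optional[int]) -> int:
    return 0


def always_defect(prev: Optional[int]) -> int:
    return 1


def tit_for_tat(prev: Optional[int]) -> int:
    # Cooperate first, then copy the co-player's last action.
    return 0 if prev is None else prev


def evaluate(focal: Policy, background: Policy, steps: int = 10) -> float:
    """Mean per-step reward of the focal policy against a frozen background policy."""
    prev_focal: Optional[int] = None
    prev_background: Optional[int] = None
    total = 0
    for _ in range(steps):
        a_focal = focal(prev_background)
        a_background = background(prev_focal)
        reward_focal, _ = PAYOFFS[(a_focal, a_background)]
        total += reward_focal
        prev_focal, prev_background = a_focal, a_background
    return total / steps


def run_scenarios(focal: Policy, scenarios: Dict[str, Policy], steps: int = 10) -> Dict[str, float]:
    """Score one focal policy in every scenario (same game, different background)."""
    return {name: evaluate(focal, bg, steps) for name, bg in scenarios.items()}


if __name__ == "__main__":
    scenarios = {
        "cooperative_background": always_cooperate,
        "reciprocal_background": tit_for_tat,
        "defecting_background": always_defect,
    }
    print(run_scenarios(always_defect, scenarios))
```

In this sketch each "scenario" is the same game paired with a different frozen co-player, so swapping the background policy yields a new test condition without designing a new environment. A focal policy that scores well against one background can score poorly against another, which is the kind of gap that training performance alone would not show.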


Videos

Scalable Evaluation of Multi-Agent Reinforcement Learning with Melting Pot · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics