Towards a Standardised Performance Evaluation Protocol for Cooperative MARL

Rihab Gorsane; Omayma Mahjoub; Ruan de Kock; Roland Dubb; Siddarth Singh; Arnu Pretorius

arXiv:2209.10485·cs.LG·September 22, 2022·5 cites

TL;DR

This paper analyzes evaluation methods in cooperative multi-agent reinforcement learning, identifies issues, and proposes a standardized protocol to improve research validity, reproducibility, and progress measurement.

Contribution

It provides a comprehensive meta-analysis of past work and introduces a novel, standardized evaluation protocol for cooperative MARL research.

Findings

01. Current evaluation practices are inconsistent and unreliable.

02. A standardized protocol can enhance comparability and credibility.

03. Publicly released meta-analysis data supports future research.

Abstract

Multi-agent reinforcement learning (MARL) has emerged as a useful approach to solving decentralised decision-making problems at scale. Research in the field has been growing steadily with many breakthrough algorithms proposed in recent years. In this work, we take a closer look at this rapid development with a focus on evaluation methodologies employed across a large body of research in cooperative MARL. By conducting a detailed meta-analysis of prior work, spanning 75 papers accepted for publication from 2016 to 2022, we bring to light worrying trends that put into question the true rate of progress. We further consider these trends in a wider context and take inspiration from single-agent RL literature on similar issues with recommendations that remain applicable to MARL. Combining these recommendations with novel insights from our analysis, we propose a standardised performance…

Code & Models

Repositories

instadeepai/marl-eval
jax

Videos

Towards a Standardised Performance Evaluation Protocol for Cooperative MARL · slideslive

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Traffic control and management · Reinforcement Learning in Robotics