Towards a Standardised Performance Evaluation Protocol for Cooperative MARL
Rihab Gorsane, Omayma Mahjoub, Ruan de Kock, Roland Dubb, Siddarth Singh, Arnu Pretorius

TL;DR
This paper analyzes evaluation methods in cooperative multi-agent reinforcement learning, identifies issues, and proposes a standardized protocol to improve research validity, reproducibility, and progress measurement.
Contribution
The paper provides a meta-analysis of 75 cooperative MARL papers accepted for publication from 2016 to 2022 and introduces a standardized evaluation protocol for cooperative MARL research.
Findings
Current evaluation practices are inconsistent and unreliable.
A standardized protocol can enhance comparability and credibility.
Publicly released meta-analysis data supports future research.
Abstract
Multi-agent reinforcement learning (MARL) has emerged as a useful approach to solving decentralised decision-making problems at scale. Research in the field has been growing steadily, with many breakthrough algorithms proposed in recent years. In this work, we take a closer look at this rapid development with a focus on evaluation methodologies employed across a large body of research in cooperative MARL. By conducting a detailed meta-analysis of prior work, spanning 75 papers accepted for publication from 2016 to 2022, we bring to light worrying trends that put into question the true rate of progress. We further consider these trends in a wider context and take inspiration from single-agent RL literature on similar issues with recommendations that remain applicable to MARL. Combining these recommendations with novel insights from our analysis, we propose a standardised performance…
Taxonomy
Topics: Mobile Crowdsensing and Crowdsourcing · Traffic control and management · Reinforcement Learning in Robotics
