Learning Progress Driven Multi-Agent Curriculum

Wenshuai Zhao; Zhiyuan Li; Joni Pajarinen

arXiv:2205.10016 · cs.AI · May 16, 2025 · 1 citation

TL;DR

This paper introduces a novel curriculum learning method for multi-agent reinforcement learning that uses TD-error based learning progress to better control task difficulty, leading to improved performance on challenging benchmarks.

Contribution

It proposes a new curriculum control approach based on learning progress, addressing variance and credit assignment issues in MARL.

Findings

01. Outperforms state-of-the-art baselines on three benchmarks.

02. Uses TD-error based learning progress for curriculum control.

03. Alleviates issues of high variance and credit assignment difficulty.

Abstract

The number of agents can be an effective curriculum variable for controlling the difficulty of multi-agent reinforcement learning (MARL) tasks. Existing work typically uses manually defined curricula such as linear schemes. We identify two potential flaws when applying existing reward-based automatic curriculum learning methods in MARL: (1) the expected episode return used to measure task difficulty has high variance; (2) credit assignment difficulty can be exacerbated in tasks where increasing the number of agents yields higher returns, which is common in many MARL tasks. To address these issues, we propose to control the curriculum by using a TD-error based *learning progress* measure and by letting the curriculum proceed from an initial context distribution to the final task-specific one. Since our approach maintains a distribution over the number of agents and measures learning…
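To make the idea in the abstract more concrete, the following is a minimal sketch of how a TD-error based progress signal could steer a distribution over the number of agents. It is not the paper's algorithm: the function names, the use of mean absolute TD error as the progress proxy, and the linear mixing weight `alpha` are all assumptions made for illustration.

```python
import numpy as np


def td_errors(rewards, values, next_values, dones, gamma=0.99):
    """One-step TD errors: r + gamma * V(s') * (1 - done) - V(s)."""
    rewards = np.asarray(rewards, dtype=float)
    values = np.asarray(values, dtype=float)
    next_values = np.asarray(next_values, dtype=float)
    dones = np.asarray(dones, dtype=float)
    return rewards + gamma * next_values * (1.0 - dones) - values


def learning_progress(td):
    """Assumed progress proxy: mean absolute TD error of a batch."""
    return float(np.mean(np.abs(td)))


def curriculum_distribution(progress, p_target, alpha):
    """Blend a progress-weighted distribution with the target one.

    progress: one non-negative score per candidate number of agents.
    p_target: final task-specific distribution over the same candidates.
    alpha:    hypothetical mixing weight in [0, 1]; 1 means target only.
    """
    progress = np.asarray(progress, dtype=float)
    p_target = np.asarray(p_target, dtype=float)
    total = progress.sum()
    if total > 0:
        p_progress = progress / total
    else:
        # No signal yet: fall back to a uniform distribution.
        p_progress = np.full(progress.shape, 1.0 / progress.size)
    alpha = float(np.clip(alpha, 0.0, 1.0))
    mix = (1.0 - alpha) * p_progress + alpha * p_target
    return mix / mix.sum()


# Toy usage with made-up numbers: three candidate team sizes,
# where the final task uses 8 agents.
agent_counts = np.array([2, 4, 8])
progress_scores = [0.2, 0.6, 0.2]   # illustrative values, not measured data
p_target = [0.0, 0.0, 1.0]
probs = curriculum_distribution(progress_scores, p_target, alpha=0.3)

rng = np.random.default_rng(0)
n_agents = rng.choice(agent_counts, p=probs)  # team size for the next episode
```

In this sketch, raising `alpha` over training moves the sampled team sizes from the progress-weighted distribution toward the target task. A TD-error signal is computed per transition rather than per episode, which is one plausible reason it could be less noisy than the episode return the abstract criticizes.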

Code & Models

Repositories

wenshuaizhao/spmarl (PyTorch, official)

Videos

Learning Progress Driven Multi-Agent Curriculum · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings