CURO: Curriculum Learning for Relative Overgeneralization
Lin Shi, Qiyuan Liu, Bei Peng

TL;DR
CURO introduces a curriculum learning approach that fine-tunes reward functions and employs transfer learning to effectively address relative overgeneralization in multi-agent reinforcement learning, improving coordination and task success.
Contribution
The paper proposes CURO, a novel curriculum learning method that combines reward fine-tuning with transfer learning to overcome relative overgeneralization in multi-agent reinforcement learning (MARL) and that can be applied to various MARL algorithms.
Findings
CURO successfully overcomes severe RO in multiple MARL algorithms.
CURO improves performance and coordination in challenging cooperative tasks.
CURO outperforms baseline methods in diverse multi-agent environments.
Abstract
Relative overgeneralization (RO) is a pathology that can arise in cooperative multi-agent tasks when the optimal joint action's utility falls below that of a sub-optimal joint action. RO can cause the agents to get stuck in local optima or fail to solve cooperative tasks that require significant coordination between agents within a given timestep. In this work, we empirically find that, in multi-agent reinforcement learning (MARL), both value-based and policy gradient MARL algorithms can suffer from RO and fail to learn effective coordination policies. To better overcome RO, we propose a novel approach called curriculum learning for relative overgeneralization (CURO). To solve a target task that exhibits strong RO, in CURO, we first fine-tune the reward function of the target task to generate source tasks to train the agent. Then, to effectively transfer the knowledge acquired in one…
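To make the definition of RO concrete, here is a minimal sketch on a one-step matrix game. The payoff matrix (a penalty-game-style example), the uniform-partner assumption, and the helper names `average_utilities`, `exhibits_ro`, and `soften_penalties` are all illustrative assumptions, not taken from the paper. In particular, shrinking the negative rewards is only one hypothetical way to fine-tune a reward function into a source task and may differ from what CURO actually does.

```python
# Illustrative shared-reward matrix game (an assumption, not from the paper).
# Rows are agent 1's actions, columns are agent 2's actions.
PENALTY_GAME = [
    [10, 0, -100],
    [0, 2, 0],
    [-100, 0, 10],
]


def average_utilities(payoff):
    """Utility of each row action if the partner picks columns uniformly at random."""
    return [sum(row) / len(row) for row in payoff]


def exhibits_ro(payoff):
    """True if the row action with the best average utility cannot reach the optimal reward.

    This is a simplified, uniform-partner proxy for relative overgeneralization.
    """
    utils = average_utilities(payoff)
    greedy_row = utils.index(max(utils))
    best_reward = max(max(row) for row in payoff)
    return max(payoff[greedy_row]) < best_reward


def soften_penalties(payoff, scale):
    """Hypothetical source-task generator: multiply negative rewards by `scale`."""
    return [[v * scale if v < 0 else v for v in row] for row in payoff]


if __name__ == "__main__":
    print("target task utilities:", average_utilities(PENALTY_GAME))
    print("target task shows RO:", exhibits_ro(PENALTY_GAME))
    source_task = soften_penalties(PENALTY_GAME, scale=0.05)
    print("source task shows RO:", exhibits_ro(source_task))
```

In the target task, the safe middle action has the highest average utility even though it can never reach the reward of 10, so an independent learner is drawn toward it. With the penalties scaled down to -5 in the hypothetical source task, the actions that lead to the optimal joint action become the most attractive ones, which suggests why training on a reward-modified task first might help.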
