Mitigating Relative Over-Generalization in Multi-Agent Reinforcement Learning

Ting Zhu; Yue Jin; Jeremie Houssineau; Giovanni Montana

arXiv:2411.11099·cs.LG·November 19, 2024

Open Access

TL;DR

This paper introduces MaxMax Q-Learning (MMQ), a method that mitigates relative over-generalization in decentralized multi-agent reinforcement learning, improving coordination and sample efficiency in cooperative tasks.

Contribution

The paper proposes MMQ, a sampling-based approach that evaluates candidate next states and learns from those with maximal Q-values. This yields a closer approximation of the optimal joint policy and addresses relative over-generalization in decentralized multi-agent RL.
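Relative over-generalization is easiest to see in a small cooperative matrix game. The sketch below is an illustration only: the payoff values, the uniformly random partner, and the function names are assumptions chosen for this example and are not taken from the paper. It compares an agent that averages payoffs over its partner's actions with one that optimistically takes the maximum.

```python
import numpy as np

# Illustrative shared payoff matrix (an assumption, not from the paper).
# Rows are agent 1's actions, columns are agent 2's actions.
PAYOFF = np.array([
    [11.0, -30.0, 0.0],
    [-30.0, 7.0, 6.0],
    [0.0, 0.0, 5.0],
])


def average_values(payoff):
    """Value of each row action when the partner acts uniformly at random."""
    return payoff.mean(axis=1)


def optimistic_values(payoff):
    """Value of each row action assuming the partner picks its best reply."""
    return payoff.max(axis=1)


avg_choice = int(np.argmax(average_values(PAYOFF)))
opt_choice = int(np.argmax(optimistic_values(PAYOFF)))
best_joint = np.unravel_index(int(PAYOFF.argmax()), PAYOFF.shape)

print("averaging agent picks action", avg_choice)
print("optimistic agent picks action", opt_choice)
print("best joint action", tuple(int(i) for i in best_joint))
```

Under these assumed payoffs, the averaging agent settles on the safe action because miscoordination penalties drag down the value of the action that belongs to the best joint outcome, which is the behavior that relative over-generalization describes.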

Findings

01. MMQ frequently outperforms existing baselines across environments susceptible to relative over-generalization.

02. MMQ shows improved convergence and sample efficiency.

03. Theoretical analysis supports MMQ's effectiveness.

Abstract

In decentralized multi-agent reinforcement learning, agents learning in isolation can lead to relative over-generalization (RO), where optimal joint actions are undervalued in favor of suboptimal ones. This hinders effective coordination in cooperative tasks, as agents tend to choose actions that are individually rational but collectively suboptimal. To address this issue, we introduce MaxMax Q-Learning (MMQ), which employs an iterative process of sampling and evaluating potential next states, selecting those with maximal Q-values for learning. This approach refines approximations of ideal state transitions, aligning more closely with the optimal joint policy of collaborating agents. We provide theoretical analysis supporting MMQ's potential and present empirical evaluations across various environments susceptible to RO. Our results demonstrate that MMQ frequently outperforms existing…
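The idea of sampling candidate next states and learning from the one with the maximal Q-value can be sketched in tabular form. This is a minimal sketch under stated assumptions, not the authors' implementation: the tabular Q-table, the function names `maxmax_target` and `q_update`, the hyperparameters, and the premise that a set of candidate next states is already available are all hypothetical. How the candidates are sampled is not specified here.

```python
import numpy as np


def maxmax_target(q_table, reward, candidate_next_states, gamma=0.99):
    """Bootstrapped target using the best candidate next state.

    Inner max: best action value in each candidate next state.
    Outer max: best candidate among the sampled next states.
    """
    best_per_state = q_table[candidate_next_states].max(axis=1)
    return reward + gamma * best_per_state.max()


def q_update(q_table, state, action, reward, candidate_next_states,
             alpha=0.1, gamma=0.99):
    """One in-place tabular update toward the max-max target."""
    target = maxmax_target(q_table, reward, candidate_next_states, gamma)
    q_table[state, action] += alpha * (target - q_table[state, action])
    return q_table


# Toy usage: 4 states, 2 actions, two hypothetical candidate next states.
q = np.zeros((4, 2))
q[2] = [1.0, 3.0]
q[3] = [0.5, 0.2]
target = maxmax_target(q, reward=1.0, candidate_next_states=[2, 3], gamma=0.5)
q_update(q, state=0, action=1, reward=1.0,
         candidate_next_states=[2, 3], alpha=0.5, gamma=0.5)
print(target, q[0, 1])
```

Taking the maximum over candidates, rather than using only the next state actually observed, biases the target toward transitions that would occur if the other agents acted well, which is the intuition the abstract describes.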

Taxonomy

Topics: Reinforcement Learning in Robotics · Neural Networks and Applications · Evolutionary Algorithms and Applications

Methods: Q-Learning