ReMIX: Regret Minimization for Monotonic Value Function Factorization in Multiagent Reinforcement Learning

Yongsheng Mei; Hanhan Zhou; Tian Lan

arXiv:2302.05593 · cs.LG · February 14, 2023 · 5 cites

Open Access

TL;DR

ReMIX introduces a regret minimization approach to optimize monotonic value function factorization in multiagent reinforcement learning, effectively handling non-monotonic environments and improving cooperative decision-making.

Contribution

The paper proposes a novel regret minimization framework for optimal projection in monotonic value function factorization, addressing representational limitations.

Findings

01. Outperforms existing methods on Predator-Prey and StarCraft environments.

02. Effectively handles non-monotonic value functions in multiagent settings.

03. Reduces the gap between optimal and restricted monotonic functions.
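
The gap between an unrestricted value function and its monotonic approximation can be illustrated with a toy matrix game. The sketch below is not the paper's algorithm: the payoff matrix, the additive factorization (a simple special case of a monotonic mixer), and the hand-picked weights are all assumptions made for illustration. It fits `q1[i] + q2[j]` to the payoffs by weighted least squares and compares the greedy joint action under uniform weights with the one obtained when the optimal joint action is weighted more heavily.

```python
# Hypothetical 3x3 cooperative payoff (rows: agent 1's action, columns: agent 2's).
# It is non-monotonic: the best joint action (0, 0) is surrounded by heavy penalties.
PAYOFF = [
    [8.0, -12.0, -12.0],
    [-12.0, 6.0, 0.0],
    [-12.0, 0.0, 0.0],
]


def fit_additive(payoff, weights, sweeps=500):
    """Weighted least-squares fit of payoff[i][j] ~ q1[i] + q2[j].

    Alternates exact minimization over q1 and q2; each update is the
    weighted mean of the residuals in that row or column.
    """
    n, m = len(payoff), len(payoff[0])
    q1, q2 = [0.0] * n, [0.0] * m
    for _ in range(sweeps):
        for i in range(n):
            num = sum(weights[i][j] * (payoff[i][j] - q2[j]) for j in range(m))
            q1[i] = num / sum(weights[i])
        for j in range(m):
            num = sum(weights[i][j] * (payoff[i][j] - q1[i]) for i in range(n))
            q2[j] = num / sum(weights[i][j] for i in range(n))
    return q1, q2


def greedy(q1, q2):
    """Decentralized greedy selection: each agent maximizes its own utility."""
    return q1.index(max(q1)), q2.index(max(q2))


uniform = [[1.0] * 3 for _ in range(3)]
# Assumed weighting (not from the paper): full weight on the optimal joint
# action, 0.1 on every other joint action.
focused = [[1.0 if (i, j) == (0, 0) else 0.1 for j in range(3)] for i in range(3)]

uniform_action = greedy(*fit_additive(PAYOFF, uniform))
focused_action = greedy(*fit_additive(PAYOFF, focused))

for name, act in (("uniform", uniform_action), ("focused", focused_action)):
    print(name, "weights -> action", act, "payoff", PAYOFF[act[0]][act[1]])
```

With uniform weights the fitted utilities steer both agents away from the penalized row and column, so the greedy joint action is suboptimal. Shifting weight toward the optimal joint action changes which errors the projection tolerates. In this sketch the weights are fixed by hand, whereas the paper treats their choice as the optimization problem.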

Abstract

Value function factorization methods have become a dominant approach for cooperative multiagent reinforcement learning under a centralized training and decentralized execution paradigm. By factorizing the optimal joint action-value function using a monotonic mixing function of agents' utilities, these algorithms ensure the consistency between joint and local action selections for decentralized decision-making. Nevertheless, the use of monotonic mixing functions also induces representational limitations. Finding the optimal projection of an unrestricted mixing function onto monotonic function classes is still an open problem. To this end, we propose ReMIX, formulating this optimal projection problem for value function factorization as a regret minimization over the projection weights of different state-action values. Such an optimization problem can be relaxed and solved using the…
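
The consistency property mentioned in the abstract can be checked on a small example. The following is a minimal sketch under stated assumptions: the utilities, mixing weights, bias, and the leaky-ReLU nonlinearity are hypothetical and stand in for a learned mixing network. Because the mixer is increasing in every agent's utility, the joint action that maximizes the mixed value is the same one obtained when each agent greedily maximizes its own utility.

```python
import itertools

# Hypothetical per-agent utilities Q_i(a_i): two agents, three actions each.
UTILITIES = [
    [0.2, 1.5, -0.3],
    [2.0, 0.1, 0.7],
]

# Positive mixing weights keep Q_tot increasing in every agent's utility.
MIX_WEIGHTS = [0.5, 2.0]
MIX_BIAS = -1.0


def q_tot(joint_action):
    """Toy monotonic mixer: positive weighted sum followed by a leaky ReLU."""
    z = MIX_BIAS + sum(
        w * UTILITIES[agent][action]
        for agent, (w, action) in enumerate(zip(MIX_WEIGHTS, joint_action))
    )
    return z if z > 0 else 0.01 * z  # strictly increasing in z


# Centralized selection: search over all joint actions.
joint_best = max(itertools.product(range(3), repeat=2), key=q_tot)

# Decentralized selection: each agent looks only at its own utilities.
local_best = tuple(q.index(max(q)) for q in UTILITIES)

print("joint argmax:", joint_best, "local argmaxes:", local_best)
```

The centralized search costs one evaluation per joint action, which grows exponentially with the number of agents, while the decentralized selection needs only one pass over each agent's own actions. Monotonicity is what lets the second replace the first.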

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)