ReMIX: Regret Minimization for Monotonic Value Function Factorization in Multiagent Reinforcement Learning
Yongsheng Mei, Hanhan Zhou, Tian Lan

TL;DR
ReMIX introduces a regret minimization approach to optimize monotonic value function factorization in multiagent reinforcement learning, effectively handling non-monotonic environments and improving cooperative decision-making.
Contribution
The paper proposes a regret minimization framework for finding the optimal projection of an unrestricted mixing function onto monotonic function classes, addressing the representational limitations of monotonic value function factorization.
Findings
ReMIX outperforms existing methods on Predator-Prey and StarCraft environments.
It effectively handles non-monotonic value functions in multiagent settings.
It reduces the gap between the optimal (unrestricted) joint value function and its restricted monotonic factorization.
Abstract
Value function factorization methods have become a dominant approach for cooperative multiagent reinforcement learning under a centralized training and decentralized execution paradigm. By factorizing the optimal joint action-value function using a monotonic mixing function of agents' utilities, these algorithms ensure the consistency between joint and local action selections for decentralized decision-making. Nevertheless, the use of monotonic mixing functions also induces representational limitations. Finding the optimal projection of an unrestricted mixing function onto monotonic function classes is still an open problem. To this end, we propose ReMIX, formulating this optimal projection problem for value function factorization as a regret minimization over the projection weights of different state-action values. Such an optimization problem can be relaxed and solved using the…
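To make the representational limitation described in the abstract concrete, here is a minimal sketch on a toy one-step, two-agent matrix game. It is not the ReMIX algorithm. The payoff matrix, the additive (sum-of-utilities) mixer used as a simple stand-in for a monotonic mixing function, the hand-picked projection weights, and the helper names `fit_additive` and `greedy_joint_action` are all assumptions made for illustration. ReMIX obtains its projection weights through regret minimization, which this sketch does not implement; it only shows that the choice of weights in the projection can change which joint action the decentralized greedy policy selects.

```python
import numpy as np


def fit_additive(payoff, weights):
    """Weighted least-squares projection of payoff[i, j] onto q1[i] + q2[j].

    A sum of per-agent utilities is monotonic in each utility, so it serves
    here as a simple stand-in for a monotonic mixing function (assumption).
    """
    n1, n2 = payoff.shape
    rows = []
    for i in range(n1):
        for j in range(n2):
            x = np.zeros(n1 + n2)
            x[i] = 1.0        # selects q1[i]
            x[n1 + j] = 1.0   # selects q2[j]
            rows.append(x)
    X = np.array(rows)
    # Scaling rows by sqrt(weight) turns ordinary least squares into
    # weighted least squares over the joint actions.
    sw = np.sqrt(weights.reshape(-1))
    theta, *_ = np.linalg.lstsq(X * sw[:, None], payoff.reshape(-1) * sw, rcond=None)
    return theta[:n1], theta[n1:]


def greedy_joint_action(q1, q2):
    """Each agent maximizes its own utility independently (decentralized)."""
    return int(np.argmax(q1)), int(np.argmax(q2))


# Hypothetical non-monotonic payoff: the best joint action (0, 0) is
# surrounded by heavily penalized miscoordination.
payoff = np.array([[8.0, -12.0, -12.0],
                   [-12.0, 0.0, 0.0],
                   [-12.0, 0.0, 0.0]])

# Uniform projection weights: every joint action counts equally.
uniform = np.ones_like(payoff)
a_uniform = greedy_joint_action(*fit_additive(payoff, uniform))

# Hand-picked weights (assumption): emphasize the optimal joint action.
focused = np.full_like(payoff, 0.1)
focused[0, 0] = 1.0
a_focused = greedy_joint_action(*fit_additive(payoff, focused))

print("uniform weights :", a_uniform, payoff[a_uniform])
print("focused weights :", a_focused, payoff[a_focused])
```

In this toy game the uniformly weighted projection leads the agents to a joint action with payoff 0, while the reweighted projection recovers the optimal joint action (0, 0) with payoff 8. The weights here are fixed by hand only to expose the effect; how to choose them in a principled way is the problem the abstract frames as regret minimization.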
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
