On Stateful Value Factorization in Multi-Agent Reinforcement Learning

Enrico Marchesini; Andrea Baisero; Rupali Bhati; Christopher Amato

arXiv:2408.15381·cs.AI·September 11, 2024

Open Access

TL;DR

This paper analyzes the theoretical foundations of state versus history in value factorization for multi-agent reinforcement learning, introduces DuelMIX for improved utility estimation, and demonstrates its effectiveness in complex tasks.

Contribution

It provides a formal analysis reconciling theory and practice in state-based value factorization and proposes DuelMIX, a novel algorithm with enhanced expressiveness.

Findings

01. DuelMIX outperforms existing methods on StarCraft II tasks.
02. Using state information improves both theoretical and practical performance.
03. The approach achieves full expressiveness in utility estimation.

Abstract

Value factorization is a popular paradigm for designing scalable multi-agent reinforcement learning algorithms. However, current factorization methods make choices without full justification that may limit their performance. For example, the theory in prior work uses stateless (i.e., history) functions, while the practical implementations use state information -- making the motivating theory a mismatch for the implementation. Also, methods have built off of previous approaches, inheriting their architectures without exploring other, potentially better ones. To address these concerns, we formally analyze the theory of using the state instead of the history in current methods -- reconnecting theory and practice. We then introduce DuelMIX, a factorization algorithm that learns distinct per-agent utility estimators to improve performance and achieve full expressiveness. Experiments on…
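To make the term "value factorization" concrete, here is a minimal sketch of the simplest additive form, in which the joint value is the sum of per-agent utilities. It is not DuelMIX and is not taken from the paper. The utility vectors, the function names, and the problem sizes are all hypothetical. In a real method each utility would be a learned function of an agent's history (and, per the abstract, possibly the state); here they are fixed random vectors so the example stays self-contained.

```python
import itertools

import numpy as np


def joint_value(utilities, joint_action):
    """Additive factorization (illustrative): Q_tot(a) = sum_i Q_i(a_i)."""
    return sum(q[a] for q, a in zip(utilities, joint_action))


def greedy_decentralized(utilities):
    """Each agent picks the argmax of its own utility vector."""
    return tuple(int(np.argmax(q)) for q in utilities)


def greedy_centralized(utilities):
    """Brute-force search over all joint actions for the best joint value."""
    action_ranges = [range(len(q)) for q in utilities]
    return max(
        itertools.product(*action_ranges),
        key=lambda a: joint_value(utilities, a),
    )


# Hypothetical setup: 3 agents, 4 actions each, random utilities.
rng = np.random.default_rng(0)
utilities = [rng.normal(size=4) for _ in range(3)]

decentralized = greedy_decentralized(utilities)
centralized = greedy_centralized(utilities)

# With an additive mixer, per-agent greedy choices recover the joint greedy action.
print(decentralized == centralized)
```

The brute-force search grows exponentially with the number of agents, while the decentralized choice is linear, which is the scalability argument behind factorization. Richer mixers trade this simple additive structure for more expressiveness.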

Taxonomy

Topics: Scheduling and Optimization Algorithms · Reinforcement Learning in Robotics