Shapley Machine: A Game-Theoretic Framework for N-Agent Ad Hoc Teamwork
Jianhong Wang, Yang Li, Samuel Kaski, Jonathan Lawry

TL;DR
This paper introduces Shapley Machine, a novel game-theoretic RL framework for open multi-agent systems that assigns credit to agents in ad hoc teams using Shapley values, addressing limitations of heuristic methods.
Contribution
It models NAHT within cooperative game theory, extends the value space for dynamic scenarios, and proposes a TD($\lambda$)-like algorithm to estimate Shapley values in RL.
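For readers unfamiliar with Shapley values, the following is a minimal sketch of the classical definition on a static cooperative game: each player's credit is its marginal contribution averaged over all join orders. It is not the paper's algorithm, which estimates these quantities inside an RL loop. The three-player game, the names `shapley_values` and `toy_value`, and the payoff table are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: average each player's marginal
    contribution over every ordering in which the team can form."""
    totals = {p: Fraction(0) for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = frozenset()
        for p in order:
            with_p = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += Fraction(value(with_p)) - Fraction(value(coalition))
            coalition = with_p
    return {p: total / len(orderings) for p, total in totals.items()}


# Hypothetical payoff table for a three-agent team (illustrative numbers only).
TOY_PAYOFFS = {
    frozenset(): 0,
    frozenset("a"): 10,
    frozenset("b"): 20,
    frozenset("c"): 30,
    frozenset("ab"): 40,
    frozenset("ac"): 50,
    frozenset("bc"): 60,
    frozenset("abc"): 90,
}


def toy_value(coalition):
    """Look up the value of a coalition in the toy table."""
    return TOY_PAYOFFS[frozenset(coalition)]


phi = shapley_values(["a", "b", "c"], toy_value)
grand_coalition_value = toy_value("abc")

# Efficiency: the credits sum to the value of the full team.
print(sum(phi.values()) == grand_coalition_value)
```

Exact enumeration costs n! orderings, so it is practical only for very small teams; this is one reason a learned or sampled estimate is needed in larger settings.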
Findings
Shapley Machine effectively allocates credits in ad hoc teams.
Theoretical framework aligns with experimental results.
First integration of cooperative game theory with RL for multi-agent credit assignment.
Abstract
Open multi-agent systems are increasingly important in modeling real-world applications, such as smart grids and swarm robotics. In this paper, we aim to investigate a recently proposed problem for open multi-agent systems, referred to as n-agent ad hoc teamwork (NAHT), where only a number of agents are controlled. Existing methods tend to be based on heuristic design and consequently lack theoretical rigor and leave credit assignment among agents ambiguous. To address these limitations, we model and solve NAHT through the lens of cooperative game theory. More specifically, we first model an open multi-agent system, characterized by its value, as an instance situated in a space of cooperative games, generated by a set of basis games. We then extend this space, along with the state space, to accommodate dynamic scenarios, thereby characterizing NAHT. Exploiting the justifiable assumption…
Taxonomy
Topics: Multi-Agent Systems and Negotiation · Simulation Techniques and Applications · Distributed and Parallel Computing Systems
Methods: High-Order Consensuses · N-step Returns · Sparse Evolutionary Training
