Optimal Coordinated Planning Amongst Self-Interested Agents with Private State
Ruggiero Cavallo, David C. Parkes, Satinder Singh

TL;DR
This paper develops an incentive-compatible mechanism for optimal coordinated planning among self-interested agents with private local state in dynamic environments. The mechanism implements the socially optimal joint policy, and in a special case the computation of that policy can be distributed among the agents.
Contribution
It introduces a mechanism that elicits private state information and achieves the optimal joint policy in a Markov perfect equilibrium of the induced stochastic game, with an efficient factored algorithm for a special case.
Findings
Achieves social optimality in multi-agent MDPs with private states
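Provides a distributed, factored algorithm based on Gittins allocation indices for the special case in which local problems are Markov chains and agents compete to take a single action in each period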
Extends to multi-armed bandit problems with multi-agent coordination
Abstract
Consider a multi-agent system in a dynamic and uncertain environment. Each agent's local decision problem is modeled as a Markov decision process (MDP) and agents must coordinate on a joint action in each period, which provides a reward to each agent and causes local state transitions. A social planner knows the model of every agent's MDP and wants to implement the optimal joint policy, but agents are self-interested and have private local state. We provide an incentive-compatible mechanism for eliciting state information that achieves the optimal joint plan in a Markov perfect equilibrium of the induced stochastic game. In the special case in which local problems are Markov chains and agents compete to take a single action in each period, we leverage Gittins allocation indices to provide an efficient factored algorithm and distribute computation of the optimal policy among the agents.…
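The special case in the abstract relies on Gittins allocation indices, so a small illustration may help. The sketch below is not the paper's mechanism (it contains no incentive payments and no reporting step). It only shows one standard way to compute Gittins indices for a finite Markov reward chain, the restart-in-state formulation due to Katehakis and Veinott, and the index rule that picks the agent with the highest current index. The function names `gittins_indices` and `select_agent`, the discount factor `beta`, and the dense transition-matrix representation are assumptions made for illustration.

```python
def gittins_indices(P, r, beta, tol=1e-10, max_iter=100_000):
    """Gittins index of every state of a finite Markov reward chain.

    P    : row-stochastic transition matrix as a list of lists (assumed known)
    r    : per-state rewards
    beta : discount factor in (0, 1) (an assumption of this sketch)

    Uses the restart-in-state formulation: for a target state i, solve
        V(j) = max(r[j] + beta * P[j] . V,  r[i] + beta * P[i] . V)
    by value iteration; the index of i is (1 - beta) * V(i).
    """
    n = len(r)
    indices = []
    for i in range(n):
        V = [0.0] * n
        for _ in range(max_iter):
            # Value of continuing from each state j for one step.
            cont = [
                r[j] + beta * sum(P[j][k] * V[k] for k in range(n))
                for j in range(n)
            ]
            # Restarting means behaving as if the chain were in state i.
            restart = cont[i]
            V_new = [max(c, restart) for c in cont]
            delta = max(abs(a - b) for a, b in zip(V_new, V))
            V = V_new
            if delta < tol:
                break
        indices.append((1.0 - beta) * V[i])
    return indices


def select_agent(local_states, index_tables):
    """Index rule: activate the agent whose current local state has the
    highest Gittins index. index_tables[a][s] is agent a's index in state s."""
    return max(
        range(len(local_states)),
        key=lambda a: index_tables[a][local_states[a]],
    )
```

Because each agent's index table depends only on that agent's own chain, the tables can be computed separately per agent, which is the sense in which an index policy factors the joint problem. How the paper combines this with incentive-compatible state elicitation is not shown here.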
Taxonomy
Topics: Auction Theory and Applications · Game Theory and Applications · Advanced Bandit Algorithms Research
