Joint Policy Search for Multi-agent Collaboration with Imperfect Information

Yuandong Tian; Qucheng Gong; Tina Jiang

arXiv:2008.06495 · cs.LG · December 8, 2020 · 6 cites


TL;DR

This paper introduces Joint Policy Search (JPS), a method for improving multi-agent collaboration in imperfect-information games. JPS is guaranteed not to worsen performance on tabular games and achieves state-of-the-art results in Contract Bridge.

Contribution

The paper proposes JPS, an algorithm that decomposes changes in game value into policy changes localized at each information set, so that a joint policy can be improved without full re-evaluation.

Findings

01. On tabular games, JPS is guaranteed never to worsen performance.

02. JPS outperforms existing algorithms for collaborative settings, such as BAD.

03. JPS achieves state-of-the-art results in Contract Bridge, surpassing championship software.

Abstract

Learning good joint policies for multi-agent collaboration with imperfect information remains a fundamental challenge. For two-player zero-sum games, coordinate-ascent approaches (optimizing one agent's policy at a time, e.g., self-play) work with guarantees, but in multi-agent cooperative settings they often converge to a sub-optimal Nash equilibrium. On the other hand, directly modeling joint policy changes in imperfect-information games is nontrivial due to the complicated interplay of policies (e.g., upstream updates affect downstream state reachability). In this paper, we show that global changes in game value can be decomposed into policy changes localized at each information set, with a novel term named policy-change density. Based on this, we propose Joint Policy Search (JPS), which iteratively improves joint policies of collaborative agents in imperfect-information games, without…
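The failure mode of coordinate ascent described in the abstract can be seen even in a tiny one-shot game. The sketch below is not JPS and does not model imperfect information; it is a minimal illustration, assuming a hypothetical 2x2 common-payoff matrix (`PAYOFF`) and hypothetical helper names (`coordinate_ascent`, `joint_search`). It shows that updating one agent at a time can stall at a sub-optimal equilibrium that a simultaneous change of both agents' actions would escape.

```python
import itertools

# Hypothetical common-payoff game: both agents receive PAYOFF[a1][a2].
# (0, 0) and (1, 1) are both equilibria, but (1, 1) pays twice as much.
PAYOFF = [
    [5, 0],
    [0, 10],
]


def coordinate_ascent(payoff, a1, a2):
    """Improve one agent's action at a time until neither can gain alone."""
    while True:
        improved = False
        # Agent 1 best-responds while agent 2's action is held fixed.
        best1 = max(range(len(payoff)), key=lambda a: payoff[a][a2])
        if payoff[best1][a2] > payoff[a1][a2]:
            a1, improved = best1, True
        # Agent 2 best-responds while agent 1's action is held fixed.
        best2 = max(range(len(payoff[0])), key=lambda b: payoff[a1][b])
        if payoff[a1][best2] > payoff[a1][a2]:
            a2, improved = best2, True
        if not improved:
            return a1, a2


def joint_search(payoff):
    """Brute-force search over joint actions (feasible only for tiny games)."""
    joint_actions = itertools.product(range(len(payoff)), range(len(payoff[0])))
    return max(joint_actions, key=lambda ab: payoff[ab[0]][ab[1]])


stuck = coordinate_ascent(PAYOFF, 0, 0)
best = joint_search(PAYOFF)
print("coordinate ascent:", stuck, "value", PAYOFF[stuck[0]][stuck[1]])
print("joint search:     ", best, "value", PAYOFF[best[0]][best[1]])
```

Starting from (0, 0), any single-agent deviation lowers the payoff from 5 to 0, so coordinate ascent stays put; only a joint change reaches the payoff of 10. Brute-force enumeration does not scale to games with many information sets, which is the gap that a localized decomposition of value changes is meant to address.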


Code & Models

Repositories

facebookresearch/jps (PyTorch, official)

Videos

Joint Policy Search for Multi-agent Collaboration with Imperfect Information · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Artificial Intelligence in Games · Multi-Agent Systems and Negotiation