Fictitious Cross-Play: Learning Global Nash Equilibrium in Mixed Cooperative-Competitive Games

Zelai Xu; Yancheng Liang; Chao Yu; Yu Wang; Yi Wu

arXiv:2310.03354 · cs.AI · October 6, 2023

TL;DR

This paper introduces Fictitious Cross-Play (FXP), a novel algorithm that effectively learns global Nash equilibria in mixed cooperative-competitive games by combining self-play and best response strategies, outperforming existing methods.

Contribution

FXP combines self-play and best response training to efficiently converge to global Nash equilibria in complex mixed games, overcoming scalability issues of prior approaches.

Findings

01 FXP converges to global Nash equilibria in matrix games.

02 FXP achieves higher Elo ratings and lower exploitability in a gridworld domain.

03 FXP defeats state-of-the-art models in a challenging football game with a win rate above 94%.

Abstract

Self-play (SP) is a popular multi-agent reinforcement learning (MARL) framework for solving competitive games, in which each agent optimizes its policy by treating the other agents as part of the environment. Despite their empirical success, the theoretical guarantees of SP-based methods are limited to two-player zero-sum games. For mixed cooperative-competitive games, where agents on the same team need to cooperate with each other, we show a simple counter-example in which SP-based methods fail, with high probability, to converge to a global Nash equilibrium (NE). Alternatively, Policy-Space Response Oracles (PSRO) is an iterative framework for learning NE, in which best responses with respect to previous policies are learned in each iteration. PSRO can be directly extended to mixed cooperative-competitive settings by jointly learning team best responses, with all convergence properties unchanged.…
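To make the idea of "best responses with respect to previous policies" concrete, here is a minimal sketch on a two-player zero-sum matrix game. It is not FXP and not the authors' implementation: it uses classical fictitious play, where each player best-responds to the uniform average of the opponent's past actions, and it measures exploitability as the sum of both players' best-response values (one common convention, assumed here). The function names `exploitability` and `fictitious_play` and the rock-paper-scissors payoff matrix are illustrative choices, and the team setting studied in the paper is not modeled.

```python
import numpy as np


def exploitability(payoff, row_strategy, col_strategy):
    """Sum of both players' best-response values in a zero-sum matrix game.

    payoff[i, j] is the row player's payoff; the column player receives
    -payoff[i, j]. The result is non-negative and is 0 at a Nash equilibrium.
    (Assumed convention; some papers divide this quantity by 2.)
    """
    row_br_value = np.max(payoff @ col_strategy)
    col_br_value = np.max(-(row_strategy @ payoff))
    return row_br_value + col_br_value


def fictitious_play(payoff, iterations=2000):
    """Classical fictitious play: best-respond to the opponent's empirical average."""
    n_rows, n_cols = payoff.shape
    row_counts = np.zeros(n_rows)
    col_counts = np.zeros(n_cols)
    # Arbitrary initial actions (an assumption of this sketch).
    row_counts[0] = 1.0
    col_counts[0] = 1.0
    for _ in range(iterations):
        row_avg = row_counts / row_counts.sum()
        col_avg = col_counts / col_counts.sum()
        # Exact best responses to the average of all previous opponent actions.
        row_br = np.argmax(payoff @ col_avg)
        col_br = np.argmax(-(row_avg @ payoff))
        row_counts[row_br] += 1
        col_counts[col_br] += 1
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()


# Rock-paper-scissors payoffs for the row player (illustrative game).
rps = np.array([[0.0, -1.0, 1.0],
                [1.0, 0.0, -1.0],
                [-1.0, 1.0, 0.0]])

row_avg, col_avg = fictitious_play(rps)
print("average row strategy:", row_avg)
print("exploitability:", exploitability(rps, row_avg, col_avg))
```

In this toy game the averaged strategies drift toward the uniform equilibrium and the exploitability shrinks as iterations grow. In the mixed cooperative-competitive setting described above, each "player" would instead be a team whose best response is learned jointly.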

Taxonomy

Topics: Sports Analytics and Performance