Conflux-PSRO: Effectively Leveraging Collective Advantages in Policy Space Response Oracles

Yucong Huang; Jiesong Lian; Mingzhi Wang; Chengdong Ma; Ying Wen

arXiv:2410.22776 · cs.GT · November 14, 2024

TL;DR

Conflux-PSRO introduces a novel approach that adaptively selects and trains policies at the state level to leverage diversity effectively, improving Nash Equilibrium approximation in complex zero-sum games.

Contribution

It proposes a state-level adaptive policy selection and training method that fully exploits population diversity, enhancing performance and reducing exploitability in PSRO algorithms.

Findings

01. Significantly improves the utility of best responses.
02. Reduces exploitability compared to existing methods.
03. Enhances performance across various environments.

Abstract

Policy Space Response Oracles (PSRO) with policy population construction has been demonstrated as an effective method for approximating Nash Equilibrium (NE) in zero-sum games. Existing studies have attempted to improve diversity in the policy space, primarily by incorporating diversity regularization into the Best Response (BR). However, these methods cause the BR to deviate from maximizing rewards, which easily results in a population that favors diversity over performance, even when diversity is not always necessary. Consequently, exploitability is difficult to reduce until policies are fully explored, especially in complex games. In this paper, we propose Conflux-PSRO, which fully exploits the diversity of the population by adaptively selecting and training policies at the state level. Specifically, Conflux-PSRO identifies useful policies from the existing population and employs a routing policy…
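
The abstract uses exploitability as its measure of distance from a Nash Equilibrium. As a minimal sketch of that quantity, and not the paper's own evaluation code, the Python snippet below computes it for a two-player zero-sum matrix game. The function name `exploitability`, the rock-paper-scissors payoff matrix, and the choice to sum (rather than average) the two players' best-response gains are assumptions made for illustration; the excerpt does not state which normalization the authors use.

```python
import numpy as np


def exploitability(payoff, x, y):
    """Sum of both players' best-response gains against the profile (x, y).

    payoff[i, j] is the row player's payoff when row plays i and column
    plays j; the column player receives the negative (zero-sum game).
    Hypothetical helper for illustration only.
    """
    payoff = np.asarray(payoff, dtype=float)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Best pure-strategy payoff the row player can get against y.
    row_br_value = np.max(payoff @ y)
    # Best pure-strategy payoff the column player can get against x.
    col_br_value = np.max(-(x @ payoff))

    # The players' payoffs under (x, y) itself cancel in a zero-sum game,
    # so the total gain from deviating is the sum of the two values.
    return row_br_value + col_br_value


# Rock-paper-scissors from the row player's point of view.
RPS = np.array([
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
])

uniform = np.ones(3) / 3
always_rock = np.array([1.0, 0.0, 0.0])

print(exploitability(RPS, uniform, uniform))
print(exploitability(RPS, always_rock, always_rock))
```

Under these assumptions the uniform profile, which is the equilibrium of rock-paper-scissors, should score zero up to floating-point error, while the pure "always rock" profile scores strictly higher because each player can switch to paper. Some papers report half of this sum instead.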

Taxonomy

Topics: E-Government and Public Services · Crime, Illicit Activities, and Governance