Fusion-PSRO: Nash Policy Fusion for Policy Space Response Oracles

Jiesong Lian; Yucong Huang; Chengdong Ma; Mingzhi Wang; Ying Wen; Long Hu; Yixue Hao

arXiv:2405.21027 · cs.GT · January 6, 2026

Open Access

TL;DR

Fusion-PSRO introduces Nash Policy Fusion to improve policy initialization in PSRO, leveraging past policies and dynamic weighting to better approximate Nash Equilibrium in zero-sum games.

Contribution

It proposes a novel Nash Policy Fusion method for PSRO, enhancing policy initialization and convergence to NE by utilizing past policies and adaptive weighting.

Findings

01 Achieves lower exploitability on benchmark games

02 Improves policy population quality over iterations

03 Mitigates previous initialization shortcomings
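
To make the exploitability metric in the first finding concrete, here is a minimal sketch for a two-player zero-sum matrix game. It uses the standard sum-of-best-response-gains definition (some papers halve this value); the function name `exploitability` and the rock-paper-scissors payoff matrix are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def exploitability(
    payoff: Sequence[Sequence[float]],
    row_strategy: Sequence[float],
    col_strategy: Sequence[float],
) -> float:
    """Sum of both players' best-response gains in a zero-sum matrix game.

    payoff[i][j] is the row player's payoff; the column player receives
    the negative. The result is 0 exactly at a Nash Equilibrium.
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Best the row player can achieve against the fixed column strategy.
    best_row_value = max(
        sum(payoff[i][j] * col_strategy[j] for j in range(n_cols))
        for i in range(n_rows)
    )
    # Lowest row payoff the column player can force against the fixed row strategy.
    best_col_value = min(
        sum(row_strategy[i] * payoff[i][j] for i in range(n_rows))
        for j in range(n_cols)
    )
    return best_row_value - best_col_value


# Hypothetical example: rock-paper-scissors, a classic non-transitive game.
RPS = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]
uniform = [1 / 3, 1 / 3, 1 / 3]
always_rock = [1.0, 0.0, 0.0]
```

In this sketch the uniform profile has zero exploitability, while a profile in which both players always play rock is exploitable by each opponent switching to paper.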

Abstract

A useful approach to solving zero-sum games that involve non-transitivity is to maintain a policy population that approximates the Nash Equilibrium (NE). Previous studies have shown that the Policy Space Response Oracles (PSRO) algorithm is an effective framework for solving such games. However, current methods either initialize each new policy from scratch or inherit a single historical policy when training the Best Response (BR), missing the opportunity to leverage past policies to generate a better BR. In this paper, we propose Fusion-PSRO, which employs Nash Policy Fusion to initialize a new policy for BR training. Nash Policy Fusion serves as an implicit guiding policy that starts exploration on the current Meta-NE, thus providing a closer approximation to BR. Moreover, it captures a weighted moving average of past policies, dynamically adjusting these weights based on the Meta-NE in each…
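
The weighted averaging of past policies described in the abstract can be illustrated with a small sketch. It assumes that every policy in the population is a flat parameter vector of the same length and that fusion is a Meta-NE-weighted average taken in parameter space; the abstract does not specify these details, and the name `fuse_policies` is hypothetical.

```python
from typing import List, Sequence


def fuse_policies(
    population: Sequence[Sequence[float]],
    meta_ne: Sequence[float],
) -> List[float]:
    """Average policy parameters using Meta-NE probabilities as weights.

    Assumption: each policy is a flat parameter vector of equal length.
    This is an illustrative sketch, not the paper's implementation.
    """
    if len(population) != len(meta_ne):
        raise ValueError("need exactly one Meta-NE weight per policy")
    total = sum(meta_ne)
    if total <= 0:
        raise ValueError("Meta-NE weights must have a positive sum")
    dim = len(population[0])
    if any(len(params) != dim for params in population):
        raise ValueError("all policies must share one parameter shape")

    fused = [0.0] * dim
    for weight, params in zip(meta_ne, population):
        share = weight / total  # normalize in case the weights do not sum to 1
        for k, value in enumerate(params):
            fused[k] += share * value
    return fused


# Hypothetical population of three two-parameter policies.
initial_params = fuse_policies(
    [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]],
    [0.5, 0.25, 0.25],
)
```

Under these assumptions, the fused vector would serve as the starting point for training the next best response, in place of a random initialization or a copy of a single earlier policy.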

Taxonomy

Topics: Access Control and Trust

Methods: Balanced Selection