Simulation-Free PSRO: Removing Game Simulation from Policy Space Response Oracles
Yingzhuo Liu, Shuodi Liu, Weijun Luo, Liuyu Xiang, Zhaofeng He

TL;DR
This paper introduces Simulation-Free PSRO, a novel approach that eliminates game simulation from PSRO, significantly reducing computational costs while maintaining effectiveness in approximating Nash Equilibrium.
Contribution
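The paper proposes Dynamic Window-based Simulation-Free PSRO, which replaces the full strategy set maintained in PSRO with a size-limited strategy window. Bounding the window simplifies opponent strategy selection and improves the robustness of the best response, and Nash Clustering is used to manage the strategies in the window.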
Findings
Reduces exploitability compared to existing methods
Significantly decreases computational costs
Maintains strong approximation of Nash Equilibrium
Abstract
Policy Space Response Oracles (PSRO) combines game-theoretic equilibrium computation with learning and is effective at approximating Nash Equilibrium in zero-sum games. However, the computational cost of PSRO has become a significant limitation to its practical application. Our analysis shows that game simulation is the primary bottleneck in PSRO's runtime. To address this issue, we introduce the concept of Simulation-Free PSRO and summarize existing methods that instantiate it. Additionally, we propose a novel Dynamic Window-based Simulation-Free PSRO, which introduces a strategy window to replace the original strategy set maintained in PSRO. The number of strategies in the strategy window is limited, which simplifies opponent strategy selection and improves the robustness of the best response. Moreover, we use Nash Clustering to select the strategy to be…
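To make the simulation bottleneck concrete, the following is a rough back-of-the-envelope sketch, not the paper's analysis. It assumes a symmetric two-player zero-sum game with one shared population, where each newly added policy is evaluated against every existing policy to fill in the meta-game payoff table. The function names and the `episodes_per_matchup` parameter are hypothetical and chosen only for illustration.

```python
def new_matchups(population_size: int) -> int:
    """Matchups to simulate when one policy joins a population of this size.

    Assumption: symmetric zero-sum game, so the new policy only needs to be
    played against each existing policy (self-play payoff is taken as zero).
    """
    return population_size


def cumulative_episodes(iterations: int, episodes_per_matchup: int) -> int:
    """Total simulated episodes after `iterations` policies have been added.

    The k-th policy (1-indexed) joins a population of k - 1 policies, so the
    total number of matchups is 0 + 1 + ... + (iterations - 1).
    """
    total_matchups = sum(new_matchups(k - 1) for k in range(1, iterations + 1))
    return total_matchups * episodes_per_matchup


def window_opponents(population_size: int, window_size: int) -> int:
    """Opponent strategies a best response has to consider under a size cap."""
    return min(population_size, window_size)


if __name__ == "__main__":
    # Simulation cost grows quadratically with the number of PSRO iterations.
    for t in (10, 50, 100):
        print(t, cumulative_episodes(t, episodes_per_matchup=100))
    # A bounded window caps the opponent set regardless of population size.
    print(window_opponents(population_size=100, window_size=8))
```

Under these assumptions the simulation count grows quadratically in the number of iterations, which illustrates why removing simulation changes the cost profile, while a size-limited window keeps the opponent set bounded.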
Taxonomy
Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control
