On the Power of Perturbation under Sampling in Solving Extensive-Form Games

Wataru Masaka; Mitsuki Sakamoto; Kenshi Abe; Kaito Ariu; Tuomas Sandholm; Atsushi Iwasaki

arXiv:2501.16600 · cs.GT · August 5, 2025

TL;DR

This paper studies how payoff perturbation, in the form of Perturbed FTRL algorithms, can stabilize learning and improve convergence when solving imperfect-information extensive-form games from noisy, sampled payoffs. It introduces two variants, PFTRL-KL and PFTRL-RKL, and analyzes their performance.

Contribution

The paper introduces a unified framework for Perturbed FTRL algorithms, covering the PFTRL-KL and PFTRL-RKL variants, and demonstrates their effectiveness in game solving, including the variance-reduction benefit of the PFTRL-RKL estimator.

Findings

01. PFTRL-RKL outperforms PFTRL-KL in asymmetric Leduc poker.

02. Variance reduction in RKL improves last-iterate convergence.

03. Perturbation stabilizes learning under sampling noise.

Abstract

We investigate how perturbation does and does not improve the Follow-the-Regularized-Leader (FTRL) algorithm in solving imperfect-information extensive-form games under sampling, where payoffs are estimated from sampled trajectories. While optimistic algorithms are effective under full feedback, they often become unstable in the presence of sampling noise. Payoff perturbation offers a promising alternative for stabilizing learning and achieving last-iterate convergence. We present a unified framework for Perturbed FTRL algorithms and study two variants: PFTRL-KL (standard KL divergence) and PFTRL-RKL (Reverse KL divergence), the latter featuring an estimator with both unbiasedness and conditional zero variance. While PFTRL-KL generally achieves equivalent or better performance across benchmark games, PFTRL-RKL consistently outperforms it in Leduc poker, whose structure…
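The idea of payoff perturbation under sampling can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a two-player zero-sum matrix game instead of an extensive-form game, entropy-regularized FTRL, a fixed uniform anchor strategy, and a perturbation equal to the gradient of the KL divergence from that anchor. The function name `perturbed_ftrl` and the values of `mu` and `eta` are hypothetical choices made for illustration.

```python
import numpy as np


def softmax(z):
    # Numerically stable softmax.
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()


def perturbed_ftrl(A, mu=0.5, eta=0.01, steps=2000, seed=0):
    """Sketch of FTRL with a KL payoff perturbation on a zero-sum matrix game.

    The row player maximizes x^T A y and the column player minimizes it.
    All modelling choices here are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    n, m = A.shape
    anchor_x = np.full(n, 1.0 / n)  # assumed fixed uniform anchor
    anchor_y = np.full(m, 1.0 / m)
    cum_x = np.zeros(n)  # cumulative perturbed payoff estimates
    cum_y = np.zeros(m)
    x, y = anchor_x.copy(), anchor_y.copy()
    for _ in range(steps):
        # Sampling: each player observes payoffs against one sampled
        # opponent action, an unbiased but noisy estimate of A y and -A^T x.
        i = rng.choice(n, p=x)
        j = rng.choice(m, p=y)
        g_x = A[:, j]
        g_y = -A[i, :]
        # Perturbation: subtract mu times the gradient of KL(pi || anchor).
        # The constant "+1" term of that gradient is dropped because the
        # softmax below is invariant to adding a constant to every entry.
        g_x = g_x - mu * np.log(x / anchor_x)
        g_y = g_y - mu * np.log(y / anchor_y)
        cum_x += g_x
        cum_y += g_y
        # Entropy-regularized FTRL update (closed form is a softmax).
        x = softmax(eta * cum_x)
        y = softmax(eta * cum_y)
    return x, y


if __name__ == "__main__":
    matching_pennies = np.array([[1.0, -1.0], [-1.0, 1.0]])
    x, y = perturbed_ftrl(matching_pennies)
    print("row strategy:", x)
    print("column strategy:", y)
```

In this sketch the perturbation pulls each strategy back toward the anchor whenever noisy payoff samples push it away, which is the stabilizing effect the abstract attributes to perturbation. Setting `mu=0` recovers plain FTRL on sampled payoffs for comparison.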

Taxonomy

Topics: Artificial Intelligence in Games