Efficient Last-iterate Convergence Algorithms in Solving Games

Linjian Meng; Youzhi Zhang; Zhenxing Ge; Shangdong Yang; Tianyu Ding; Wenbin Li; Tianpei Yang; Bo An; Yang Gao

arXiv:2308.11256·cs.GT·March 19, 2025


TL;DR

This paper proves that CFR$^+$ achieves last-iterate convergence when solving perturbed regularized extensive-form games (EFGs), and introduces RTCFR$^+$, a new algorithm for learning Nash equilibria of EFGs that outperforms existing methods empirically.

Contribution

The paper demonstrates that CFR$^+$ achieves last-iterate convergence in perturbed regularized EFGs and develops RTCFR$^+$, a new algorithm with improved empirical performance and stability.

Findings

01 RTCFR$^+$ significantly outperforms existing algorithms.

02 CFR$^+$ achieves last-iterate convergence in perturbed regularized EFGs.

03 Enhanced stability of CFR$^+$ is crucial for empirical convergence.

Abstract

To establish last-iterate convergence for Counterfactual Regret Minimization (CFR) algorithms in learning a Nash equilibrium (NE) of extensive-form games (EFGs), recent studies reformulate learning an NE of the original EFG as learning the NEs of a sequence of (perturbed) regularized EFGs. Consequently, proving last-iterate convergence in solving the original EFG reduces to proving last-iterate convergence in solving (perturbed) regularized EFGs. However, the empirical convergence rates of the algorithms in these studies are suboptimal, since they do not use Regret Matching (RM)-based CFR algorithms to solve perturbed EFGs, even though RM-based algorithms are known for their exceptionally fast empirical convergence rates. Additionally, since multiple perturbed regularized EFGs must be solved, fine-tuning across all such games is infeasible, making parameter-free algorithms highly desirable. In this paper, we…
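For readers unfamiliar with the Regret Matching family mentioned in the abstract, the following is a minimal sketch of Regret Matching+ (the local update rule underlying CFR$^+$) in self-play on a two-player zero-sum matrix game. It is an illustration under simplifying assumptions, not the paper's method: it uses a normal-form game rather than an EFG, simultaneous updates with uniform averaging, and no perturbation or regularization. The function names `rm_plus_strategy`, `solve_matrix_game`, and `exploitability`, and the example payoff matrix, are hypothetical.

```python
def rm_plus_strategy(q):
    """Play proportionally to the nonnegative accumulated regrets q."""
    total = sum(q)
    n = len(q)
    if total <= 0.0:
        return [1.0 / n] * n  # no positive regret yet: play uniformly
    return [v / total for v in q]


def solve_matrix_game(A, iterations=5000):
    """Self-play with Regret Matching+ on a zero-sum matrix game.

    A[i][j] is the payoff to the row player; the column player receives
    -A[i][j]. Returns the uniformly averaged strategies of both players.
    Assumption: simultaneous updates and uniform averaging, chosen for
    simplicity.
    """
    n, m = len(A), len(A[0])
    qx, qy = [0.0] * n, [0.0] * m        # clipped cumulative regrets
    sum_x, sum_y = [0.0] * n, [0.0] * m  # running sums of strategies
    for _ in range(iterations):
        x, y = rm_plus_strategy(qx), rm_plus_strategy(qy)
        # Expected utility of each pure action against the opponent.
        ux = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]
        uy = [-sum(A[i][j] * x[i] for i in range(n)) for j in range(m)]
        vx = sum(x[i] * ux[i] for i in range(n))
        vy = sum(y[j] * uy[j] for j in range(m))
        # RM+ update: add instantaneous regret, then clip at zero.
        qx = [max(0.0, qx[i] + ux[i] - vx) for i in range(n)]
        qy = [max(0.0, qy[j] + uy[j] - vy) for j in range(m)]
        sum_x = [sum_x[i] + x[i] for i in range(n)]
        sum_y = [sum_y[j] + y[j] for j in range(m)]
    avg_x = [s / iterations for s in sum_x]
    avg_y = [s / iterations for s in sum_y]
    return avg_x, avg_y


def exploitability(A, x, y):
    """Sum of both players' best-response gains; zero at an NE."""
    n, m = len(A), len(A[0])
    best_row = max(sum(A[i][j] * y[j] for j in range(m)) for i in range(n))
    best_col = max(-sum(A[i][j] * x[i] for i in range(n)) for j in range(m))
    return best_row + best_col


if __name__ == "__main__":
    # Hypothetical asymmetric rock-paper-scissors variant.
    A = [[0.0, -1.0, 2.0], [1.0, 0.0, -1.0], [-1.0, 1.0, 0.0]]
    avg_x, avg_y = solve_matrix_game(A)
    print("average strategies:", avg_x, avg_y)
    print("exploitability:", exploitability(A, avg_x, avg_y))
```

The sketch evaluates the averaged strategies, which is the classical guarantee for regret minimizers in zero-sum games. The abstract's focus, by contrast, is on whether the final iterate itself converges, which the reformulation into perturbed regularized games is meant to establish.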


Taxonomy

Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Game Theory and Applications
