Learning and Computation of $\Phi$-Equilibria at the Frontier of Tractability

Brian Hu Zhang; Ioannis Anagnostides; Emanuel Tewolde; Ratip Emin Berker; Gabriele Farina; Vincent Conitzer; Tuomas Sandholm

arXiv:2502.18582·stat.ML·December 16, 2025

Open Access

TL;DR

This paper extends the computational framework for $\Phi$…

Contribution

It introduces algorithms for computing $\Phi$…

Findings

01. Polynomial-time algorithm for $\Phi$…

02. Nearly matching lower bounds established

03. Extension of $\Phi$…

Abstract

$\Phi$-equilibria -- and the associated notion of $\Phi$-regret -- are a powerful and flexible framework at the heart of online learning and game theory, whereby enriching the set of deviations $\Phi$ begets stronger notions of rationality. Recently, Daskalakis, Farina, Fishelson, Pipis, and Schneider (STOC '24) -- abbreviated as DFFPS -- settled the existence of efficient algorithms when $\Phi$ contains only linear maps under a general, $d$-dimensional convex constraint set $X$. In this paper, we significantly extend their work by resolving the case where $\Phi$ is $k$-dimensional; degree-$\ell$ polynomials constitute a canonical such example with $k = d^{O(\ell)}$. In particular, positing only oracle access to $X$, we obtain two main positive results: i) a $\mathrm{poly}(n, d, k, \log(1/\epsilon))$-time algorithm for computing $\epsilon$-approximate…
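To make the bound $k = d^{O(\ell)}$ concrete, here is a minimal sketch that counts the monomials of total degree at most $\ell$ in $d$ variables. It assumes one natural reading of "$k$-dimensional", namely that $k$ is the number of monomial features needed to express a degree-$\ell$ polynomial; the abstract does not spell out this convention. The function names `monomial_exponents` and `num_monomials` are hypothetical and chosen for illustration only.

```python
from itertools import combinations_with_replacement
from math import comb


def monomial_exponents(d, ell):
    """Enumerate exponent vectors of all monomials in d variables
    with total degree at most ell (illustrative helper)."""
    exponents = []
    for degree in range(ell + 1):
        # Each multiset of `degree` variable indices is one monomial.
        for combo in combinations_with_replacement(range(d), degree):
            e = [0] * d
            for i in combo:
                e[i] += 1
            exponents.append(tuple(e))
    return exponents


def num_monomials(d, ell):
    """Closed form for the same count: C(d + ell, ell)."""
    return comb(d + ell, ell)


if __name__ == "__main__":
    for d, ell in [(2, 3), (3, 2), (5, 2)]:
        k = len(monomial_exponents(d, ell))
        # The enumeration agrees with the closed form, and the count
        # is at most (d + 1)^ell, i.e. d^{O(ell)} for d >= 2.
        print(d, ell, k, num_monomials(d, ell), (d + 1) ** ell)
```

Under this assumption the count is $\binom{d+\ell}{\ell} \le (d+1)^{\ell}$, which is polynomial in $d$ for any fixed degree $\ell$ but grows exponentially in $\ell$.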

Taxonomy

Topics: Advanced Bandit Algorithms Research · Game Theory and Applications · Reinforcement Learning in Robotics

Methods: Sparse Evolutionary Training