Polynomial-Time Linear-Swap Regret Minimization in Imperfect-Information Sequential Games
Gabriele Farina, Charilaos Pipis

TL;DR
This paper introduces a polynomial-time method for minimizing linear-swap regret in imperfect-information sequential games. Learners that use it converge to a new class of equilibria, called linear-deviation correlated equilibria, which gives a stronger notion of rationality that can still be attained efficiently.
Contribution
It shows that polynomial-time algorithms can achieve sublinear linear-swap regret in sequential games, so that efficiently attainable rationality extends beyond previously known notions.
Findings
Achieves polynomial-time linear-swap regret minimization in sequential games.
Introduces the concept of linear-deviation correlated equilibria.
Proves the existence of robust equilibria under linear deviations.
Abstract
No-regret learners seek to minimize the difference between the loss they cumulated through the actions they played, and the loss they would have cumulated in hindsight had they consistently modified their behavior according to some strategy transformation function. The size of the set of transformations considered by the learner determines a natural notion of rationality. As the set of transformations each learner considers grows, the strategies played by the learners recover more complex game-theoretic equilibria, including correlated equilibria in normal-form games and extensive-form correlated equilibria in extensive-form games. At the extreme, a no-swap-regret agent is one that minimizes regret against the set of all functions from the set of strategies to itself. While it is known that the no-swap-regret condition can be attained efficiently in nonsequential (normal-form) games,…
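To make the regret definition in the abstract concrete, here is a minimal Python sketch for the simplest setting only: a normal-form learner whose strategies are probability vectors over n actions. It is not the paper's algorithm for sequential games. It assumes linear losses and relies on the standard fact that a linear map from the simplex to itself is a stochastic matrix, so the best such transformation in hindsight redirects each action to a single replacement action. The function names and input layout are hypothetical, chosen for illustration.

```python
def external_regret(strategies, losses):
    """Regret against the best constant transformation (one fixed action).

    strategies: list of T probability vectors, each of length n
    losses:     list of T loss vectors, each of length n
    """
    n = len(losses[0])
    # Loss actually accumulated by the learner.
    incurred = sum(x[a] * l[a] for x, l in zip(strategies, losses) for a in range(n))
    # Loss of the best single action in hindsight.
    best_fixed = min(sum(l[b] for l in losses) for b in range(n))
    return incurred - best_fixed


def swap_regret(strategies, losses):
    """Regret against the best linear transformation of the simplex.

    Assumption: the accumulated loss is linear in the transformation, so the
    minimum is attained by a map that sends each action a to one action b.
    """
    n = len(losses[0])
    # weighted[a][b] = sum_t x_t[a] * loss_t[b]: the loss accumulated if the
    # probability mass placed on action a had been moved to action b.
    weighted = [
        [sum(x[a] * l[b] for x, l in zip(strategies, losses)) for b in range(n)]
        for a in range(n)
    ]
    incurred = sum(weighted[a][a] for a in range(n))
    best_swapped = sum(min(weighted[a]) for a in range(n))
    return incurred - best_swapped


# Example: the learner always picks the action that turns out to be costly.
plays = [[1.0, 0.0], [0.0, 1.0]]
costs = [[1.0, 0.0], [0.0, 1.0]]
print(external_regret(plays, costs), swap_regret(plays, costs))
```

Because constant transformations are a special case of linear ones, the swap regret computed this way is never smaller than the external regret. This illustrates the abstract's point that a larger set of transformations gives a more demanding notion of rationality.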
Taxonomy
Topics: Advanced Bandit Algorithms Research · Game Theory and Applications · Machine Learning and Algorithms
