Cautious Optimism: A Meta-Algorithm for Near-Constant Regret in General Games
Ashkan Soleymani, Georgios Piliouras, Gabriele Farina

TL;DR
This paper presents Cautious Optimism, a meta-algorithm that accelerates no-regret learning in general games by adaptively pacing Follow-the-Regularized-Leader, achieving near-optimal regret bounds and improving state-of-the-art guarantees.
Contribution
Introduces Cautious Optimism, a novel meta-algorithm that enhances FTRL-based learning with near-constant regret in diverse game settings without requiring knowledge of other players' utilities.
Findings
Achieves near-optimal $O_T(\log T)$ regret in self-play.
Maintains $O_T(\sqrt{T})$ regret in adversarial scenarios.
Improves regret bounds in convex games with exponential dependence on action space dimension.
Abstract
We introduce Cautious Optimism, a framework for substantially faster regularized learning in general games. Cautious Optimism, as a variant of Optimism, adaptively controls the learning pace in a dynamic, non-monotone manner to accelerate no-regret learning dynamics. Cautious Optimism takes as input any instance of Follow-the-Regularized-Leader (FTRL) and outputs an accelerated no-regret learning algorithm (COFTRL) by pacing the underlying FTRL with minimal computational overhead. Importantly, it retains uncoupledness, that is, learners do not need to know other players' utilities. Cautious Optimistic FTRL (COFTRL) achieves near-optimal regret in diverse self-play (mixing and matching regularizers) while preserving the optimal regret in adversarial scenarios. In contrast to prior works (e.g., Syrgkanis et al. [2015], Daskalakis et al. [2021]), our analysis…
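To make the starting point concrete, the sketch below shows one common instance of the base learner the abstract refers to: Optimistic FTRL with a negative-entropy regularizer over a finite action set, run with a fixed learning rate. It is not COFTRL. The adaptive pacing rule is not specified in the abstract and is not reproduced here. The function names (`softmax`, `optimistic_ftrl`, `external_regret`), the fixed `eta`, and the choice of the last observed utility as the optimistic prediction are assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax; the closed-form FTRL step under negative entropy."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def optimistic_ftrl(utilities, eta=0.1):
    """Optimistic FTRL with an entropy regularizer and a FIXED learning rate.

    Assumption: this is the plain, unpaced baseline. The pacing that turns
    FTRL into COFTRL is not given in the abstract and is not implemented.

    utilities: list of per-round utility vectors, one entry per action.
    Returns the mixed strategy played in each round.
    """
    n = len(utilities[0])
    cumulative = [0.0] * n  # sum of utility vectors observed so far
    prediction = [0.0] * n  # optimistic guess for the next utility vector
    strategies = []
    for u in utilities:
        # The strategy is chosen before u is revealed, using only past data.
        scores = [eta * (c + p) for c, p in zip(cumulative, prediction)]
        strategies.append(softmax(scores))
        cumulative = [c + ui for c, ui in zip(cumulative, u)]
        prediction = list(u)  # next round's guess is the utility just seen
    return strategies


def external_regret(utilities, strategies):
    """Utility of the best fixed action in hindsight minus the utility earned."""
    n = len(utilities[0])
    earned = sum(
        sum(xi * ui for xi, ui in zip(x, u))
        for x, u in zip(strategies, utilities)
    )
    best_fixed = max(sum(u[a] for u in utilities) for a in range(n))
    return best_fixed - earned
```

Calling `optimistic_ftrl` on a recorded utility sequence and passing the result to `external_regret` gives the quantity that the regret bounds above refer to. In self-play, each player would run its own copy and feed it only its own utility vectors, which is what uncoupledness means.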
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning
