A Simple and Adaptive Learning Rate for FTRL in Online Learning with Minimax Regret of $\Theta(T^{2/3})$ and its Application to Best-of-Both-Worlds
Taira Tsuchiya, Shinji Ito

TL;DR
This paper introduces a new adaptive learning rate for FTRL tailored to online learning problems with a minimax regret of $\Theta(T^{2/3})$, improving regret bounds in various feedback settings and simplifying existing algorithms.
Contribution
It develops a novel adaptive learning rate framework for $\Theta(T^{2/3})$ regret problems, enhancing BOBW algorithms with simpler, more effective updates.
Findings
Improves regret bounds for partial monitoring, graph bandits, and multi-armed bandits.
Achieves simultaneous optimality in stochastic and adversarial regimes.
Provides a surprisingly simple learning rate compared to previous methods.
Abstract
Follow-the-Regularized-Leader (FTRL) is a powerful framework for various online learning problems. By designing its regularizer and learning rate to be adaptive to past observations, FTRL is known to work adaptively to various properties of an underlying environment. However, most existing adaptive learning rates are for online learning problems with a minimax regret of $\Theta(\sqrt{T})$ for the number of rounds $T$, and there are only a few studies on adaptive learning rates for problems with a minimax regret of $\Theta(T^{2/3})$, which include several important problems dealing with indirect feedback. To address this limitation, we establish a new adaptive learning rate framework for problems with a minimax regret of $\Theta(T^{2/3})$. Our learning rate is designed by matching the stability, penalty, and bias terms that naturally appear in regret upper bounds for problems with a…
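As a rough illustration of the FTRL framework the abstract refers to, here is a minimal sketch of a single FTRL step over the probability simplex with a negative Shannon entropy regularizer, for which the minimizer has the well-known closed form $p_i \propto \exp(-\eta L_i)$. This is not the paper's adaptive learning rate: the abstract is truncated before the rule is stated, so `eta` and `gamma` are fixed inputs here. The function names and the uniform exploration mixing step are assumptions made for illustration only.

```python
import math


def ftrl_neg_entropy(cumulative_losses, eta):
    """One FTRL step on the simplex with regularizer (1/eta) * sum_i p_i log p_i.

    The minimizer of <p, L> + (1/eta) * sum_i p_i log p_i over the simplex
    is p_i proportional to exp(-eta * L_i).
    """
    # Subtract the minimum loss for numerical stability; this does not
    # change the normalized distribution.
    base = min(cumulative_losses)
    weights = [math.exp(-eta * (loss - base)) for loss in cumulative_losses]
    total = sum(weights)
    return [w / total for w in weights]


def mix_with_exploration(p, gamma):
    """Mix p with the uniform distribution (hypothetical exploration step).

    gamma in [0, 1] is an assumed forced-exploration rate, not a quantity
    taken from the paper.
    """
    k = len(p)
    return [(1.0 - gamma) * p_i + gamma / k for p_i in p]


if __name__ == "__main__":
    # Hypothetical cumulative loss estimates for three actions.
    p = ftrl_neg_entropy([1.0, 2.0, 3.0], eta=0.5)
    q = mix_with_exploration(p, gamma=0.3)
    print(p)
    print(q)
```

In this sketch, a larger `eta` concentrates the distribution on the action with the smallest cumulative loss, while `gamma` keeps every action's probability at least `gamma / k`. An adaptive scheme would choose both from past observations instead of fixing them in advance.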
Taxonomy
Topics: Machine Learning and Algorithms
