Online Learning with Predictable Sequences
Alexander Rakhlin, Karthik Sridharan

TL;DR
This paper introduces online linear optimization algorithms that exploit a known predictable process to obtain tighter regret bounds on benign sequences while retaining the usual worst-case guarantees, and extends them to competing with a set of candidate predictable processes (model selection).
Contribution
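The paper proposes online learning methods that take a predictable process as prior knowledge about the sequence, giving tighter regret bounds when that process describes the sequence well, and extends them to learn the predictable process itself from a set of candidate models.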
Findings
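The algorithms achieve tighter regret bounds than the typical worst-case bounds on benign sequences, and the usual worst-case bounds otherwise.
The setting encompasses partial and side information.
Variance and path-length bounds arise as particular examples with simple predictable sequences.
The methods extend to competing with a set of possible predictable processes (models).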
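As a rough guide to what "tighter bounds on benign sequences" means, the bound below is the form given by the standard Euclidean analysis of an optimistic gradient method. It is a sketch under stated assumptions, not a quotation of the paper's theorem: the symbols f_t (learner's play), x_t (loss vector), M_t (the prediction of x_t available before round t), D (radius of the feasible set) and the step size \eta are notation chosen for this illustration, and the constants may differ from the paper's statement.

```latex
% Regret of the learner against the best fixed point in the set F
\mathrm{Reg}_T
  \;=\; \sum_{t=1}^{T} \langle f_t, x_t \rangle
        \;-\; \min_{f \in \mathcal{F}} \sum_{t=1}^{T} \langle f, x_t \rangle
  \;\le\; \frac{D^2}{2\eta} \;+\; \frac{\eta}{2} \sum_{t=1}^{T} \lVert x_t - M_t \rVert_2^2 .

% Tuning \eta = D / \sqrt{\sum_t \lVert x_t - M_t \rVert_2^2} (assuming the sum is nonzero) gives
\mathrm{Reg}_T \;\le\; D \sqrt{\sum_{t=1}^{T} \lVert x_t - M_t \rVert_2^2 } .
```

With M_t = 0 this reduces to the familiar worst-case rate of order \sqrt{T} for bounded losses. Taking M_t = x_{t-1} gives a path-length style bound, and taking M_t to be the running average of past losses gives a variance style bound.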
Abstract
We present methods for online linear optimization that take advantage of benign (as opposed to worst-case) sequences. Specifically, if the sequence encountered by the learner is described well by a known "predictable process", the algorithms presented enjoy tighter bounds as compared to the typical worst-case bounds. Additionally, the methods achieve the usual worst-case regret bounds if the sequence is not benign. Our approach can be seen as a way of adding prior knowledge about the sequence within the paradigm of online learning. The setting is shown to encompass partial and side information. Variance and path-length bounds can be seen as particular examples of online learning with simple predictable sequences. We further extend our methods and results to include competing with a set of possible predictable processes (models), that is, "learning" the predictable process itself…
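The idea in the abstract can be illustrated with a minimal sketch of an optimistic projected gradient method on a Euclidean ball. This is an assumption-laden illustration, not the authors' implementation: the function names (`project_ball`, `optimistic_gradient_descent`, `regret_and_bound`, `demo`) are hypothetical, the loss sequence is synthetic, and the step size is tuned in hindsight purely to keep the example short. Each round the learner first steps along the hint M_t to choose its play, then updates its internal state with the true loss x_t.

```python
import numpy as np


def project_ball(v, radius=1.0):
    """Euclidean projection onto the ball of the given radius."""
    norm = np.linalg.norm(v)
    return v if norm <= radius else v * (radius / norm)


def optimistic_gradient_descent(losses, hints, eta, radius=1.0):
    """Return the regret of the optimistic update on linear losses.

    losses[t] is the loss vector x_t; hints[t] is the prediction M_t,
    which must be computable before x_t is revealed.
    """
    g = np.zeros(losses.shape[1])  # secondary iterate, starts at the origin
    learner_loss = 0.0
    for x_t, m_t in zip(losses, hints):
        f_t = project_ball(g - eta * m_t, radius)  # play using the hint
        learner_loss += float(f_t @ x_t)
        g = project_ball(g - eta * x_t, radius)    # update with the true loss
    # Best fixed point in the ball for a linear loss lies opposite the loss sum.
    best_fixed_loss = -radius * np.linalg.norm(losses.sum(axis=0))
    return learner_loss - best_fixed_loss


def regret_and_bound(losses, hints, radius=1.0):
    """Run with a hindsight-tuned step size (an assumption made for brevity)."""
    mismatch = float(np.sum((losses - hints) ** 2))  # sum_t ||x_t - M_t||^2
    eta = radius / np.sqrt(max(mismatch, 1e-12))
    regret = optimistic_gradient_descent(losses, hints, eta, radius)
    bound = radius**2 / (2 * eta) + 0.5 * eta * mismatch
    return regret, bound


def demo(T=500, d=3, seed=0):
    """Compare no hint (M_t = 0) with the previous-loss hint (M_t = x_{t-1})."""
    rng = np.random.default_rng(seed)
    base = rng.normal(size=d)
    base /= np.linalg.norm(base)
    # A benign sequence: a fixed direction plus small noise.
    losses = base + 0.05 * rng.normal(size=(T, d))
    previous = np.vstack([np.zeros(d), losses[:-1]])
    return {
        "no_hint": regret_and_bound(losses, np.zeros_like(losses)),
        "previous_loss": regret_and_bound(losses, previous),
    }


if __name__ == "__main__":
    for name, (regret, bound) in demo().items():
        print(f"{name}: regret={regret:.3f}, bound={bound:.3f}")
```

Setting the hints to zero recovers ordinary projected gradient descent, so the same routine covers the worst-case baseline. In practice the mismatch sum is not known in advance, so a real implementation would need an adaptive step-size scheme rather than the hindsight tuning used here.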
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Data Stream Mining Techniques
