Logarithmic Regret from Sublinear Hints
Aditya Bhaskara, Ashok Cutkosky, Ravi Kumar, Manish Purohit

TL;DR
This paper shows that in online linear optimization, O(log T) regret can be achieved with only O(√T) hints, rather than a hint at every step.
Contribution
The paper shows that O(log T) regret is achievable with only O(√T) hints, so a hint at every step is not necessary, and it proves a matching lower bound on the number of hints required.
Findings
O(log T) regret with O(√T) hints
Lower bound: o(√T) hints cannot guarantee better than Ω(√T) regret
Applications to optimistic regret bounds and abstention
Abstract
We consider the online linear optimization problem, where at every step the algorithm plays a point x_t in the unit ball, and suffers loss ⟨c_t, x_t⟩ for some cost vector c_t that is then revealed to the algorithm. Recent work showed that if an algorithm receives a hint h_t that has non-trivial correlation with c_t before it plays x_t, then it can achieve a regret guarantee of O(log T), improving on the bound of Θ(√T) in the standard setting. In this work, we study the question of whether an algorithm really requires a hint at every time step. Somewhat surprisingly, we show that an algorithm can obtain O(log T) regret with just O(√T) hints under a natural query model; in contrast, we also show that o(√T) hints cannot guarantee better than Ω(√T) regret. We give two applications of our result, to the well-studied…
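To make the setting in the abstract concrete, here is a minimal Python sketch of the standard (hint-free) baseline: projected online gradient descent on the unit ball, with regret measured against the best fixed point in hindsight. This is not the paper's hint-based algorithm. The function names, the step size 1/√T, and the random unit-norm cost vectors are all assumptions chosen for illustration.

```python
import math
import random


def project_to_unit_ball(x):
    """Scale x back onto the unit ball if its Euclidean norm exceeds 1."""
    norm = math.sqrt(sum(v * v for v in x))
    return list(x) if norm <= 1.0 else [v / norm for v in x]


def random_unit_vector(d):
    """Illustrative cost generator: a uniformly random direction of norm 1."""
    g = [random.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in g))
    return [v / norm for v in g]


def ogd_regret(costs):
    """Run projected online gradient descent without hints; return its regret.

    Regret = (algorithm's total loss) - (loss of the best fixed point in the
    unit ball), where the loss at step t is the inner product <c_t, x_t>.
    """
    T = len(costs)
    d = len(costs[0])
    eta = 1.0 / math.sqrt(T)   # assumed step size for unit-norm costs
    x = [0.0] * d              # start at the center of the ball
    total_loss = 0.0
    cost_sum = [0.0] * d
    for c in costs:
        # The algorithm commits to x before the cost vector c is revealed.
        total_loss += sum(ci * xi for ci, xi in zip(c, x))
        cost_sum = [s + ci for s, ci in zip(cost_sum, c)]
        # Gradient step on the linear loss, then project back onto the ball.
        x = project_to_unit_ball([xi - eta * ci for xi, ci in zip(x, c)])
    # The best fixed point is -cost_sum / ||cost_sum||, with loss -||cost_sum||.
    best_fixed_loss = -math.sqrt(sum(s * s for s in cost_sum))
    return total_loss - best_fixed_loss
```

With unit-norm costs and this step size, the textbook analysis bounds the regret by √T, which is the baseline rate that the hint-based results improve to O(log T).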
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Machine Learning and Algorithms
