A Regret-Variance Trade-Off in Online Learning
Dirk van der Hoeven, Nikita Zhivotovskiy, Nicolò Cesa-Bianchi

TL;DR
This paper explores the trade-off between regret and variance in online learning with expert advice, proposing algorithms that leverage variance to improve performance and providing theoretical guarantees across various settings.
Contribution
It introduces a novel analysis of the regret-variance trade-off, showing how variance can be exploited to enhance online learning algorithms and achieve optimal bounds.
Findings
A variant of EWA either outperforms the best expert (negative regret) or achieves bounded regret and bounded variance.
A large empirical variance can be exploited to stop the online-to-batch conversion early.
The paper extends results to online linear regression with high-probability guarantees.
Abstract
We consider prediction with expert advice for strongly convex and bounded losses, and investigate trade-offs between regret and "variance" (i.e., the squared difference between the learner's predictions and the best expert's predictions). With K experts, the Exponentially Weighted Average (EWA) algorithm is known to achieve O(log K) regret. We prove that a variant of EWA either achieves negative regret (i.e., the algorithm outperforms the best expert) or guarantees a bound on both variance and regret. Building on this result, we show several examples of how the variance of predictions can be exploited in learning. In the online-to-batch analysis, we show that a large empirical variance allows one to stop the online-to-batch conversion early and outperform the risk of the best predictor in the class. We also recover the optimal rate of model selection aggregation when we do not consider early…
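The two quantities traded off in the abstract can be made concrete with a minimal sketch. The code below runs the standard EWA forecaster (not the paper's variant, whose details are not given here) and then computes the regret and the "variance" as defined in the abstract. The squared loss, the restriction of predictions and outcomes to [0, 1], the learning rate eta = 0.5, and the function names are all assumptions made for illustration.

```python
import math


def ewa_squared_loss(expert_preds, outcomes, eta=0.5):
    """Standard EWA (illustrative; NOT the paper's variant).

    expert_preds: list of T rows, each holding K expert predictions in [0, 1].
    outcomes:     list of T outcomes in [0, 1].
    eta:          learning rate; 0.5 is an assumed choice suited to squared
                  loss on [0, 1].
    Returns the learner's T predictions.
    """
    num_experts = len(expert_preds[0])
    cum_loss = [0.0] * num_experts
    learner_preds = []
    for preds_t, y_t in zip(expert_preds, outcomes):
        # Subtract the smallest cumulative loss for numerical stability.
        offset = min(cum_loss)
        weights = [math.exp(-eta * (loss - offset)) for loss in cum_loss]
        total = sum(weights)
        # Predict with the weighted average of the experts' predictions.
        learner_preds.append(
            sum(w * p for w, p in zip(weights, preds_t)) / total
        )
        # Only after predicting, update each expert's cumulative squared loss.
        for i, p in enumerate(preds_t):
            cum_loss[i] += (p - y_t) ** 2
    return learner_preds


def regret_and_variance(expert_preds, outcomes, learner_preds):
    """Regret against the best expert in hindsight, and the 'variance':
    the summed squared difference between the learner's predictions and
    that best expert's predictions."""
    num_experts = len(expert_preds[0])
    learner_loss = sum((p - y) ** 2 for p, y in zip(learner_preds, outcomes))
    expert_losses = [
        sum((row[i] - y) ** 2 for row, y in zip(expert_preds, outcomes))
        for i in range(num_experts)
    ]
    best = min(range(num_experts), key=expert_losses.__getitem__)
    regret = learner_loss - expert_losses[best]
    variance = sum(
        (p - row[best]) ** 2 for p, row in zip(learner_preds, expert_preds)
    )
    return regret, variance
```

Under these assumptions, calling `ewa_squared_loss` on a T-by-K table of expert predictions and passing the result to `regret_and_variance` gives the pair of numbers whose trade-off the paper studies; a negative regret means the learner beat the best expert on that sequence.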
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Mobile Crowdsensing and Crowdsourcing
