A Regret-Variance Trade-Off in Online Learning

Dirk van der Hoeven; Nikita Zhivotovskiy; Nicolò Cesa-Bianchi

arXiv:2206.02656·cs.LG·June 7, 2022

TL;DR

This paper studies the trade-off between regret and variance in prediction with expert advice, shows how the variance of predictions can be exploited in learning, and gives guarantees for online-to-batch conversion and online linear regression.

Contribution

It analyzes the regret-variance trade-off for a variant of EWA with $K$ experts: the algorithm either outperforms the best expert or guarantees an $O(\log K)$ bound on both regret and variance.

Findings

01

A variant of EWA either achieves negative regret (it outperforms the best expert) or guarantees an $O(\log K)$ bound on both regret and variance.

02

Variance can be exploited for early stopping in online to batch conversion.

03

The paper extends results to online linear regression with high-probability guarantees.

Abstract

We consider prediction with expert advice for strongly convex and bounded losses, and investigate trade-offs between regret and "variance" (i.e., the squared difference between the learner's predictions and the best expert's predictions). With $K$ experts, the Exponentially Weighted Average (EWA) algorithm is known to achieve $O(\log K)$ regret. We prove that a variant of EWA either achieves a negative regret (i.e., the algorithm outperforms the best expert), or guarantees an $O(\log K)$ bound on both variance and regret. Building on this result, we show several examples of how the variance of predictions can be exploited in learning. In the online-to-batch analysis, we show that a large empirical variance allows us to stop the online-to-batch conversion early and outperform the risk of the best predictor in the class. We also recover the optimal rate of model selection aggregation when we do not consider early…
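To make the two quantities in the abstract concrete, here is a minimal sketch of the standard EWA forecaster (not the paper's variant, whose details are not given on this page). It assumes squared loss with predictions and outcomes in [0, 1], for which a learning rate of `eta = 0.5` is a common choice that yields regret at most `log(K) / eta`. The function names `ewa_squared_loss` and `demo`, the synthetic data, and the parameter defaults are illustrative assumptions.

```python
import math
import random


def ewa_squared_loss(expert_preds, outcomes, eta=0.5):
    """Run standard EWA (an assumption: NOT the paper's variant) on squared loss.

    expert_preds: list of T rounds, each a list of K predictions in [0, 1].
    outcomes:     list of T outcomes in [0, 1].
    Returns (learner_preds, regret, variance), where variance is the sum of
    squared differences between learner and best-expert predictions.
    """
    num_experts = len(expert_preds[0])
    cum_loss = [0.0] * num_experts
    learner_preds = []
    learner_loss = 0.0

    for preds, y in zip(expert_preds, outcomes):
        # Exponential weights; subtracting the minimum keeps exp() stable.
        base = min(cum_loss)
        weights = [math.exp(-eta * (loss - base)) for loss in cum_loss]
        total = sum(weights)
        # The learner predicts the weighted average of expert predictions.
        p = sum(w * x for w, x in zip(weights, preds)) / total
        learner_preds.append(p)
        learner_loss += (p - y) ** 2
        for i, x in enumerate(preds):
            cum_loss[i] += (x - y) ** 2

    best = min(range(num_experts), key=lambda i: cum_loss[i])
    regret = learner_loss - cum_loss[best]
    variance = sum(
        (p - preds[best]) ** 2 for p, preds in zip(learner_preds, expert_preds)
    )
    return learner_preds, regret, variance


def demo(num_rounds=200, num_experts=5, seed=0):
    """Synthetic data (hypothetical): uniform outcomes and expert predictions."""
    rng = random.Random(seed)
    outcomes = [rng.random() for _ in range(num_rounds)]
    expert_preds = [
        [rng.random() for _ in range(num_experts)] for _ in range(num_rounds)
    ]
    return ewa_squared_loss(expert_preds, outcomes)


if __name__ == "__main__":
    _, regret, variance = demo()
    print(f"regret={regret:.4f}  variance={variance:.4f}")
```

A negative `regret` means the learner's cumulative loss is below that of the best expert in hindsight. The `variance` value measures how far the learner's predictions were from that expert's, which is the quantity the paper trades off against regret.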

Videos

A Regret-Variance Trade-Off in Online Learning · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Mobile Crowdsensing and Crowdsourcing