Prediction by Random-Walk Perturbation
Luc Devroye, Gábor Lugosi, Gergely Neu

TL;DR
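This paper introduces a follow-the-perturbed-leader online prediction algorithm that perturbs the cumulative losses with random walks. It achieves expected regret of optimal order while rarely switching predictions in the expert setting, and keeps near-optimal regret with few switches in online combinatorial optimization.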
Contribution
It presents a follow-the-perturbed-leader method in which the cumulative losses are perturbed by independent symmetric random walks, and shows that the resulting forecaster both attains regret of optimal order and changes its prediction only rarely.
Findings
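Achieves expected regret of the optimal order O(sqrt(n log N)), where n is the time horizon and N is the number of experts
Forecaster changes its prediction at most O(sqrt(n log N)) times, in expectation
Extends the analysis to online combinatorial optimization, where the forecaster still rarely switches and has regret of near-optimal order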
Abstract
We propose a version of the follow-the-perturbed-leader online prediction algorithm in which the cumulative losses are perturbed by independent symmetric random walks. The forecaster is shown to achieve an expected regret of the optimal order O(sqrt(n log N)) where n is the time horizon and N is the number of experts. More importantly, it is shown that the forecaster changes its prediction at most O(sqrt(n log N)) times, in expectation. We also extend the analysis to online combinatorial optimization and show that even in this more general setting, the forecaster rarely switches between experts while having a regret of near-optimal order.
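The forecaster described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' reference implementation, and several details are assumptions made for illustration: the function name random_walk_fpl, losses bounded in [0, 1], random-walk steps of +1/2 or -1/2 with equal probability, and ties broken in favour of the lowest expert index. The sketch tracks the two quantities the abstract bounds: the regret against the best expert and the number of prediction switches.

```python
import random


def random_walk_fpl(losses, seed=0):
    """Follow-the-perturbed-leader with random-walk perturbations (sketch).

    losses: list of rounds, each a list of N per-expert losses in [0, 1].
    Returns (choices, regret, switches).
    """
    rng = random.Random(seed)
    n_experts = len(losses[0])
    cum_loss = [0.0] * n_experts   # cumulative losses of past rounds
    walk = [0.0] * n_experts       # one symmetric random walk per expert
    total_loss = 0.0
    switches = 0
    prev = None
    choices = []

    for round_losses in losses:
        # Assumed step distribution: each walk moves by +1/2 or -1/2.
        for i in range(n_experts):
            walk[i] += rng.choice((-0.5, 0.5))

        # Follow the leader on the perturbed cumulative losses.
        # min() breaks ties by the lowest index (an illustrative choice).
        leader = min(range(n_experts), key=lambda i: cum_loss[i] + walk[i])

        if prev is not None and leader != prev:
            switches += 1
        prev = leader
        choices.append(leader)

        # The losses are revealed only after the prediction is made.
        total_loss += round_losses[leader]
        for i in range(n_experts):
            cum_loss[i] += round_losses[i]

    regret = total_loss - min(cum_loss)
    return choices, regret, switches


if __name__ == "__main__":
    demo_rng = random.Random(42)
    demo_losses = [[demo_rng.random() for _ in range(5)] for _ in range(1000)]
    _, regret, switches = random_walk_fpl(demo_losses, seed=1)
    print("regret:", regret, "switches:", switches)
```

Because each perturbation changes by only one small step per round, the perturbed leader tends to stay the same from one round to the next, which is the intuition behind the small number of switches. The sketch covers only the expert setting, not the combinatorial extension.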
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Machine Learning and Algorithms
