Optimal Online Learning using Potential Functions
Yoav Freund

TL;DR
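This paper studies a family of potential functions for online learning. It shows that when the potential function has strictly positive derivatives of order 1-4, the min-max optimal strategy for the adversary is Brownian motion, and that the Normal-Hedge potential gives the tightest upper bounds on the cumulative regret of the top ε-percentile.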
Contribution
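It analyzes a family of potential functions whose derivatives of order 1-4 are strictly positive, and shows that, among the potential functions compared, the Normal-Hedge potential yields the tightest upper bounds on the cumulative regret of the top ε-percentile.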
Findings
The analysis applies to potential functions whose derivatives of order 1-4 are all strictly positive.
For such potential functions, the min-max optimal strategy for the adversary is Brownian motion.
The Normal-Hedge potential provides the tightest upper bounds on the cumulative regret of the top ε-percentile.
Abstract
We study a family of potential functions for online learning. We show that if the potential function has strictly positive derivatives of order 1-4 then the min-max optimal strategy for the adversary is Brownian motion. Using that fact, we analyze different potential functions and show that the Normal-Hedge potential provides the tightest upper bounds on the cumulative regret of the top ε-percentile.
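To make the setting concrete, here is a minimal sketch of a potential-based learner, written under stated assumptions rather than taken from the paper. It uses the exponential potential exp(eta * R), whose derivatives of every order are strictly positive, as a stand-in; it does not implement the Normal-Hedge potential. The adversary is a random ±1 walk, used only as a crude discrete stand-in for Brownian motion. All function names and the parameter `eta` are hypothetical.

```python
import math
import random


def exp_potential_derivative(regret, eta=0.5):
    """First derivative of the illustrative potential exp(eta * regret).

    Assumption: this exponential potential is only a stand-in; every one of
    its derivatives, eta**k * exp(eta * regret), is strictly positive.
    """
    return eta * math.exp(eta * regret)


def potential_weights(regrets, dphi=exp_potential_derivative):
    """Weight each expert in proportion to the potential's derivative at its regret."""
    raw = [dphi(r) for r in regrets]
    total = sum(raw)
    return [w / total for w in raw]


def simulate(n_experts=10, n_rounds=200, seed=0, dphi=exp_potential_derivative):
    """Run the learner against a random +/-1 adversary and return final regrets.

    Assumption: independent +/-1 gains per expert are a rough discrete
    approximation of a Brownian-motion adversary, not the paper's construction.
    """
    rng = random.Random(seed)
    regrets = [0.0] * n_experts
    for _ in range(n_rounds):
        weights = potential_weights(regrets, dphi)
        gains = [rng.choice((-1.0, 1.0)) for _ in range(n_experts)]
        # The learner's gain is the weighted average of the experts' gains.
        learner_gain = sum(w * g for w, g in zip(weights, gains))
        # Regret to expert i grows by how much expert i beat the learner.
        regrets = [r + g - learner_gain for r, g in zip(regrets, gains)]
    return regrets
```

Calling `simulate()` returns one cumulative regret per expert. Passing a different derivative function as `dphi` is one way to compare potentials on the same random sequence.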
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Quantum Computing Algorithms and Architecture
