A parameter-free hedging algorithm
Kamalika Chaudhuri, Yoav Freund, Daniel Hsu

TL;DR
This paper introduces a parameter-free algorithm for decision-theoretic online learning (DTOL) that handles a very large number of actions without a tuned learning rate, removing a practical barrier to applying online learning.
Contribution
The paper proposes a novel, completely parameter-free algorithm for DTOL and introduces a new notion of regret that is more natural for applications with a large number of actions.
Findings
Achieves competitive regret bounds without parameter tuning
Performs well with a large number of actions
Comes close to the best bounds of previous algorithms with optimally tuned parameters
Abstract
We study the problem of decision-theoretic online learning (DTOL). Motivated by practical applications, we focus on DTOL when the number of actions is very large. Previous algorithms for learning in this framework have a tunable learning rate parameter, and a barrier to using online learning in practical applications is that it is not understood how to set this parameter optimally, particularly when the number of actions is large. In this paper, we offer a clean solution by proposing a novel and completely parameter-free algorithm for DTOL. We introduce a new notion of regret, which is more natural for applications with a large number of actions. We show that our algorithm achieves good performance with respect to this new notion of regret; in addition, it also achieves performance close to that of the best bounds achieved by previous algorithms with optimally tuned parameters,…
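To make the learning-rate issue concrete, here is a minimal Python sketch of Hedge, the classical exponential-weights algorithm for DTOL. It is assumed here to be representative of the "previous algorithms" the abstract mentions; it is not the parameter-free algorithm proposed in the paper. The function name `hedge`, the parameter name `eta`, and the toy loss sequence are illustrative choices, not taken from the paper.

```python
import math


def hedge(loss_rounds, eta):
    """Run Hedge (exponential weights) on a sequence of loss vectors.

    loss_rounds: list of rounds, each a list with one loss in [0, 1] per action.
    eta: the learning rate that must be chosen by the user (the tuning
         problem described in the abstract).
    Returns (learner_loss, regret_to_best_action).
    """
    n = len(loss_rounds[0])
    cum_loss = [0.0] * n      # cumulative loss of each action so far
    learner_loss = 0.0        # cumulative expected loss of the learner

    for losses in loss_rounds:
        # Weights are proportional to exp(-eta * cumulative loss).
        # Subtracting the minimum keeps the exponentials numerically stable.
        base = min(cum_loss)
        weights = [math.exp(-eta * (c - base)) for c in cum_loss]
        total = sum(weights)
        probs = [w / total for w in weights]

        # The learner suffers the expected loss under its distribution.
        learner_loss += sum(p * l for p, l in zip(probs, losses))

        # Only after committing to probs are the losses added to the totals.
        cum_loss = [c + l for c, l in zip(cum_loss, losses)]

    return learner_loss, learner_loss - min(cum_loss)


if __name__ == "__main__":
    # Toy data (an assumption for illustration): action 0 never incurs loss,
    # the other three actions always incur loss 1.
    rounds = [[0.0, 1.0, 1.0, 1.0] for _ in range(10)]
    for eta in (0.0, 0.1, 1.0):
        loss, regret = hedge(rounds, eta)
        print(f"eta={eta}: learner loss={loss:.3f}, regret={regret:.3f}")
```

Running the sketch with different values of `eta` shows how strongly the regret depends on this choice. Standard tunings set `eta` on the order of sqrt(ln N / T) for N actions and T rounds, which requires knowing both quantities in advance; this is the dependence a parameter-free method is meant to remove.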
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic processes and financial applications · Advanced Queuing Theory Analysis
