A parameter-free hedging algorithm

Kamalika Chaudhuri; Yoav Freund; Daniel Hsu

arXiv:0903.2851 · cs.LG · January 19, 2010 · 64 cites

Open Access

TL;DR

This paper introduces a parameter-free algorithm for decision-theoretic online learning (DTOL) that performs well when the number of actions is large and requires no learning-rate tuning.

Contribution

It proposes a novel, completely parameter-free algorithm for DTOL aimed at very large action sets, and introduces a new notion of regret that is more natural when the number of actions is large.

Findings

01 Achieves good regret bounds under the new notion of regret without parameter tuning

02 Performs well with a large number of actions

03 Comes close to the best bounds of previous algorithms with optimally tuned parameters under previous regret measures

Abstract

We study the problem of decision-theoretic online learning (DTOL). Motivated by practical applications, we focus on DTOL when the number of actions is very large. Previous algorithms for learning in this framework have a tunable learning rate parameter, and a barrier to using online learning in practical applications is that it is not understood how to set this parameter optimally, particularly when the number of actions is large. In this paper, we offer a clean solution by proposing a novel and completely parameter-free algorithm for DTOL. We introduce a new notion of regret, which is more natural for applications with a large number of actions. We show that our algorithm achieves good performance with respect to this new notion of regret; in addition, it achieves performance close to that of the best bounds achieved by previous algorithms with optimally tuned parameters,…
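For context on the tunable learning rate the abstract refers to, the following is a minimal sketch of the classical Hedge (exponential weights) update, a standard DTOL algorithm of the kind the abstract describes as previous work. It is not the paper's parameter-free algorithm. The names `hedge`, `make_losses`, and `eta`, and the synthetic loss table, are assumptions made for illustration only.

```python
import math
import random


def hedge(losses, eta):
    """Run Hedge (exponential weights) on a T x N table of losses in [0, 1].

    eta is the learning rate that must be chosen by the user.
    Returns (learner's expected cumulative loss, best action's cumulative loss).
    """
    n_actions = len(losses[0])
    cum_loss = [0.0] * n_actions
    learner_loss = 0.0
    for round_losses in losses:
        # Weights are proportional to exp(-eta * cumulative loss).
        # Subtracting the minimum keeps the exponentials from underflowing to all zeros.
        base = min(cum_loss)
        weights = [math.exp(-eta * (c - base)) for c in cum_loss]
        total = sum(weights)
        probs = [w / total for w in weights]
        # The learner suffers the expected loss under its distribution.
        learner_loss += sum(p * l for p, l in zip(probs, round_losses))
        cum_loss = [c + l for c, l in zip(cum_loss, round_losses)]
    return learner_loss, min(cum_loss)


def make_losses(n_rounds, n_actions, seed=0):
    """Hypothetical synthetic data: action 0 has slightly smaller losses on average."""
    rng = random.Random(seed)
    return [
        [rng.random() * (0.8 if i == 0 else 1.0) for i in range(n_actions)]
        for _ in range(n_rounds)
    ]


if __name__ == "__main__":
    table = make_losses(n_rounds=500, n_actions=50)
    for eta in (0.001, 0.1, 10.0):
        learner, best = hedge(table, eta)
        print(f"eta={eta}: regret to best action = {learner - best:.2f}")
```

For losses in [0, 1], the standard analysis bounds the regret of Hedge with any fixed `eta > 0` by `ln(N)/eta + eta*T/8`, so a good choice of `eta` depends on the number of actions `N` and the horizon `T`. This dependence is the tuning problem the abstract describes.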

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic processes and financial applications · Advanced Queuing Theory Analysis