Unconstrained Online Linear Learning in Hilbert Spaces: Minimax Algorithms and Normal Approximations

H. Brendan McMahan; Francesco Orabona

arXiv:1403.0628·cs.LG·May 22, 2014·36 cites

Open Access

TL;DR

This paper introduces new minimax algorithms for unconstrained online linear learning in Hilbert spaces, achieving near-optimal regret bounds and utilizing Normal approximations for analysis.

Contribution

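It develops a novel characterization of a large class of minimax algorithms, recovers and in some cases improves previous results as corollaries, and proposes algorithms whose regret bounds are optimal up to constant factors when T is known and up to √(log log T) terms when T is unknown.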

Findings

01

Achieves a regret bound of O(U√(T log(U√T log²T + 1))) when both T and the comparator norm U are unknown.

02

Provides a regret bound that is optimal up to constant factors when T is known.

03

Uses a Normal approximation to the conditional value of the game as the key analysis tool.

Abstract

We study algorithms for online linear optimization in Hilbert spaces, focusing on the case where the player is unconstrained. We develop a novel characterization of a large class of minimax algorithms, recovering, and even improving, several previous results as immediate corollaries. Moreover, using our tools, we develop an algorithm that provides a regret bound of $O\big(U \sqrt{T \log(U \sqrt{T} \log^{2} T + 1)}\big)$, where $U$ is the $L_{2}$ norm of an arbitrary comparator and both $T$ and $U$ are unknown to the player. This bound is optimal up to $\sqrt{\log \log T}$ terms. When $T$ is known, we derive an algorithm with an optimal regret bound (up to constant factors). For both the known and unknown $T$ case, a Normal approximation to the conditional value of the game proves to be the key analysis tool.
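To get a feel for how small the extra $\log^{2} T$ factor inside the logarithm is, the following sketch evaluates the stated bound numerically with all constants dropped. It is an illustration only, not the paper's algorithm. The function names are hypothetical, and the comparison rate $U\sqrt{T \log(U\sqrt{T} + 1)}$ is an assumed reference obtained by deleting the $\log^{2} T$ factor; it is not quoted from the abstract.

```python
import math


def unknown_horizon_bound(U: float, T: int) -> float:
    """U * sqrt(T * log(U * sqrt(T) * log(T)^2 + 1)), constants dropped."""
    inner = U * math.sqrt(T) * math.log(T) ** 2 + 1.0
    return U * math.sqrt(T * math.log(inner))


def reference_rate(U: float, T: int) -> float:
    """Assumed comparison rate: the same expression without the log^2 T factor."""
    inner = U * math.sqrt(T) + 1.0
    return U * math.sqrt(T * math.log(inner))


def gap_ratio(U: float, T: int) -> float:
    """Ratio of the stated bound to the assumed reference rate."""
    return unknown_horizon_bound(U, T) / reference_rate(U, T)


if __name__ == "__main__":
    # Print the ratio for a few horizons with comparator norm U = 1.
    for T in (10**2, 10**4, 10**6, 10**8):
        print(T, round(gap_ratio(1.0, T), 3))
```

For a comparator of norm 1, the ratio stays below 2 across these horizons, which matches the claim that the gap is only a doubly logarithmic term in $T$.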

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems