Minimax Optimal Algorithms for Unconstrained Linear Optimization
H. Brendan McMahan

TL;DR
This paper develops minimax-optimal algorithms for online linear optimization without constraints, analyzing their theoretical regret bounds and introducing new algorithms that adapt to soft comparator penalties.
Contribution
It introduces the first minimax-optimal algorithms for unconstrained online linear optimization with soft comparator penalties, providing closed-form game values and practical algorithms.
Findings
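With L_infinity bounds on the comparison set and the adversary's gradients, the value of the game has a closed form that approaches sqrt(2T/pi) as T -> infinity.
With a quadratic comparator penalty, the value of the game is exactly T/2, achieved by unprojected gradient descent with a constant learning rate.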
New algorithms achieve near-optimal regret bounds under soft penalties.
Abstract
We design and analyze minimax-optimal algorithms for online linear optimization games where the player's choice is unconstrained. The player strives to minimize regret, the difference between his loss and the loss of a post-hoc benchmark strategy. The standard benchmark is the loss of the best strategy chosen from a bounded comparator set. When the comparison set and the adversary's gradients satisfy L_infinity bounds, we give the value of the game in closed form and prove it approaches sqrt(2T/pi) as T -> infinity. Interesting algorithms result when we consider soft constraints on the comparator, rather than restricting it to a bounded set. As a warmup, we analyze the game with a quadratic penalty. The value of this game is exactly T/2, and this value is achieved by perhaps the simplest online algorithm of all: unprojected gradient descent with a constant learning rate. We then…
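The quadratic-penalty warmup can be checked numerically. The sketch below is illustrative only and rests on assumptions not spelled out in the abstract: the penalty is taken to be (1/2)||x||^2, the adversary's gradients are assumed to satisfy ||g_t||_2 <= 1, the player starts at x = 0, and the constant learning rate is 1. The names penalized_regret and random_unit_vector are hypothetical helpers, not the paper's notation. Under these assumptions the regret telescopes to (1/2) * sum_t ||g_t||^2, which equals T/2 whenever every gradient has unit norm.

```python
import math
import random


def penalized_regret(gradients, learning_rate=1.0):
    """Regret of unprojected gradient descent against a quadratic-penalty benchmark.

    Assumed setup (not taken verbatim from the paper): the player starts at
    x = 0, suffers linear loss g_t . x_t each round, and is compared with
    min_u (g_{1:T} . u + 0.5 * ||u||^2).
    """
    dim = len(gradients[0])
    x = [0.0] * dim
    grad_sum = [0.0] * dim
    player_loss = 0.0
    for g in gradients:
        # The loss is charged at the current point, before the gradient is used.
        player_loss += sum(gi * xi for gi, xi in zip(g, x))
        # Unprojected gradient step: no clipping or projection onto a set.
        x = [xi - learning_rate * gi for xi, gi in zip(x, g)]
        grad_sum = [si + gi for si, gi in zip(grad_sum, g)]
    # The benchmark minimum is attained at u = -g_{1:T}, giving -0.5 * ||g_{1:T}||^2.
    benchmark = -0.5 * sum(s * s for s in grad_sum)
    return player_loss - benchmark


def random_unit_vector(dim, rng):
    """Draw a direction uniformly at random and scale it to unit Euclidean norm."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


rng = random.Random(0)
T, dim = 200, 3
unit_gradients = [random_unit_vector(dim, rng) for _ in range(T)]
regret = penalized_regret(unit_gradients)
print(regret, T / 2)
```

With unit-norm gradients the printed regret matches T/2 up to floating-point error, whatever directions the adversary picks. Shorter gradients only lower it, which is consistent with T/2 being the value of the game in this assumed setup.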
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems
