Universal Online Convex Optimization with $1$ Projection per Round
Wenhao Yang, Yibo Wang, Peng Zhao, Lijun Zhang

TL;DR
This paper introduces a universal online convex optimization algorithm that requires only one projection per round, achieving optimal regret bounds across various convex function classes by leveraging surrogate losses and expert advice.
Contribution
It develops a novel universal OCO algorithm with a single projection per round, unifying regret guarantees for multiple convex function types using surrogate losses and expert aggregation.
Findings
Achieves the minimax-optimal regret bounds for convex, exponentially concave (exp-concave), and strongly convex functions: $O(\sqrt{T})$, $O(d \log T)$, and $O(\log T)$, respectively.
Develops a surrogate loss framework that simplifies projections and analysis.
Extends to exploit smoothness for small-loss regret in multiple convex function classes.
Abstract
To address the uncertainty in function types, recent progress in online convex optimization (OCO) has spurred the development of universal algorithms that simultaneously attain minimax rates for multiple types of convex functions. However, for a $T$-round online problem, state-of-the-art methods typically conduct $O(\log T)$ projections onto the domain in each round, a process potentially time-consuming with complicated feasible sets. In this paper, inspired by the black-box reduction of Cutkosky and Orabona (2018), we employ a surrogate loss defined over simpler domains to develop universal OCO algorithms that only require $1$ projection. Embracing the framework of prediction with expert advice, we maintain a set of experts for each type of functions and aggregate their predictions via a meta-algorithm. The crux of our approach lies in a uniquely designed expert-loss for strongly…
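The one-projection idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it omits the experts, the expert-losses, and the meta-algorithm, and only shows the general shape of a Cutkosky and Orabona (2018) style reduction, in which a learner runs on a simple ball, its iterate is projected onto the true domain once per round, and the learner is updated with a subgradient of a surrogate loss. The box domain `[0, 1]^d`, the online gradient descent learner, the fixed step size `eta`, and all function names are assumptions made for illustration.

```python
import math

def project_box(y):
    """Projection onto the hypothetical domain X = [0, 1]^d (stand-in for a costly projection)."""
    return [min(1.0, max(0.0, v)) for v in y]

def project_ball(y, radius):
    """Cheap projection onto the Euclidean ball of the given radius (a rescaling)."""
    norm = math.sqrt(sum(v * v for v in y))
    return list(y) if norm <= radius else [v * radius / norm for v in y]

def run_one_projection_ogd(grad, dim, rounds, eta):
    """Gradient descent on a ball, with exactly one projection onto X per round."""
    radius = math.sqrt(dim)          # ball centred at 0 that contains [0, 1]^d
    y = [0.0] * dim                  # learner's iterate, lives in the ball
    played, projections = [], 0
    for _ in range(rounds):
        x = project_box(y)           # the only projection onto X in this round
        projections += 1
        played.append(x)
        g = grad(x)                  # gradient of the true loss at the played point
        g_norm = math.sqrt(sum(v * v for v in g))
        diff = [a - b for a, b in zip(y, x)]
        dist = math.sqrt(sum(v * v for v in diff))
        # Subgradient at y of the surrogate 0.5 * (<g, y> + ||g|| * dist(y, X)).
        if dist > 0.0:
            s = [0.5 * (gi + g_norm * di / dist) for gi, di in zip(g, diff)]
        else:
            s = [0.5 * gi for gi in g]
        # The learner's own update only needs the cheap ball projection.
        y = project_ball([yi - eta * si for yi, si in zip(y, s)], radius)
    return played, projections

# Illustrative run: fixed quadratic loss whose minimiser lies outside the box.
target = [2.0, -1.0]
grad = lambda x: [2.0 * (xi - ti) for xi, ti in zip(x, target)]
played, projections = run_one_projection_ogd(grad, dim=2, rounds=200, eta=0.1)
```

In this sketch the distance term in the surrogate penalises the learner for leaving the domain, which is why every played point stays feasible with a single projection per round. A universal method would run several such learners as experts and combine them, which this sketch leaves out.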
Peer Reviews
Decision · NeurIPS 2024 poster
The technical contributions are solid: this paper makes a strict improvement over previous results. The paper is very well-written, clearly introducing the challenges and the main ideas. Details of the analysis and algorithm are nicely explained.
The contribution seems somewhat incremental to me. The only improvement is a $d$ factor for strongly convex losses. Such a result is nice to know, but I'm not sure how significant it is. In addition, the technical novelty isn't significant either.
Taxonomy
Topics: Optimization and Search Problems · Advanced Bandit Algorithms Research · Mobile Ad Hoc Networks
Methods: Sparse Evolutionary Training
