Efficient Online-Bandit Strategies for Minimax Learning Problems
Christophe Roux, Elias Wirth, Sebastian Pokutta, Thomas Kerdreux

TL;DR
This paper introduces efficient online-bandit algorithms for solving convex-linear minimax learning problems, emphasizing the structure of the distribution set and providing convergence guarantees for specific set families.
Contribution
It proposes a framework combining online learning and bandit algorithms tailored to the structure of the distribution set, with convergence guarantees for certain set families.
Findings
The algorithms converge to the minimax value with high probability for a specific family of distribution sets.
Efficiency depends on the structure of the distribution set; two properties that facilitate efficient algorithms are identified.
Applicable to various learning problems with distributional robustness.
Abstract
Several learning problems involve solving min-max problems, e.g., empirical distributional robust learning or learning with non-standard aggregated losses. More specifically, these problems are convex-linear problems where the minimization is carried out over the model parameters and the maximization over the empirical distribution of the training set indexes, where the set of admissible distributions is the simplex or a subset of it. To design efficient methods, we let an online learning algorithm play against a (combinatorial) bandit algorithm. We argue that the efficiency of such approaches critically depends on the structure of this distribution set and propose two properties of the set that facilitate designing efficient algorithms. We focus on a specific family of sets encompassing various learning applications and provide high-probability…
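The "online learner against a bandit" idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the distribution set is the full simplex, uses online gradient descent for the model parameters and plain Exp3 for the training indexes, and picks a squared loss clipped to [0, 1] as the bandit reward. The function name `minimax_sketch` and the parameters `lr` and `gamma` are hypothetical choices made for this illustration.

```python
import numpy as np

def minimax_sketch(X, y, steps=2000, lr=0.05, gamma=0.1, seed=0):
    """Toy loop: online gradient descent (min player) vs. Exp3 (max player).

    Illustrative assumptions: squared loss, full simplex, clipped rewards.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)            # model parameters (min player)
    w_avg = np.zeros(d)        # running average of the iterates
    log_weights = np.zeros(n)  # Exp3 log-weights over training indexes
    for t in range(1, steps + 1):
        # Exp3 distribution: exponential weights mixed with uniform exploration.
        z = np.exp(log_weights - log_weights.max())
        p = (1.0 - gamma) * z / z.sum() + gamma / n
        i = rng.choice(n, p=p)
        # Bandit feedback: only the loss of the sampled index is observed.
        residual = X[i] @ w - y[i]
        reward = min(0.5 * residual ** 2, 1.0)  # clipped to [0, 1] (assumption)
        # Importance-weighted Exp3 update for the sampled index only.
        log_weights[i] += gamma * (reward / p[i]) / n
        # Min player: gradient step on the sampled loss, decaying step size.
        w -= lr / np.sqrt(t) * residual * X[i]
        w_avg += (w - w_avg) / t
    return w_avg, p
```

Calling `minimax_sketch(X, y)` on a feature matrix `X` of shape `(n, d)` and targets `y` of shape `(n,)` returns the averaged parameters and the last sampling distribution over indexes. Each round touches a single training example, which is the source of the per-iteration efficiency; handling a structured subset of the simplex would require replacing Exp3 with a bandit algorithm suited to that set.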
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems
