$k\texttt{-experts}$ -- Online Policies and Fundamental Limits
Samrat Mukhopadhyay, Sourav Sahoo, Abhishek Sinha

TL;DR
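This paper introduces the $k$-experts problem, a generalization of the classic Prediction with Expert Advice framework in which the learner selects a subset of $k$ experts at each round. It proposes SAGE (Sampled Hedge), a sampling-based framework that achieves new or improved sublinear regret bounds, and it separately characterizes the mistake bounds achievable for stable loss functions.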
Contribution
The paper presents SAGE, a sampling-based framework for designing online learning policies in the $k$-experts setting. It provides new regret guarantees, characterizes mistake bounds for stable loss functions, and establishes a tight regret lower bound for a variant of the problem.
Findings
For a wide class of reward functions, SAGE either achieves the first sublinear regret guarantee or improves upon existing ones.
The paper characterizes mistake bounds for stable loss functions.
A tight regret lower bound is established for a variant of the problem.
Abstract
We introduce the \texttt{k-experts} problem - a generalization of the classic Prediction with Expert's Advice framework. Unlike the classic version, where the learner selects exactly one expert from a pool of experts at each round, in this problem, the learner can select a subset of $k$ experts at each round. The reward obtained by the learner at each round is assumed to be a function of the $k$ selected experts. The primary objective is to design an online learning policy with a small regret. In this pursuit, we propose \texttt{SAGE} (Sampled Hedge) - a framework for designing efficient online learning policies by leveraging statistical sampling techniques. For a wide class of reward functions, we show that \texttt{SAGE} either achieves the first sublinear regret guarantee or improves upon the existing ones. Furthermore, going beyond…
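To make the setting in the abstract concrete, the following is a minimal sketch of the $k$-experts interaction loop and of regret against the best fixed subset in hindsight. It is not the paper's SAGE algorithm: the learner here is a naive baseline that keeps Hedge-style exponential weights on individual experts and samples $k$ of them sequentially without replacement. The sum-of-selected-rewards reward function, the full-information feedback, the step size `eta`, and all function names are assumptions made only for illustration.

```python
import math
import random


def best_fixed_subset_reward(rewards, k):
    """Cumulative reward of the best fixed k-subset in hindsight.

    Assumes a modular reward (sum of the selected experts' rewards),
    so the best subset is simply the k experts with the largest totals.
    """
    n = len(rewards[0])
    totals = [sum(row[i] for row in rewards) for i in range(n)]
    return sum(sorted(totals, reverse=True)[:k])


def sample_k_without_replacement(weights, k, rng):
    """Draw k distinct indices, each pick proportional to the remaining weights."""
    remaining = list(range(len(weights)))
    chosen = []
    for _ in range(k):
        w = [weights[i] for i in remaining]
        pick = rng.choices(remaining, weights=w, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


def run_k_experts(rewards, k, eta=0.1, seed=0):
    """Play T rounds; return (learner's cumulative reward, regret).

    rewards[t][i] is the reward of expert i at round t, assumed in [0, 1].
    Illustrative baseline only, not the SAGE policy from the paper.
    """
    rng = random.Random(seed)
    n = len(rewards[0])
    log_w = [0.0] * n  # log-weights, kept in log space for numerical stability
    learner_total = 0.0
    for row in rewards:
        m = max(log_w)
        weights = [math.exp(v - m) for v in log_w]
        chosen = sample_k_without_replacement(weights, k, rng)
        learner_total += sum(row[i] for i in chosen)
        # Full-information update: every expert's reward is revealed.
        log_w = [v + eta * r for v, r in zip(log_w, row)]
    regret = best_fixed_subset_reward(rewards, k) - learner_total
    return learner_total, regret
```

For example, calling `run_k_experts(rewards, k=2)` on a list of per-round reward vectors returns the learner's cumulative reward together with its regret relative to the best pair of experts in hindsight. For very long horizons the exponential weights of weak experts can underflow to zero, so a practical version would need a floor on the weights.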
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems
