Online Subset Selection using $\alpha$-Core with no Augmented Regret
Sourav Sahoo, Siddhant Chaudhary, Samrat Mukhopadhyay, and Abhishek Sinha

TL;DR
This paper introduces SCore, an online subset selection policy based on the $\alpha$-Core concept, providing guarantees for a broad class of reward functions without augmented regret, and extending to optimistic learning scenarios.
Contribution
The paper proposes the SCore policy utilizing a new $\alpha$-Core characterization, offering efficient optimization for many reward functions and extending to optimistic learning with hints.
Findings
SCore achieves strong performance guarantees for submodular rewards.
The $\alpha$-Core characterization generalizes core concepts to online subset selection.
The policy extends to settings with additional untrusted reward hints.
Abstract
We revisit the classic problem of optimal subset selection in the online learning set-up. Assume that the set $[N]$ consists of $N$ distinct elements. On the $t$th round, an adversary chooses a monotone reward function $f_t: 2^{[N]} \to \mathbb{R}_+$ that assigns a non-negative reward to each subset of $[N]$. An online policy selects (perhaps randomly) a subset $S_t \subseteq [N]$ consisting of $k$ elements before the reward function $f_t$ for the $t$th round is revealed to the learner. As a consequence of its choice, the policy receives a reward of $f_t(S_t)$ on the $t$th round. Our goal is to design an online sequential subset selection policy to maximize the expected cumulative reward accumulated over a time horizon. In this connection, we propose an online learning policy called SCore (Subset Selection with Core) that solves the problem for a large class of reward functions. The…
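The interaction protocol described in the abstract can be sketched as a short simulation. This is only an illustration of the round structure, writing `n` for the number of elements and `k` for the subset size: the learner commits to a `k`-element subset before the round's reward function is applied to it. The uniformly random policy below is a placeholder and is not the SCore policy; the helper names `make_additive_reward`, `random_policy`, and `play` are hypothetical and do not come from the paper.

```python
import random
from typing import Callable, FrozenSet, List

# A reward function maps a subset of {0, ..., n-1} to a non-negative reward.
Reward = Callable[[FrozenSet[int]], float]


def make_additive_reward(weights: List[float]) -> Reward:
    """Hypothetical example reward: the total weight of the chosen elements.

    With non-negative weights this function is monotone, since adding an
    element to the subset can never decrease the reward.
    """
    def f(subset: FrozenSet[int]) -> float:
        return sum(weights[i] for i in subset)
    return f


def random_policy(n: int, k: int, rng: random.Random) -> FrozenSet[int]:
    """Placeholder policy (NOT SCore): pick k of the n elements uniformly at random."""
    return frozenset(rng.sample(range(n), k))


def play(n: int, k: int, rewards: List[Reward], seed: int = 0) -> float:
    """Run the online protocol and return the cumulative reward."""
    rng = random.Random(seed)
    total = 0.0
    for f_t in rewards:
        # The subset is selected before f_t is evaluated, mirroring the
        # requirement that the learner does not see the round's reward first.
        s_t = random_policy(n, k, rng)
        total += f_t(s_t)
    return total


if __name__ == "__main__":
    # Five rounds over four elements with changing weights, chosen arbitrarily.
    rounds = [make_additive_reward([1.0, 0.0, 2.0, 0.5]) for _ in range(5)]
    print(play(n=4, k=2, rewards=rounds))
```

Replacing `random_policy` with a learning rule that uses the rewards observed in earlier rounds is where an actual online policy would plug in.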
Taxonomy
Topics: Advanced Bandit Algorithms Research · Game Theory and Applications · Reinforcement Learning in Robotics
