Online Subset Selection using $\alpha$-Core with no Augmented Regret

Sourav Sahoo, Siddhant Chaudhary, Samrat Mukhopadhyay, and Abhishek Sinha

arXiv:2209.14222·cs.LG·February 10, 2023

TL;DR

This paper introduces SCore, an online subset selection policy based on the $\alpha$-Core concept, providing guarantees for a broad class of reward functions without augmented regret, and extending to optimistic learning scenarios.

Contribution

The paper proposes the SCore policy, built on a new $\alpha$-Core characterization, offering efficient optimization for many reward functions and extending to optimistic learning with hints.

Findings

01. SCore achieves strong performance guarantees for submodular rewards.

02. The $\alpha$-Core characterization generalizes core concepts to online subset selection.

03. The policy extends to settings with additional untrusted reward hints.

Abstract

We revisit the classic problem of optimal subset selection in the online learning set-up. Assume that the set $[N]$ consists of $N$ distinct elements. On the $t$-th round, an adversary chooses a monotone reward function $f_{t} : 2^{[N]} \to \mathbb{R}_{+}$ that assigns a non-negative reward to each subset of $[N]$. An online policy selects (perhaps randomly) a subset $S_{t} \subseteq [N]$ consisting of $k$ elements before the reward function $f_{t}$ for the $t$-th round is revealed to the learner. As a consequence of its choice, the policy receives a reward of $f_{t}(S_{t})$ on the $t$-th round. Our goal is to design an online sequential subset selection policy to maximize the expected cumulative reward accumulated over a time horizon. In this connection, we propose an online learning policy called SCore (Subset Selection with Core) that solves the problem for a large class of reward functions. The…
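The interaction protocol described in the abstract can be illustrated with a minimal Python sketch. It is not the SCore policy: the learner here is a uniformly random baseline, used only to show the order of play (the subset $S_t$ is committed before $f_t$ is revealed). The names `coverage_reward`, `uniform_policy`, and `play` are hypothetical, the coverage reward is just one example of a monotone function, and elements are indexed `0..N-1` rather than `1..N`.

```python
import random
from typing import Callable, FrozenSet, Iterable

# A reward function maps a subset of elements to a non-negative reward.
RewardFn = Callable[[FrozenSet[int]], float]


def coverage_reward(requested: FrozenSet[int]) -> RewardFn:
    """Example monotone reward (an assumption, not from the paper):
    the number of requested elements contained in the chosen subset."""
    def f(subset: FrozenSet[int]) -> float:
        return float(len(subset & requested))
    return f


def uniform_policy(n: int, k: int, rng: random.Random) -> FrozenSet[int]:
    """Placeholder learner: pick k of the n elements uniformly at random."""
    return frozenset(rng.sample(range(n), k))


def play(n: int, k: int, reward_fns: Iterable[RewardFn], rng: random.Random) -> float:
    """Run the online protocol and return the cumulative reward."""
    total = 0.0
    for f_t in reward_fns:
        s_t = uniform_policy(n, k, rng)  # S_t is chosen without looking at f_t
        total += f_t(s_t)                # f_t is revealed only after the choice
    return total
```

For instance, `play(5, 2, fns, random.Random(0))` with ten rounds of `coverage_reward(frozenset(range(5)))` accumulates a reward of 2 per round, since every chosen element is requested. A real policy would replace `uniform_policy` with a rule that uses the previously revealed reward functions.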

Taxonomy

Topics: Advanced Bandit Algorithms Research · Game Theory and Applications · Reinforcement Learning in Robotics