A Multi-Arm Bandit Approach To Subset Selection Under Constraints
Ayush Deva, Kumar Abhishek, Sujit Gujar

TL;DR
This paper introduces a multi-arm bandit approach for subset selection under quality and cost constraints, proposing algorithms for known and unknown qualities, with theoretical bounds and practical simulations.
Contribution
It presents a novel bandit-based algorithm for subset selection with quality constraints and compares it with an exact ILP solution and a greedy approximation.
Findings
\newalgo satisfies the average quality constraint with high probability after $\tau$ rounds.
\newalgo incurs $O(\ln T)$ regret after $T$ rounds.
DPSS provides exact solutions for known qualities.
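To make the unknown-quality setting concrete, here is a minimal sketch of a generic UCB1-style optimistic quality estimate. It is not taken from the paper: the function name `ucb_estimates`, the assumption that qualities lie in [0, 1], and the exact form of the confidence bonus are all illustrative choices, and the paper's own index may differ.

```python
import math

def ucb_estimates(reward_sums, pull_counts, t):
    """Return an optimistic quality estimate per agent, clipped to [0, 1].

    Assumptions (hypothetical, for illustration only):
    - observed rewards for each agent lie in [0, 1];
    - the bonus is the standard UCB1 term sqrt(2 ln t / n);
    - t >= 1 is the current round number.
    """
    estimates = []
    for total, n in zip(reward_sums, pull_counts):
        if n == 0:
            # An agent that was never selected gets the most optimistic value.
            estimates.append(1.0)
            continue
        mean = total / n
        bonus = math.sqrt(2.0 * math.log(t) / n)
        estimates.append(min(1.0, mean + bonus))
    return estimates
```

In a learning loop, these optimistic estimates would stand in for the unknown qualities when choosing a subset in each round; the bonus shrinks as an agent is observed more often.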
Abstract
We explore the class of problems where a central planner needs to select a subset of agents, each with different quality and cost. The planner wants to maximize its utility while ensuring that the average quality of the selected agents is above a certain threshold. When the agents' quality is known, we formulate our problem as an integer linear program (ILP) and propose a deterministic algorithm, namely DPSS, that provides an exact solution to our ILP. We then consider the setting when the qualities of the agents are unknown. We model this as a Multi-Arm Bandit (MAB) problem and propose \newalgo to learn the qualities over multiple rounds. We show that after a certain number of rounds, $\tau$, \newalgo outputs a subset of agents that satisfy the average quality constraint with a high probability. Next, we provide bounds on $\tau$ and prove that after $\tau$ rounds, the algorithm…
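The known-quality problem described in the abstract can be illustrated with a small exhaustive search. This is only a sketch under stated assumptions, not the DPSS algorithm: the utility form `reward * quality - cost` summed over selected agents, the parameter names `alpha` and `reward`, and the convention that the empty set is feasible with utility 0 are all hypothetical choices made for the example.

```python
from itertools import combinations

def best_subset(qualities, costs, alpha, reward=1.0):
    """Brute-force search for the utility-maximizing subset whose
    average quality is at least alpha.

    Assumed (hypothetical) utility: sum of reward * q_i - c_i over the
    selected agents. Runtime is exponential in the number of agents,
    so this is only suitable for tiny instances.
    """
    n = len(qualities)
    # Assumption: selecting nobody is allowed and yields utility 0.
    best_set, best_utility = (), 0.0
    for k in range(1, n + 1):
        for subset in combinations(range(n), k):
            total_quality = sum(qualities[i] for i in subset)
            # Average-quality constraint, written without division.
            if total_quality < alpha * k:
                continue
            utility = sum(reward * qualities[i] - costs[i] for i in subset)
            if utility > best_utility:
                best_set, best_utility = subset, utility
    return best_set, best_utility

# Toy instance: three agents, average-quality threshold 0.7.
chosen, value = best_subset([0.9, 0.6, 0.3], [0.2, 0.1, 0.1], alpha=0.7)
```

In the toy instance, the second agent is infeasible on its own (quality 0.6) but becomes selectable alongside the first, since their average quality is 0.75. This coupling between agents is what makes the constraint non-trivial.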
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Machine Learning and Algorithms
