Constrained Best Arm Identification in Grouped Bandits

Sahil Dharod; Malyala Preethi Sravani; Sakshi Heda; Sharayu Moharir

arXiv:2412.08031 · cs.LG · December 12, 2024

Open Access

TL;DR

This paper addresses the challenge of identifying the best feasible arm in grouped bandits with attribute-based constraints, proposing a near-optimal policy with theoretical guarantees and empirical validation.

Contribution

It introduces a new constrained bandit model with attribute-based feasibility, characterizes fundamental performance limits, and develops a near-optimal confidence interval-based policy.

Findings

01

The proposed policy is near-optimal in the fixed-confidence setting.

02

Analytical guarantees are provided for the policy's performance.

03

Simulations show the policy outperforms modified action elimination methods.

Abstract

We study a grouped bandit setting where each arm comprises multiple independent sub-arms referred to as attributes. Each attribute of each arm has an independent stochastic reward. We impose the constraint that for an arm to be deemed feasible, the mean reward of all its attributes should exceed a specified threshold. The goal is to find the arm with the highest mean reward averaged across attributes among the set of feasible arms in the fixed confidence setting. We first characterize a fundamental limit on the performance of any policy. Following this, we propose a near-optimal confidence interval-based policy to solve this problem and provide analytical guarantees for the policy. We compare the performance of the proposed policy with that of two suitably modified versions of action elimination via simulations.
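The problem statement in the abstract can be made concrete with a small sketch. The code below is not the paper's policy; it only illustrates the objective (the feasible arm with the highest attribute-averaged mean) and how a confidence interval can be used to label an arm as feasible, infeasible, or undecided. All function names are hypothetical, rewards are assumed to lie in [0, 1], "exceed" is read as a strict inequality, and the Hoeffding radius is a standard per-estimate bound chosen for illustration rather than the interval used by the authors.

```python
import math


def best_feasible_arm(means, threshold):
    """Ground-truth target: index of the feasible arm with the highest
    mean reward averaged across attributes, or None if no arm is feasible.

    `means[i][j]` is the true mean of attribute j of arm i (assumed known
    here only to define the objective).
    """
    best, best_avg = None, -math.inf
    for i, attrs in enumerate(means):
        # An arm is feasible only if every attribute mean exceeds the threshold.
        if all(m > threshold for m in attrs):
            avg = sum(attrs) / len(attrs)
            if avg > best_avg:
                best, best_avg = i, avg
    return best


def hoeffding_radius(n_samples, delta):
    """Two-sided Hoeffding radius for the mean of n_samples rewards in [0, 1].

    Assumption: a single estimate at confidence 1 - delta, with no union
    bound over arms, attributes, or rounds.
    """
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n_samples))


def classify_arm(empirical_means, radius, threshold):
    """Label one arm from its per-attribute empirical means and a common radius."""
    # Some attribute's upper confidence bound is below the threshold.
    if any(m + radius < threshold for m in empirical_means):
        return "infeasible"
    # Every attribute's lower confidence bound is above the threshold.
    if all(m - radius > threshold for m in empirical_means):
        return "feasible"
    return "unknown"


if __name__ == "__main__":
    # Arm 0 has the highest average but one attribute below the threshold.
    true_means = [[0.95, 0.95, 0.20], [0.60, 0.70, 0.65], [0.50, 0.55, 0.60]]
    print(best_feasible_arm(true_means, threshold=0.4))
    print(classify_arm([0.9, 0.55], radius=0.1, threshold=0.5))
```

In this example arm 0 has the largest attribute-averaged mean but fails the constraint on its third attribute, so the target is arm 1. This is the situation that separates the constrained problem from ordinary best arm identification.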

Taxonomy

Topics: Advanced Bandit Algorithms Research · Auction Theory and Applications · Optimization and Search Problems

Methods: Sparse Evolutionary Training