Combinatorial Pure Exploration with Bottleneck Reward Function

Yihan Du; Yuko Kuroki; Wei Chen

arXiv:2102.12094 · cs.LG · October 27, 2021 · 1 citation

TL;DR

This paper investigates the combinatorial pure exploration problem with bottleneck rewards, proposing new algorithms for fixed-confidence and fixed-budget settings that are statistically optimal and empirically superior.

Contribution

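The paper introduces novel algorithms tailored to CPE-B (combinatorial pure exploration with bottleneck rewards) that address its bottleneck-specific challenges, achieving optimal sample complexity in the fixed-confidence setting and error guarantees in the fixed-budget setting.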

Findings

01 Proposed algorithms with optimal sample complexity for the fixed-confidence (FC) setting.

02 State-of-the-art error guarantees for the fixed-budget (FB) setting.

03 Empirical validation showing superiority over baselines.

Abstract

In this paper, we study the Combinatorial Pure Exploration problem with the Bottleneck reward function (CPE-B) under the fixed-confidence (FC) and fixed-budget (FB) settings. In CPE-B, given a set of base arms and a collection of subsets of base arms (super arms) following a certain combinatorial constraint, a learner sequentially plays a base arm and observes its random reward, with the objective of finding the optimal super arm with the maximum bottleneck value, defined as the minimum expected reward of the base arms contained in the super arm. CPE-B captures a variety of practical scenarios such as network routing in communication networks, and its unique challenges fall on how to utilize the bottleneck property to save samples and achieve the statistical optimality. None of the existing CPE studies (most of them assume linear rewards) can be adapted to solve such challenges,…
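
To make the objective in the abstract concrete, here is a minimal Python sketch of the bottleneck value and the optimal super arm. It assumes the expected rewards are already known and the super arms are listed explicitly, so it illustrates only the definition, not the paper's sampling algorithms. The names `bottleneck_value`, `best_super_arm`, `means`, and `paths`, and the numeric values, are hypothetical and chosen for illustration.

```python
from typing import Dict, FrozenSet, Iterable


def bottleneck_value(super_arm: FrozenSet[str], means: Dict[str, float]) -> float:
    """Bottleneck value: the minimum expected reward among the base arms in the super arm."""
    return min(means[arm] for arm in super_arm)


def best_super_arm(
    super_arms: Iterable[FrozenSet[str]], means: Dict[str, float]
) -> FrozenSet[str]:
    """Return the super arm with the maximum bottleneck value (ties broken by order)."""
    return max(super_arms, key=lambda s: bottleneck_value(s, means))


# Hypothetical routing instance: base arms are links, super arms are paths.
# The means are assumed known here; in CPE-B they must be learned by sampling.
means = {"a": 0.9, "b": 0.4, "c": 0.6, "d": 0.7}
paths = [frozenset({"a", "b"}), frozenset({"c", "d"})]

best = best_super_arm(paths, means)
```

In this toy instance the path through `a` and `b` contains the single best link but is limited by `b`, so the path through `c` and `d` has the larger bottleneck value. This is the property a learner can exploit: only the weakest arm of each super arm determines its value.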

Videos

Combinatorial Pure Exploration with Bottleneck Reward Function · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems

Methods: Collaborative Preference Embedding