Combinatorial Multi-Armed Bandits with Filtered Feedback

James A. Grant; David S. Leslie; Kevin Glazebrook; Roberto Szechtman

arXiv:1705.09605 · cs.LG · May 29, 2017 · 1 cites

TL;DR

This paper introduces an algorithm for combinatorial multi-armed bandits with filtered semibandit feedback and heavy-tailed rewards. The setting is motivated by search and detection problems, and the algorithm comes with theoretical regret bounds.

Contribution

It proposes Robust-F-CUCB, an upper confidence bound algorithm tailored for filtered feedback and heavy-tailed rewards in CMAB problems, with proven logarithmic regret bounds.

Findings

01. Algorithm achieves near-optimal regret bounds.

02. Handles heavy-tailed reward distributions effectively.

03. Applicable to search and detection scenarios with filtered feedback.

Abstract

Motivated by problems in search and detection we present a solution to a Combinatorial Multi-Armed Bandit (CMAB) problem with both heavy-tailed reward distributions and a new class of feedback, filtered semibandit feedback. In a CMAB problem an agent pulls a combination of arms from a set $\{1, \dots, k\}$ in each round, generating random outcomes from probability distributions associated with these arms and receiving an overall reward. Under semibandit feedback it is assumed that the random outcomes generated are all observed. Filtered semibandit feedback allows the outcomes that are observed to be sampled from a second distribution conditioned on the initial random outcomes. This feedback mechanism is valuable as it allows CMAB methods to be applied to sequential search and detection problems where combinatorial actions are made, but the true rewards (number of objects of interest…
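The difference between semibandit and filtered semibandit feedback can be illustrated with a small simulation. The sketch below is a toy model, not the paper's method. It assumes the true outcome of each pulled arm is a count of objects, and that each object is detected independently with some arm-specific probability. The names `play_round`, `appear_prob`, `detect_prob`, and `n_sites` are all hypothetical, and the bounded counts used here are not heavy-tailed.

```python
import random


def binomial(n, p, rng):
    # Number of successes in n independent Bernoulli(p) trials.
    return sum(rng.random() < p for _ in range(n))


def play_round(action, appear_prob, detect_prob, rng, n_sites=10):
    """Simulate one round for the arms listed in `action`.

    Returns (true_outcomes, observed_outcomes), both dicts keyed by arm.
    All distributions here are illustrative assumptions.
    """
    true_outcomes, observed = {}, {}
    for arm in action:
        # First distribution: the true outcome, here a count of objects.
        x = binomial(n_sites, appear_prob[arm], rng)
        # Second distribution, conditioned on x: each object is detected
        # independently, so the agent sees only a thinned count.
        y = binomial(x, detect_prob[arm], rng)
        true_outcomes[arm] = x
        observed[arm] = y
    return true_outcomes, observed


rng = random.Random(0)
appear_prob = [0.2, 0.5, 0.8, 0.4]   # hypothetical per-arm object rates
detect_prob = [0.9, 0.6, 0.3, 1.0]   # hypothetical per-arm detection rates
truth, seen = play_round([0, 2, 3], appear_prob, detect_prob, rng)
print("true outcomes:    ", truth)
print("observed outcomes:", seen)
```

Under plain semibandit feedback the agent would see `truth` directly. Under filtered feedback it sees only `seen`, which in this toy model never exceeds the true count and matches it exactly only when the detection probability is 1.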

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Optimization and Search Problems