Good Arm Identification via Bandit Feedback

Hideaki Kano; Junya Honda; Kentaro Sakamaki; Kentaro Matsuura; Atsuyoshi Nakamura; Masashi Sugiyama

arXiv:1710.06360·stat.ML·February 13, 2018·Mach. Learn.

TL;DR

This paper introduces the good arm identification problem in stochastic bandits, focusing on efficiently finding arms with expected rewards above a threshold, and proposes algorithms with near-optimal sample complexity.

Contribution

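The paper formulates the good arm identification (GAI) problem, analyzes its exploration-exploitation dilemma of confidence, derives a lower bound on sample complexity that is tight up to a logarithmic factor, and develops algorithms that nearly match this bound.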

Findings

01

Proposed algorithms outperform naive methods in synthetic experiments.

02

Derived a lower bound on the sample complexity of GAI that is tight up to a logarithmic factor.

03

Validated effectiveness in clinical trial simulations.

Abstract

We consider a novel stochastic multi-armed bandit problem called good arm identification (GAI), where a good arm is defined as an arm with expected reward greater than or equal to a given threshold. GAI is a pure-exploration problem in which a single agent repeats a process of outputting an arm as soon as it is identified as a good one, before confirming that the other arms are actually not good. The objective of GAI is to minimize the number of samples for each process. We find that GAI faces a new kind of dilemma, the exploration-exploitation dilemma of confidence, which poses a different difficulty from best arm identification. As a result, an efficient design of algorithms for GAI is quite different from that for best arm identification. We derive a lower bound on the sample complexity of GAI that is tight up to the logarithmic factor $O(\log \frac{1}{\delta})$ for…
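To make the problem setting in the abstract concrete, here is a minimal sketch of a naive baseline, not the algorithm proposed in the paper. It assumes Bernoulli rewards, samples the remaining arms in round-robin order, and uses Hoeffding confidence bounds with a crude union bound. The function name `naive_gai` and the parameters `means`, `threshold`, `delta`, `max_pulls`, and `seed` are all illustrative assumptions.

```python
import math
import random


def naive_gai(means, threshold, delta, max_pulls=200_000, seed=0):
    """Naive round-robin sketch of good arm identification (illustrative only).

    Assumes Bernoulli arms with the given true means (hidden from the
    decision rule). Returns (good arms in the order they were output,
    total number of pulls used).
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    active = set(range(k))   # arms not yet classified
    good = []                # arms output as good, in order
    total = 0

    while active and total < max_pulls:
        for arm in sorted(active):  # sorted() copies, so removal is safe
            reward = 1.0 if rng.random() < means[arm] else 0.0
            counts[arm] += 1
            sums[arm] += reward
            total += 1

            n = counts[arm]
            mean_hat = sums[arm] / n
            # Hoeffding radius; the 4*k*n^2 term is a crude union bound
            # over arms and pull counts (an assumption of this sketch).
            radius = math.sqrt(math.log(4 * k * n * n / delta) / (2 * n))

            if mean_hat - radius >= threshold:
                good.append(arm)      # output as good immediately
                active.discard(arm)
            elif mean_hat + radius < threshold:
                active.discard(arm)   # confidently not good

    return good, total
```

Calling `naive_gai([0.9, 0.2, 0.8, 0.1], threshold=0.5, delta=0.05)` returns the arms output as good together with the number of pulls used. Uniform sampling ignores the dilemma of confidence described in the abstract: it spends as many pulls on clearly bad arms as on arms that could be output as good sooner.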
