Gaussian Process Classification Bandits
Tatsuya Hayashi, Naoki Ito, Koji Tabata, Atsuyoshi Nakamura, Katsumasa Fujita, Yoshinori Harada, Tamiki Komatsuzaki

TL;DR
This paper introduces a Gaussian process-based classification bandit framework with novel policies that improve sample efficiency in classifying arms, outperforming existing methods in synthetic and real-world experiments.
Contribution
The paper develops a new framework for Gaussian process classification bandits and two arm selection policies, FCB and FTSV, with improved sample complexity bounds and empirical performance.
Findings
FCB has a smaller sample complexity upper bound than existing level set estimation algorithms.
Rate-estimation policies outperform other policies in synthetic function experiments.
FTSV achieves the best performance on a real-world dataset.
Abstract
Classification bandits are multi-armed bandit problems in which the task is to classify a given set of arms as positive or negative, depending on whether the rate of arms with expected reward of at least h is at least w, for given thresholds h and w. We study a special classification bandit problem in which arms correspond to points x in d-dimensional real space, with expected rewards f(x) generated according to a Gaussian process prior. We develop a framework algorithm for the problem using various arm selection policies and propose policies called FCB and FTSV. We show a smaller sample complexity upper bound for FCB than for the existing level set estimation algorithm, which must decide for every arm x whether f(x) is at least h. Arm selection policies depending on an estimated rate of arms with rewards of at least h are also…
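The classification rule stated in the abstract can be illustrated with a minimal sketch. It is not the paper's FCB or FTSV policy: it only evaluates the target label when the expected rewards f(x) are fully known, which is the quantity a bandit policy would have to infer from noisy samples. The function names, the RBF kernel, the length scale, and the jitter term are assumptions made for illustration.

```python
import numpy as np


def classify_arms(expected_rewards, h, w):
    """Label a set of arms: positive if the rate of arms with reward >= h is >= w."""
    rewards = np.asarray(expected_rewards, dtype=float)
    rate = float(np.mean(rewards >= h))  # fraction of arms at or above threshold h
    label = "positive" if rate >= w else "negative"
    return label, rate


def sample_gp_prior(points, length_scale=0.2, seed=0):
    """Draw one sample f(x) from a zero-mean GP prior (RBF kernel is an assumption)."""
    x = np.asarray(points, dtype=float)
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    cov = np.exp(-0.5 * sq_dists / length_scale**2)
    cov += 1e-6 * np.eye(len(x))  # small jitter for numerical stability
    chol = np.linalg.cholesky(cov)
    rng = np.random.default_rng(seed)
    return chol @ rng.standard_normal(len(x))


# Arms are points x in d-dimensional space (here d = 1, 50 arms on [0, 1]).
points = np.linspace(0.0, 1.0, 50).reshape(-1, 1)
f = sample_gp_prior(points)
label, rate = classify_arms(f, h=0.0, w=0.5)
print(label, rate)
```

Level set estimation would instead have to decide `f[i] >= h` for every arm individually; the classification task only needs the aggregate comparison of `rate` against `w`.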
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Machine Learning and Data Classification
Methods: Fast Attention Via Positive Orthogonal Random Features · Performer · Gaussian Process
