Gaussian Process Classification Bandits

Tatsuya Hayashi; Naoki Ito; Koji Tabata; Atsuyoshi Nakamura; Katsumasa Fujita; Yoshinori Harada; Tamiki Komatsuzaki

arXiv:2212.13157 · cs.LG · December 27, 2022

TL;DR

This paper introduces a framework algorithm for classification bandits under a Gaussian process prior, together with new arm selection policies (FCB and FTSV) that reduce the number of samples needed to classify a set of arms and outperform existing methods in synthetic and real-world experiments.

Contribution

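The paper develops a framework algorithm for Gaussian process classification bandits and two arm selection policies, FCB and FTSV. FCB has a smaller sample complexity upper bound than the existing level set estimation algorithm, and FTSV performs best on a real-world dataset.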

Findings

1. FCB has a smaller sample complexity upper bound than the existing level set estimation algorithm.

2. Arm selection policies that depend on an estimated rate of arms with rewards of at least h outperform other policies in synthetic function experiments.

3. FTSV achieves the best performance on a real-world dataset.

Abstract

Classification bandits are multi-armed bandit problems in which the task is to classify a given set of arms as positive or negative, depending on whether the rate of arms with expected reward at least h is at least w, for given thresholds h and w. We study a special classification bandit problem in which arms correspond to points x in d-dimensional real space, with expected rewards f(x) generated according to a Gaussian process prior. We develop a framework algorithm for the problem using various arm selection policies, and propose policies called FCB and FTSV. We show a smaller sample complexity upper bound for FCB than for the existing level set estimation algorithm, which must decide for every arm x whether f(x) is at least h. Arm selection policies depending on an estimated rate of arms with rewards of at least h are also…
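The decision rule in the abstract can be made concrete with a small sketch. The code below is illustrative only and is not the paper's FCB or FTSV policy: the function names `true_class` and `class_from_bounds` are hypothetical, and the per-arm lower and upper bounds are assumed to come from some confidence interval on f(x) (for example, one derived from a Gaussian process posterior). It shows the labeling rule for thresholds h and w, and why the class of the whole set can sometimes be fixed while individual arms remain undecided, which level set estimation does not allow.

```python
from typing import Optional, Sequence


def true_class(expected_rewards: Sequence[float], h: float, w: float) -> str:
    """Label of the arm set when every expected reward f(x) is known."""
    rate = sum(1 for f in expected_rewards if f >= h) / len(expected_rewards)
    return "positive" if rate >= w else "negative"


def class_from_bounds(
    lower: Sequence[float], upper: Sequence[float], h: float, w: float
) -> Optional[str]:
    """Label implied by per-arm bounds on f(x), or None if still undecided.

    Assumes lower[i] <= f(x_i) <= upper[i] holds for every arm (hypothetical
    confidence bounds, not the specific bounds used in the paper).
    """
    n = len(lower)
    surely_at_least_h = sum(1 for lo in lower if lo >= h)
    surely_below_h = sum(1 for up in upper if up < h)
    if surely_at_least_h >= w * n:
        # Enough arms are certainly at least h: the rate is at least w.
        return "positive"
    if surely_below_h > (1 - w) * n:
        # Too many arms are certainly below h: the rate must be below w.
        return "negative"
    return None


if __name__ == "__main__":
    h, w = 0.5, 0.5
    print(true_class([0.9, 0.8, 0.2, 0.1], h, w))
    # The third arm's interval [-1.0, 1.0] still straddles h, yet the
    # class of the whole set is already determined by the first two arms.
    print(class_from_bounds([0.7, 0.6, -1.0, -1.0], [1.0, 1.0, 1.0, 0.4], h, w))
```

In this sketch a level set estimation procedure would still have to resolve the third arm, whereas the classification task can stop as soon as either counting condition holds.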

Taxonomy

Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Machine Learning and Data Classification

Methods: Fast Attention Via Positive Orthogonal Random Features · Performer · Gaussian Process