Lenient Regret and Good-Action Identification in Gaussian Process Bandits

Xu Cai; Selwyn Gomes; Jonathan Scarlett

arXiv:2102.05793·stat.ML·May 27, 2021


TL;DR

This paper studies lenient regret notions for Gaussian process bandits, giving upper and lower regret bounds along with practical algorithms that identify a good action under a relaxed "good enough" threshold criterion.

Contribution

The paper analyzes several lenient regret notions, provides upper bounds and algorithm-independent lower bounds, and introduces good-action identification algorithms that exploit knowledge of the threshold.

Findings

01 Upper bounds on the lenient regret for GP-UCB and an elimination algorithm

02 Algorithm-independent lower bounds on the lenient regret

03 Good-action identification algorithms that exploit knowledge of the threshold
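As a rough illustration of the third finding, the sketch below runs a generic GP-UCB loop on a one-dimensional grid and stops as soon as some point's lower confidence bound clears a known threshold. This is a minimal sketch under stated assumptions, not the authors' algorithm: the RBF kernel, the fixed confidence width `beta`, the stopping rule, the test function, and every function name here are hypothetical choices made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    # Squared-exponential kernel with unit prior variance (assumed choice).
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_obs, y_obs, x_grid, noise_var):
    # Zero-mean GP posterior mean and standard deviation on the grid.
    K = rbf_kernel(x_obs, x_obs) + noise_var * np.eye(len(x_obs))
    K_s = rbf_kernel(x_obs, x_grid)
    mean = K_s.T @ np.linalg.solve(K, y_obs)
    var = 1.0 - np.sum(K_s * np.linalg.solve(K, K_s), axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def find_good_action(f, x_grid, threshold, beta=2.0, max_rounds=50,
                     noise_std=0.01, seed=0):
    # beta is a fixed, hypothetical confidence width; theoretical analyses
    # typically use a time-dependent value instead.
    rng = np.random.default_rng(seed)
    x_obs = np.array([x_grid[0]])
    y_obs = np.array([f(x_obs[0]) + noise_std * rng.standard_normal()])
    for t in range(1, max_rounds + 1):
        mean, std = gp_posterior(x_obs, y_obs, x_grid, noise_std ** 2)
        lcb = mean - beta * std
        best = int(np.argmax(lcb))
        if lcb[best] >= threshold:
            # Some point is confidently above the threshold: stop early.
            return x_grid[best], t
        # Otherwise sample the point with the highest upper confidence bound.
        nxt = x_grid[int(np.argmax(mean + beta * std))]
        x_obs = np.append(x_obs, nxt)
        y_obs = np.append(y_obs, f(nxt) + noise_std * rng.standard_normal())
    return None, max_rounds

def toy_f(x):
    # Hypothetical objective used only for this demonstration.
    return np.sin(3.0 * x)

grid = np.linspace(0.0, 1.0, 101)
x_good, rounds = find_good_action(toy_f, grid, threshold=0.8)
print(x_good, rounds)
```

The point of the early exit is that the loop does not keep refining its estimate of the maximizer once a sufficiently good point has been certified; with a fixed `beta` that certificate is only heuristic.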

Abstract

In this paper, we study the problem of Gaussian process (GP) bandits under relaxed optimization criteria stating that any function value above a certain threshold is "good enough". On the theoretical side, we study various lenient regret notions in which all near-optimal actions incur zero penalty, and provide upper bounds on the lenient regret for GP-UCB and an elimination algorithm, circumventing the usual $O(\sqrt{T})$ term (with time horizon $T$) resulting from zooming extremely close towards the function maximum. In addition, we complement these upper bounds with algorithm-independent lower bounds. On the practical side, we consider the problem of finding a single "good action" according to a known pre-specified threshold, and introduce several good-action identification algorithms that exploit knowledge of the threshold. We experimentally find that such algorithms can often…
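To make the idea of a lenient regret concrete, the following Python sketch compares standard cumulative regret with three lenient variants in which any gap of at most a tolerance `eps` costs nothing. The penalty shapes (`indicator`, `large_gap`, `hinge`), the function names, and the toy values are illustrative assumptions that do not come from the abstract, and the paper's exact definitions may differ.

```python
def standard_regret(values, f_max):
    # Sum of gaps between the maximum and each sampled function value.
    return sum(f_max - v for v in values)

def lenient_regret(values, f_max, eps, kind="hinge"):
    # Gaps of at most eps are treated as "good enough" and cost nothing.
    total = 0.0
    for v in values:
        gap = f_max - v
        if gap <= eps:
            continue
        if kind == "indicator":
            total += 1.0          # count each bad round once
        elif kind == "large_gap":
            total += gap          # pay the full gap when it is large
        elif kind == "hinge":
            total += gap - eps    # pay only the excess over eps
        else:
            raise ValueError(f"unknown kind: {kind}")
    return total

# Hypothetical sampled function values, with maximum 1.0 and tolerance 0.25.
values = [0.25, 0.5, 0.875, 1.0]
f_max, eps = 1.0, 0.25

print(standard_regret(values, f_max))                  # 1.375
print(lenient_regret(values, f_max, eps, "indicator")) # 2.0
print(lenient_regret(values, f_max, eps, "large_gap")) # 1.25
print(lenient_regret(values, f_max, eps, "hinge"))     # 0.75
```

In this toy run the last two samples fall within `eps` of the maximum, so they add to the standard regret but contribute nothing to any of the lenient variants.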


Code & Models

Repositories

caitree/GoodAction (PyTorch, official)

Videos

Lenient Regret and Good-Action Identification in Gaussian Process Bandits · SlidesLive

Taxonomy

Topics: Advanced Bandit Algorithms Research · Gaussian Processes and Bayesian Inference · Advanced Multi-Objective Optimization Algorithms

Methods: Gaussian Process