Time-Varying Gaussian Process Bandit Optimization

Ilija Bogunovic; Jonathan Scarlett; Volkan Cevher

arXiv:1601.06650·stat.ML·January 26, 2016·35 cites

TL;DR

This paper introduces two extensions of Gaussian process bandit algorithms for time-varying reward functions, providing regret bounds and demonstrating improved performance over classical methods on synthetic and real data.

Contribution

The paper proposes two algorithms, R-GP-UCB and TV-GP-UCB, and proves regret bounds for both in settings where the reward function varies over time.

Findings

01

TV-GP-UCB outperforms R-GP-UCB in practice.

02

Both algorithms outperform classical GP-UCB.

03

The regret bounds explicitly characterize the trade-off between the time horizon and the rate at which the function varies.

Abstract

We consider the sequential Bayesian optimization problem with bandit feedback, adopting a formulation that allows for the reward function to vary with time. We model the reward function using a Gaussian process whose evolution obeys a simple Markov model. We introduce two natural extensions of the classical Gaussian process upper confidence bound (GP-UCB) algorithm. The first, R-GP-UCB, resets GP-UCB at regular intervals. The second, TV-GP-UCB, instead forgets about old data in a smooth fashion. Our main contribution consists of novel regret bounds for these algorithms, providing an explicit characterization of the trade-off between the time horizon and the rate at which the function varies. We illustrate the performance of the algorithms on both synthetic and real data, and we find the gradual forgetting of TV-GP-UCB to perform favorably compared to the sharp resetting of R-GP-UCB.…
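
The two forgetting strategies described in the abstract can be illustrated with a minimal sketch on a one-dimensional grid. Everything here is an assumption made for illustration and is not taken from the abstract: the squared-exponential kernel, the function names `rbf_kernel`, `tv_posterior`, and `select_point`, the parameters `eps`, `beta`, `noise_var`, and `reset_every`, and in particular the temporal decay factor `(1 - eps) ** (|t_i - t_j| / 2)` used to down-weight old observations. It should be read as one plausible instance of smooth forgetting versus periodic resetting, not as the authors' implementation.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel with unit prior variance (assumed choice)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def tv_posterior(x_obs, y_obs, t_obs, x_grid, t_now, eps, noise_var=0.01):
    """Posterior mean and std on x_grid at time t_now.

    Correlations between observations taken at times t_i and t_j are
    multiplied by (1 - eps) ** (|t_i - t_j| / 2), an assumed decay that
    makes old data count less. eps = 0 recovers a stationary GP posterior.
    """
    x_obs = np.asarray(x_obs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    t_obs = np.asarray(t_obs, dtype=float)
    if x_obs.size == 0:
        # No data: return the prior (zero mean, unit std).
        return np.zeros(len(x_grid)), np.ones(len(x_grid))
    decay_obs = (1.0 - eps) ** (np.abs(t_obs[:, None] - t_obs[None, :]) / 2.0)
    K = rbf_kernel(x_obs, x_obs) * decay_obs + noise_var * np.eye(x_obs.size)
    decay_new = (1.0 - eps) ** (np.abs(t_now - t_obs) / 2.0)
    k_star = rbf_kernel(x_grid, x_obs) * decay_new[None, :]
    sol = np.linalg.solve(K, k_star.T)          # shape (n_obs, n_grid)
    mean = sol.T @ y_obs
    var = 1.0 - np.sum(k_star * sol.T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def select_point(x_obs, y_obs, t_obs, x_grid, t_now, eps,
                 beta=4.0, reset_every=None):
    """Return the grid index maximizing mean + sqrt(beta) * std.

    reset_every=None  -> smooth forgetting via the decayed kernel.
    reset_every=N     -> sharp resetting: discard all data from before the
                         current block of N rounds and use no decay.
    """
    x_obs = np.asarray(x_obs, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    t_obs = np.asarray(t_obs, dtype=float)
    if reset_every is not None:
        block_start = (t_now // reset_every) * reset_every
        keep = t_obs >= block_start
        x_obs, y_obs, t_obs = x_obs[keep], y_obs[keep], t_obs[keep]
        eps = 0.0
    mean, std = tv_posterior(x_obs, y_obs, t_obs, x_grid, t_now, eps)
    return int(np.argmax(mean + np.sqrt(beta) * std))
```

In a bandit loop, one would call `select_point` once per round with zero-based round index `t_now`, query the reward at the chosen grid point, and append the result to the observation lists. With smooth forgetting the posterior uncertainty at a previously sampled point grows back toward the prior as the observation ages, whereas with resetting it jumps back to the prior at the start of each block.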

Taxonomy

Topics: Advanced Bandit Algorithms Research · Gaussian Processes and Bayesian Inference · Advanced Multi-Objective Optimization Algorithms

Methods: Gaussian Process