No-Regret Algorithms for Time-Varying Bayesian Optimization
Xingyu Zhou, Ness Shroff

TL;DR
This paper introduces two algorithms for time-varying Bayesian optimization in which each unknown function lies in an RKHS, and derives the first frequentist dynamic-regret guarantees for this setting.
Contribution
It proposes R-GP-UCB and SW-GP-UCB algorithms with regret guarantees for time-varying functions in RKHS, bridging linear bandit and Gaussian process bandit analyses.
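For readers unfamiliar with the terms, one common way to formalize the setting is given below. The symbols $D$ (decision set), $x_t$ (point queried at round $t$), $B$ (norm bound), and $P_T$ (variation budget) are notational assumptions made here for illustration and may differ from the paper's own notation.

```latex
% Bounded RKHS norm at every round (B is an assumed symbol):
\| f_t \|_{\mathcal{H}} \le B, \qquad t = 1, \dots, T

% Variation budget: total change between consecutive functions,
% measured in the RKHS norm (P_T is an assumed symbol):
\sum_{t=1}^{T-1} \| f_{t+1} - f_t \|_{\mathcal{H}} \le P_T

% Dynamic regret: the learner is compared against the per-round maximizer:
R_T = \sum_{t=1}^{T} \Bigl( \max_{x \in D} f_t(x) - f_t(x_t) \Bigr)
```

Under this reading, a no-regret algorithm is one for which $R_T / T \to 0$ as $T$ grows, which is only achievable when $P_T$ grows sublinearly in $T$.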
Findings
First regret guarantees for dynamic RKHS-based Bayesian optimization
Algorithms recover linear bandit results with linear kernels
Complement prior regret analyses of time-varying Gaussian process bandits that rely on a Bayesian-type assumption
Abstract
In this paper, we consider the time-varying Bayesian optimization problem. The unknown function at each time is assumed to lie in an RKHS (reproducing kernel Hilbert space) with a bounded norm. We adopt the general variation budget model to capture the time-varying environment, and the variation is characterized by the change of the RKHS norm. We adapt the restart and sliding window mechanisms to introduce two GP-UCB type algorithms: R-GP-UCB and SW-GP-UCB, respectively. We derive the first (frequentist) regret guarantee on the dynamic regret for both algorithms. Our results not only recover previous linear bandit results when a linear kernel is used, but also complement the previous regret analysis of time-varying Gaussian process bandit under a Bayesian-type regularity assumption, i.e., each function is a sample from a Gaussian process.
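The sliding window idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm as specified: the RBF kernel, the fixed exploration weight `beta`, the regularizer `reg`, the 1-D grid search, and all function names below are assumptions chosen for illustration. The sketch fits a GP-style posterior to only the most recent `window` observations and queries the point with the highest upper confidence bound.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.2):
    """Squared-exponential kernel between 1-D point arrays a and b."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(x_obs, y_obs, x_grid, reg=1.0):
    """Regularized kernel posterior mean and standard deviation on x_grid."""
    if len(x_obs) == 0:
        # No data: prior mean 0 and prior std 1 (since k(x, x) = 1 for RBF).
        return np.zeros(len(x_grid)), np.ones(len(x_grid))
    K = rbf_kernel(x_obs, x_obs) + reg * np.eye(len(x_obs))
    k_star = rbf_kernel(x_obs, x_grid)           # shape (n_obs, n_grid)
    solved = np.linalg.solve(K, k_star)          # (K + reg*I)^{-1} k_star
    mean = solved.T @ y_obs
    var = 1.0 - np.sum(k_star * solved, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def sliding_window_gp_ucb(f_t, x_grid, horizon, window, beta=2.0,
                          noise=0.1, seed=0):
    """Run a UCB rule that only uses the last `window` (>= 1) observations.

    f_t(t, x_grid) is a hypothetical callback returning the true function
    values at round t; it is used to draw noisy feedback and to record regret.
    """
    rng = np.random.default_rng(seed)
    xs, ys, regrets = [], [], []
    for t in range(horizon):
        # Forget everything older than the window.
        x_obs = np.array(xs[-window:])
        y_obs = np.array(ys[-window:])
        mean, std = gp_posterior(x_obs, y_obs, x_grid)
        idx = int(np.argmax(mean + beta * std))  # upper confidence bound
        values = f_t(t, x_grid)
        ys.append(values[idx] + noise * rng.standard_normal())
        xs.append(x_grid[idx])
        regrets.append(values.max() - values[idx])  # per-round dynamic regret
    return np.array(xs), np.array(regrets)

if __name__ == "__main__":
    grid = np.linspace(0.0, 1.0, 50)
    # Hypothetical slowly drifting objective: a bump whose center moves right.
    drifting = lambda t, x: np.exp(-((x - (0.3 + 0.002 * t)) ** 2) / 0.02)
    xs, regrets = sliding_window_gp_ucb(drifting, grid, horizon=200, window=30)
    print("cumulative dynamic regret:", regrets.sum())
```

A restart-style variant would, under the same assumptions, discard all stored observations every fixed number of rounds instead of dropping them one at a time. In either case the amount of forgetting (here `window`) trades off tracking a changing function against the statistical benefit of using more data.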
Taxonomy
Topics: Advanced Bandit Algorithms Research · Gaussian Processes and Bayesian Inference · Advanced Multi-Objective Optimization Algorithms
Methods: Gaussian Process
