An Optimization-based Algorithm for Non-stationary Kernel Bandits without Prior Knowledge

Kihyuk Hong; Yuhang Li; Ambuj Tewari

arXiv:2205.14775·stat.ML·February 21, 2023

TL;DR

This paper introduces an optimization-based algorithm for non-stationary kernel bandits that needs no prior knowledge of the degree of non-stationarity. It achieves a tighter dynamic regret bound than previous work and extends to feature mappings learned by a neural network.

Contribution

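The paper presents an algorithm that adapts to non-stationarity by restarting when a change in the reward function is detected, without prior knowledge of how much the reward function changes. It comes with an improved dynamic regret bound and a neural network extension analyzed with neural tangent kernel theory.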

Findings

01

A tighter dynamic regret bound than previous work on non-stationary kernel bandits.

02

Nearly minimax optimal for non-stationary linear bandits when run with a linear kernel.

03

Empirical adaptation to varying non-stationarity levels.
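
For readers unfamiliar with the term, dynamic regret is commonly defined as below. The notation is assumed for illustration and is not taken from the paper: f_t is the reward function at round t, x_t the chosen action, X the action set, and T the horizon. The variation budget P_T is one common way to quantify the degree of non-stationarity; the choice of the sup norm here is also an assumption.

```latex
% Dynamic regret: compare against the best action at every round,
% not against a single fixed action.
R_T \;=\; \sum_{t=1}^{T} \Bigl( \max_{x \in \mathcal{X}} f_t(x) \;-\; f_t(x_t) \Bigr)

% One common measure of non-stationarity (total variation of the rewards).
P_T \;=\; \sum_{t=1}^{T-1} \bigl\lVert f_{t+1} - f_t \bigr\rVert_{\infty}
```

An algorithm "without prior knowledge" in this sense is one that does not take P_T (or a bound on it) as an input.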

Abstract

We propose an algorithm for non-stationary kernel bandits that does not require prior knowledge of the degree of non-stationarity. The algorithm follows randomized strategies obtained by solving optimization problems that balance exploration and exploitation. It adapts to non-stationarity by restarting when a change in the reward function is detected. Our algorithm enjoys a tighter dynamic regret bound than previous work on the non-stationary kernel bandit setting. Moreover, when applied to the non-stationary linear bandit setting by using a linear kernel, our algorithm is nearly minimax optimal, solving an open problem in the non-stationary linear bandit literature. We extend our algorithm to use a neural network for dynamically adapting the feature mapping to observed data. We prove a dynamic regret bound of the extension using the neural tangent kernel theory. We demonstrate…
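
The restart idea in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: the paper obtains its randomized strategy by solving an optimization problem, whereas the sketch below swaps in a softmax over upper-confidence scores, and its change test is a simple residual threshold. All names (`rbf_kernel`, `krr_predict`, `change_detected`, `sample_arm`, `run`) and all parameter values are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, lengthscale=0.5):
    # Squared-exponential kernel between rows of A and rows of B.
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * lengthscale ** 2))

def krr_predict(X, y, X_query, lam=0.1):
    # Kernel ridge regression mean and a GP-style confidence width.
    K = rbf_kernel(X, X) + lam * np.eye(len(X))
    k_q = rbf_kernel(X_query, X)
    mean = k_q @ np.linalg.solve(K, y)
    var = 1.0 - np.einsum("ij,ji->i", k_q, np.linalg.solve(K, k_q.T))
    return mean, np.sqrt(np.maximum(var, 0.0))

def change_detected(X_old, y_old, X_new, y_new, threshold=0.5, lam=0.1):
    # Assumed test: flag a change when recent rewards deviate, on average,
    # from what a model fit on older data predicts.
    mean, _ = krr_predict(X_old, y_old, X_new, lam)
    return bool(np.abs(np.mean(y_new - mean)) > threshold)

def sample_arm(arms, X, y, rng, beta=1.0, temperature=0.1):
    # Randomized strategy (stand-in): softmax over upper-confidence scores.
    if len(X) == 0:
        return int(rng.integers(len(arms)))
    mean, width = krr_predict(np.asarray(X), np.asarray(y), arms)
    scores = mean + beta * width
    p = np.exp((scores - scores.max()) / temperature)
    p /= p.sum()
    return int(rng.choice(len(arms), p=p))

def run(T=200, change_at=100, batch=10, seed=0):
    # Toy environment: the reward function flips sign at round `change_at`.
    rng = np.random.default_rng(seed)
    arms = np.linspace(0.0, 1.0, 15)[:, None]
    X, y, restarts = [], [], []
    for t in range(T):
        sign = 1.0 if t < change_at else -1.0
        i = sample_arm(arms, X, y, rng)
        reward = sign * np.sin(3.0 * arms[i, 0]) + 0.1 * rng.standard_normal()
        X.append(arms[i])
        y.append(reward)
        if len(X) >= 2 * batch:
            Xa, ya = np.asarray(X), np.asarray(y)
            if change_detected(Xa[:-batch], ya[:-batch], Xa[-batch:], ya[-batch:]):
                restarts.append(t)  # discard history and start over
                X, y = [], []
    return restarts

if __name__ == "__main__":
    print("restart rounds:", run())
```

The point of the sketch is structural: the learner never receives the change point or a variation budget as input, and relies only on the detection test to decide when to discard old data. The threshold here is hand-picked and may trigger spurious restarts.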

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning