An Optimization-based Algorithm for Non-stationary Kernel Bandits without Prior Knowledge
Kihyuk Hong, Yuhang Li, Ambuj Tewari

TL;DR
This paper introduces a novel optimization-based algorithm for non-stationary kernel bandits that adapts without prior knowledge of non-stationarity, achieving tighter regret bounds and extending to neural network feature mappings.
Contribution
The paper presents a new algorithm that adapts to non-stationarity without prior knowledge of its degree, with a tighter dynamic regret bound than previous work and a neural network extension analyzed using neural tangent kernel theory.
Findings
Tighter dynamic regret bounds than previous methods.
Nearly minimax optimal in the non-stationary linear bandit setting (obtained by using a linear kernel), solving an open problem in that literature.
Empirical adaptation to varying non-stationarity levels.
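For readers unfamiliar with the term, dynamic regret is commonly defined as below. This is the standard textbook form rather than a quotation from the paper; the symbols $f_t$ (reward function at round $t$), $x_t$ (action played), $\mathcal{X}$ (action set), and $T$ (horizon) are notational assumptions made here.

```latex
% Dynamic regret: the learner is compared against the best action at EACH round,
% not against a single fixed action as in static regret.
R_T \;=\; \sum_{t=1}^{T} \Bigl( \max_{x \in \mathcal{X}} f_t(x) \;-\; f_t(x_t) \Bigr)
```

Because the comparator may change every round, sublinear dynamic regret is only achievable when the reward functions $f_t$ do not vary too much over time.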
Abstract
We propose an algorithm for non-stationary kernel bandits that does not require prior knowledge of the degree of non-stationarity. The algorithm follows randomized strategies obtained by solving optimization problems that balance exploration and exploitation. It adapts to non-stationarity by restarting when a change in the reward function is detected. Our algorithm enjoys a tighter dynamic regret bound than previous work on the non-stationary kernel bandit setting. Moreover, when applied to the non-stationary linear bandit setting by using a linear kernel, our algorithm is nearly minimax optimal, solving an open problem in the non-stationary linear bandit literature. We extend our algorithm to use a neural network for dynamically adapting the feature mapping to observed data. We prove a dynamic regret bound of the extension using the neural tangent kernel theory. We demonstrate…
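The restart-on-detected-change idea described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's algorithm: it swaps in a kernel-ridge UCB rule for the optimization-based randomized strategy, and a simple windowed prediction-error test for the paper's change detector. The class name `RestartingKernelUCB`, the function `run_demo`, the RBF kernel, and all parameter values (`lam`, `beta`, `window`, `threshold`) are hypothetical choices made for illustration.

```python
import numpy as np


def rbf_kernel(A, B, lengthscale=0.2):
    """Gaussian (RBF) kernel matrix between 1-D point sets A and B."""
    d2 = (A[:, None] - B[None, :]) ** 2
    return np.exp(-d2 / (2.0 * lengthscale ** 2))


class RestartingKernelUCB:
    """Kernel-ridge UCB on a finite arm set, restarted on large prediction errors.

    Illustrative stand-in only: neither the UCB rule nor the windowed-error
    test is the paper's strategy or its change-detection rule.
    """

    def __init__(self, arms, lam=0.1, beta=2.0, window=20, threshold=0.3):
        self.arms = np.asarray(arms, dtype=float)
        self.lam, self.beta = lam, beta
        self.window, self.threshold = window, threshold
        self._reset()

    def _reset(self):
        # Forget all data: played arm indices, rewards, and prediction errors.
        self.played, self.rewards, self.errors = [], [], []

    def _posterior(self):
        """Kernel ridge mean and uncertainty width for every arm."""
        if not self.played:
            return np.zeros(len(self.arms)), np.ones(len(self.arms))
        X = self.arms[self.played]
        K = rbf_kernel(X, X) + self.lam * np.eye(len(X))
        k_star = rbf_kernel(self.arms, X)  # shape (n_arms, n_obs)
        mean = k_star @ np.linalg.solve(K, np.array(self.rewards))
        var = 1.0 - np.sum(k_star * np.linalg.solve(K, k_star.T).T, axis=1)
        return mean, np.sqrt(np.clip(var, 0.0, None))

    def select(self):
        mean, std = self._posterior()
        return int(np.argmax(mean + self.beta * std))

    def update(self, arm, reward):
        """Record an observation; return True if a restart was triggered."""
        if len(self.rewards) >= self.window:
            # Track how far the reward is from the model's prediction,
            # once enough data has been collected since the last restart.
            mean, _ = self._posterior()
            self.errors.append(abs(reward - mean[arm]))
        self.played.append(arm)
        self.rewards.append(reward)
        recent = self.errors[-self.window:]
        if len(recent) == self.window and np.mean(recent) > self.threshold:
            self._reset()
            return True
        return False


def run_demo(horizon=300, change_point=150, noise=0.05, seed=0):
    """Simulate one abrupt change in the reward peak; return restart rounds."""
    rng = np.random.default_rng(seed)
    arms = np.linspace(0.0, 1.0, 21)
    agent = RestartingKernelUCB(arms)
    restart_times = []
    for t in range(horizon):
        peak = 0.2 if t < change_point else 0.8  # reward function shifts here
        arm = agent.select()
        mean_reward = np.exp(-(arms[arm] - peak) ** 2 / (2.0 * 0.2 ** 2))
        reward = mean_reward + noise * rng.normal()
        if agent.update(arm, reward):
            restart_times.append(t)
    return restart_times


if __name__ == "__main__":
    print("restart rounds:", run_demo())
```

The agent is never told when or how much the reward function changes; it only reacts to its own prediction errors, which is the sense in which a restart scheme needs no prior knowledge of the degree of non-stationarity. The fixed `threshold` here is a simplification, and a principled detector would calibrate it to the model's confidence width.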
Taxonomy
Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Adversarial Robustness in Machine Learning
