Pure Exploration in Kernel and Neural Bandits
Yinglun Zhu, Dongruo Zhou, Ruoxi Jiang, Quanquan Gu, Rebecca Willett, Robert Nowak

TL;DR
This paper introduces a novel approach for pure exploration in high-dimensional bandits by adaptively embedding features into lower-dimensional spaces, enabling effective learning in kernel and neural settings with theoretical guarantees.
Contribution
It proposes a new method for pure exploration in high-dimensional bandits that handles model misspecification and extends to kernel and neural representations.
Findings
The sample complexity depends on the effective dimension of the feature space.
The method outperforms existing approaches on synthetic and real datasets.
The approach handles reward functions in a possibly infinite-dimensional RKHS and reward functions approximated by neural networks.
Abstract
We study pure exploration in bandits, where the dimension of the feature representation can be much larger than the number of arms. To overcome the curse of dimensionality, we propose to adaptively embed the feature representation of each arm into a lower-dimensional space and carefully deal with the induced model misspecification. Our approach is conceptually very different from existing works that can either only handle low-dimensional linear bandits or passively deal with model misspecification. We showcase the application of our approach to two pure exploration settings that were previously under-studied: (1) the reward function belongs to a possibly infinite-dimensional Reproducing Kernel Hilbert Space, and (2) the reward function is nonlinear and can be approximated by neural networks. Our main results provide sample complexity guarantees that only depend on the effective…
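To make the embedding idea in the abstract concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it uses a fixed truncated SVD in place of the adaptive embedding, assumes a linear reward with a known parameter `theta` purely for illustration, and measures the induced misspecification for one natural choice of low-dimensional parameter. The function names `embed_arms` and `misspecification` are hypothetical.

```python
import numpy as np


def embed_arms(features, k):
    """Project arm features (n_arms x D) onto their top-k right singular directions.

    Illustrative assumption: a fixed truncated SVD stands in for the
    adaptive embedding described in the abstract.
    """
    _, _, vt = np.linalg.svd(features, full_matrices=False)
    basis = vt[:k]                    # (k, D) orthonormal rows
    embedded = features @ basis.T     # (n_arms, k) low-dimensional features
    return embedded, basis


def misspecification(features, theta, embedded, basis):
    """Largest per-arm gap between the true linear reward and the reward
    predicted from the embedded features, using theta projected onto the basis."""
    true_rewards = features @ theta
    theta_low = basis @ theta         # coordinates of theta in the embedded space
    approx_rewards = embedded @ theta_low
    return float(np.max(np.abs(true_rewards - approx_rewards)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_arms, dim = 5, 50               # feature dimension far exceeds the number of arms
    X = rng.normal(size=(n_arms, dim))
    theta = rng.normal(size=dim)
    for k in range(1, n_arms + 1):
        Z, B = embed_arms(X, k)
        print(k, misspecification(X, theta, Z, B))
```

With `k` equal to the number of arms, the embedding spans all arm features and the gap vanishes up to floating-point error. Smaller `k` trades a lower dimension for a nonzero misspecification, which is the tension the abstract says must be handled carefully.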
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Machine Learning and Data Classification
