High-dimensional Nonparametric Contextual Bandit Problem

Shogo Iwazaki; Junpei Komiyama; Masaaki Imaizumi

arXiv:2505.14102·stat.ML·May 21, 2025

Open Access

TL;DR

This paper studies high-dimensional kernelized contextual bandits, proposing assumptions and analyses that enable no-regret learning and lenient regret bounds despite large feature spaces.

Contribution

It introduces stochastic context assumptions and analyzes lenient regret, extending understanding of learning in high-dimensional kernelized bandit problems.

Findings

01

No-regret learning is achievable with growing dimensions under stochastic assumptions.

02

Derives lenient regret rates as a function of the allowed per-round regret.

03

Addresses limitations of Gaussian kernel methods in high-dimensional settings.
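To make the notion of lenient regret in finding 02 concrete, the sketch below compares ordinary cumulative regret with one common indicator-style formulation of lenient regret, in which a round contributes to the total only when its gap exceeds the allowed per-round regret. This is an illustrative assumption, not necessarily the exact definition used in the paper; the function names, the `tolerance` parameter, and the gap values are all hypothetical.

```python
from typing import Sequence


def cumulative_regret(gaps: Sequence[float]) -> float:
    """Standard regret: the sum of all per-round gaps.

    Each gap is (best achievable expected reward) - (expected reward of the
    arm actually pulled) for the context seen in that round.
    """
    return sum(gaps)


def lenient_regret(gaps: Sequence[float], tolerance: float) -> float:
    """Indicator-style lenient regret (an assumed formulation).

    Rounds whose gap is at most `tolerance` are treated as good enough and
    contribute nothing; rounds with a larger gap contribute their full gap.
    """
    if tolerance < 0:
        raise ValueError("tolerance must be non-negative")
    return sum(g for g in gaps if g > tolerance)


# Hypothetical per-round gaps from a short run of four rounds.
gaps = [0.5, 0.25, 0.125, 0.0]

standard = cumulative_regret(gaps)
lenient = lenient_regret(gaps, tolerance=0.25)
print(standard, lenient)
```

With a tolerance of zero the two quantities coincide; raising the tolerance forgives more near-optimal rounds, which is why a rate stated in terms of the allowed per-round regret can be more favorable than a standard regret bound.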

Abstract

We consider the kernelized contextual bandit problem with a large feature space. This problem involves $K$ arms, and the goal of the forecaster is to maximize the cumulative rewards through learning the relationship between the contexts and the rewards. It serves as a general framework for various decision-making scenarios, such as personalized online advertising and recommendation systems. Kernelized contextual bandits generalize the linear contextual bandit problem and offer greater modeling flexibility. Existing methods, when applied to Gaussian kernels, yield a trivial bound of $O(T)$ when we consider $\Omega(\log T)$ feature dimensions. To address this, we introduce stochastic assumptions on the context distribution and show that no-regret learning is achievable even when the number of dimensions grows up to the number of samples. Furthermore, we analyze lenient regret, which…

Taxonomy

Topics: Advanced Bandit Algorithms Research · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms