Open Problem: Tight Online Confidence Intervals for RKHS Elements
Sattar Vakili, Jonathan Scarlett, Tara Javidi

TL;DR
This paper addresses the challenge of developing tight online confidence intervals for elements of a reproducing kernel Hilbert space (RKHS), which are essential for improving regret bounds in kernel-based online learning algorithms.
Contribution
The paper formalizes the problem of online confidence intervals in RKHS and reviews existing results, highlighting the gap in tight bounds for online kernel methods.
Findings
Existing confidence bounds are not tight, leading to suboptimal regret bounds.
Existing regret bounds for several kernelized bandit algorithms (e.g., GP-UCB, GP-TS, and their variants) may fail to even be sublinear.
The paper clarifies the fundamental challenges in achieving tight online confidence intervals in RKHS.
Abstract
Confidence intervals are a crucial building block in the analysis of various online learning problems. The analysis of kernel-based bandit and reinforcement learning problems utilizes confidence intervals applicable to the elements of a reproducing kernel Hilbert space (RKHS). However, the existing confidence bounds do not appear to be tight, resulting in suboptimal regret bounds. In fact, the existing regret bounds for several kernelized bandit algorithms (e.g., GP-UCB, GP-TS, and their variants) may fail to even be sublinear. It is unclear whether the suboptimal regret bound is a fundamental shortcoming of these algorithms or an artifact of the proof, and the main challenge seems to stem from the online (sequential) nature of the observation points. We formalize the question of online confidence intervals in the RKHS setting and give an overview of the existing results.
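To make the object of study concrete, the sketch below computes a confidence interval of the usual form mu(x) ± beta · sigma(x) from kernel ridge regression on a fixed set of observations. It is only an illustration under stated assumptions: the squared-exponential kernel, the regularizer `lam`, the toy data, and in particular the width multiplier `beta` are placeholders chosen here, not values from the paper. How small `beta` can be made when the observation points are chosen sequentially is exactly the open question.

```python
import numpy as np

def rbf_kernel(a, b, lengthscale=0.5):
    """Squared-exponential kernel between 1-D input arrays a and b (assumed kernel)."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def kernel_confidence_interval(x_obs, y_obs, x_query, lam=0.1, beta=2.0):
    """Return (lower, upper, mu, sigma) with lower/upper = mu -/+ beta * sigma.

    lam and beta are illustrative placeholders, not values from the paper.
    """
    K = rbf_kernel(x_obs, x_obs)                 # Gram matrix, shape (t, t)
    k_q = rbf_kernel(x_obs, x_query)             # cross-kernel, shape (t, m)
    A = K + lam * np.eye(len(x_obs))             # regularized Gram matrix
    mu = k_q.T @ np.linalg.solve(A, y_obs)       # kernel ridge prediction
    # k(x, x) = 1 for this kernel, so the variance proxy is 1 - k^T A^{-1} k.
    var = 1.0 - np.sum(k_q * np.linalg.solve(A, k_q), axis=0)
    sigma = np.sqrt(np.clip(var, 0.0, None))     # clip guards against round-off
    return mu - beta * sigma, mu + beta * sigma, mu, sigma

# Toy usage: three observations, queries near and far from the data.
x_obs = np.array([0.1, 0.4, 0.9])
y_obs = np.sin(2 * np.pi * x_obs)
x_query = np.array([0.1, 0.5, 5.0])
lower, upper, mu, sigma = kernel_confidence_interval(x_obs, y_obs, x_query)
```

In this toy example the interval is narrow at a query point that coincides with an observation and widens to its prior width far from the data. The sketch treats the observation points as fixed in advance; it does not model the adaptive, sequential choice of points that makes the online problem hard.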
Taxonomy
Topics: Advanced Bandit Algorithms Research · Cognitive Radio Networks and Spectrum Sensing · Machine Learning and Algorithms
