Residual Bootstrap Exploration for Bandit Algorithms
Chi-Hua Wang, Yang Yu, Botao Hao, Guang Cheng

TL;DR
This paper introduces ReBoot, a residual bootstrap exploration method for multi-armed bandits that strengthens exploration through data-driven variance inflation. It achieves instance-dependent logarithmic regret in Gaussian bandits and performs better empirically when rewards are unbounded.
Contribution
The paper presents ReBoot, a perturbation-based exploration technique that captures the distributional properties of fitting errors and boosts exploration by inflating the variance level, backed by a logarithmic regret guarantee for Gaussian multi-armed bandits and by experiments on synthetic bandit problems.
Findings
ReBoot achieves logarithmic regret in Gaussian bandits.
ReBoot outperforms Giro and PHE in unbounded reward scenarios.
ReBoot maintains computational efficiency comparable to Thompson sampling.
Abstract
In this paper, we propose a novel perturbation-based exploration method for bandit algorithms with bounded or unbounded rewards, called residual bootstrap exploration (ReBoot). ReBoot enforces exploration by injecting data-driven randomness through a residual-based perturbation mechanism. This novel mechanism captures the underlying distributional properties of fitting errors and, more importantly, boosts exploration to escape from suboptimal solutions (for small sample sizes) by inflating the variance level in an unconventional way. In theory, with an appropriate variance inflation level, ReBoot provably secures instance-dependent logarithmic regret in Gaussian multi-armed bandits. We evaluate ReBoot on different synthetic multi-armed bandit problems and observe that ReBoot performs better for unbounded rewards and more robustly…
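The abstract does not spell out the perturbation rule, so the following is only a minimal sketch of how a residual-based perturbation with variance inflation might look in a Gaussian multi-armed bandit. The function names (reboot_index, run_bandit), the Gaussian bootstrap weights, the two symmetric pseudo-residuals scaled by sqrt(s), and the inflation parameter sigma_a are all assumptions made for illustration, not details confirmed by the text above.

```python
import numpy as np


def reboot_index(rewards, sigma_a, rng):
    """Perturbed mean estimate for one arm (hypothetical helper)."""
    rewards = np.asarray(rewards, dtype=float)
    s = rewards.size
    mean = rewards.mean()
    # Residuals of the fitted mean carry the shape of the fitting errors.
    residuals = rewards - mean
    # Assumed variance inflation: append two symmetric pseudo-residuals.
    pseudo = np.sqrt(s) * sigma_a * np.array([1.0, -1.0])
    augmented = np.concatenate([residuals, pseudo])
    # Assumed bootstrap weights: i.i.d. with zero mean and unit variance.
    weights = rng.standard_normal(augmented.size)
    return mean + weights @ augmented / augmented.size


def run_bandit(means, horizon, sigma_a=1.5, seed=0):
    """Play a Gaussian bandit, pulling the arm with the largest index."""
    rng = np.random.default_rng(seed)
    k = len(means)
    history = [[] for _ in range(k)]
    pulls = []
    for t in range(horizon):
        if t < k:
            arm = t  # pull every arm once to initialise its history
        else:
            indices = [reboot_index(h, sigma_a, rng) for h in history]
            arm = int(np.argmax(indices))
        # Unit-variance Gaussian rewards (an assumption of this sketch).
        reward = rng.normal(means[arm], 1.0)
        history[arm].append(reward)
        pulls.append(arm)
    return np.array(pulls)


if __name__ == "__main__":
    pulls = run_bandit(means=[0.0, 1.0], horizon=500)
    print("pull counts per arm:", np.bincount(pulls, minlength=2))
```

In this sketch, setting sigma_a to zero leaves only the observed residuals as a source of randomness, so an arm with a single pull would have a deterministic index. The pseudo-residuals keep the index random even for small sample sizes, which is the role the abstract attributes to variance inflation.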
Taxonomy
Topics: Advanced Bandit Algorithms Research · Reinforcement Learning in Robotics · Machine Learning and Algorithms
