Linear Bandit algorithms using the Bootstrap
Nandan Sudarsanam, Balaraman Ravindran

TL;DR
This paper introduces two bootstrap-based algorithms for linear stochastic bandit problems. The algorithms do not assume a specific noise distribution, and in simulation studies they perform better than or comparably to existing methods.
Contribution
The paper presents two novel bootstrap-based algorithms for linear bandits, X-Random and X-Fixed, which construct confidence bounds without any assumption about the noise distribution.
Findings
X-Random outperforms baselines in cumulative regret across various noise levels.
X-Fixed performs well with fewer trials, comparable to existing methods.
The proposed methods are effective in a simulation study built from published data of experiments run on real systems.
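For reference, cumulative regret over T rounds of a linear bandit is commonly defined as below. The notation is the standard textbook one and is an assumption here, since this summary does not give the paper's own symbols: \theta^* is the unknown parameter vector, \mathcal{X}_t the set of arms available at round t, and x_t the arm actually played.

```latex
R_T = \sum_{t=1}^{T} \left( \max_{x \in \mathcal{X}_t} \langle x, \theta^* \rangle - \langle x_t, \theta^* \rangle \right)
```

A lower R_T means the algorithm spent fewer rounds on suboptimal arms.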
Abstract
This study presents two new algorithms for solving linear stochastic bandit problems. The proposed methods use an approach from non-parametric statistics called bootstrapping to create confidence bounds. This is achieved without making any assumptions about the distribution of noise in the underlying system. We present the X-Random and X-Fixed bootstrap bandits, which correspond to the two well-known approaches in the literature for conducting bootstraps on models. The proposed methods are compared to other popular solutions for linear stochastic bandit problems, namely OFUL, LinUCB and Thompson Sampling. The comparisons are carried out using a simulation study on a hierarchical probability meta-model, built from published data of experiments run on real systems. The model representing the response surfaces is conceptualized as a Bayesian Network which is presented with…
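The two bootstrap schemes named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' algorithm, whose details are not given in this summary. It assumes that "X-Random" refers to resampling (x, y) pairs and "X-Fixed" to resampling residuals with the design held fixed, which are the two textbook ways to bootstrap a regression model. The helper names `ols` and `bootstrap_ucb`, the ridge term, and the parameters `n_boot` and `q` are hypothetical choices made for illustration.

```python
import numpy as np

def ols(X, y, ridge=1e-6):
    """Least-squares fit; the tiny ridge term (an assumption) keeps the
    solve stable when a resampled design is rank-deficient."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ y)

def bootstrap_ucb(X, y, arms, scheme="random", n_boot=200, q=0.95, rng=None):
    """Upper q-quantile of bootstrapped predicted rewards, one value per arm.

    scheme="random": resample (x, y) pairs with replacement.
    scheme="fixed":  keep X fixed and resample centered residuals.
    """
    if scheme not in ("random", "fixed"):
        raise ValueError("scheme must be 'random' or 'fixed'")
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    fitted = X @ ols(X, y)
    resid = y - fitted
    resid = resid - resid.mean()  # center residuals before resampling
    preds = np.empty((n_boot, arms.shape[0]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)  # indices drawn with replacement
        if scheme == "random":
            theta_b = ols(X[idx], y[idx])
        else:
            theta_b = ols(X, fitted + resid[idx])
        preds[b] = arms @ theta_b
    # No noise distribution is assumed: the bound is an empirical quantile.
    return np.quantile(preds, q, axis=0)

# Toy usage: 3-dimensional problem with uniform (non-Gaussian) noise.
rng = np.random.default_rng(0)
theta_true = np.array([1.0, -0.5, 0.25])
X = rng.normal(size=(40, 3))
y = X @ theta_true + rng.uniform(-0.5, 0.5, size=40)
arms = rng.normal(size=(5, 3))
for scheme in ("random", "fixed"):
    ucb = bootstrap_ucb(X, y, arms, scheme=scheme, rng=1)
    print(scheme, "-> pull arm", int(np.argmax(ucb)))
```

In a bandit loop, the arm with the largest bound would be played, its reward appended to `X` and `y`, and the bounds recomputed. Taking an empirical quantile rather than a closed-form interval is what lets this kind of approach avoid a distributional assumption on the noise.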
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Algorithms · Gaussian Processes and Bayesian Inference
