Bayesian Bandit Algorithms with Approximate Inference in Stochastic Linear Bandits
Ziyi Huang, Henry Lam, Haofeng Zhang

TL;DR
This paper provides a theoretical analysis of approximate Bayesian inference in stochastic linear bandits, showing that popular algorithms like LinTS and LinBUCB maintain their regret bounds under approximation, with improved rates under certain conditions.
Contribution
It offers the first regret bounds for stochastic linear bandits with bounded approximate inference errors and introduces a new framework for analyzing the impact of approximate Bayesian inference.
Findings
LinTS and LinBUCB preserve their regret rates with larger constants under approximation.
LinBUCB improves the regret rate from Õ(d^{3/2}√T) to Õ(d√T) under well-behaved distributions.
First theoretical regret bounds for approximate inference in stochastic linear bandits.
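To make the setting concrete, here is a minimal, standard-library-only sketch of LinTS in which the posterior sample is deliberately distorted. It is an illustration under stated assumptions, not the paper's framework or error model: the function names, the `scale_error` knob (a hypothetical stand-in for approximate-inference error), the standard normal prior, and the unit-variance Gaussian noise are all choices made here for the example.

```python
import math
import random

def cholesky(A):
    """Lower-triangular L with A = L L^T (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = [0.0] * len(b)
    for i in range(len(b)):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    return y

def solve_upper_from_lower(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def run_lints(arms, theta_star, T, scale_error=1.0, noise_sd=1.0, seed=0):
    """LinTS over a fixed arm set; returns cumulative pseudo-regret.

    With scale_error = 1.0 and noise_sd = 1.0 the sample comes from
    N(mu, V^{-1}), the exact posterior under a standard normal prior.
    Other scale_error values stretch the sampled deviation (assumed,
    illustrative model of inference error).
    """
    rng = random.Random(seed)
    d = len(theta_star)
    V = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    b = [0.0] * d
    best = max(dot(a, theta_star) for a in arms)
    regret = 0.0
    for _ in range(T):
        L = cholesky(V)
        mu = solve_upper_from_lower(L, solve_lower(L, b))  # mu = V^{-1} b
        z = [rng.gauss(0.0, 1.0) for _ in range(d)]
        dev = solve_upper_from_lower(L, z)                 # dev ~ N(0, V^{-1})
        theta_tilde = [m + scale_error * e for m, e in zip(mu, dev)]
        arm = max(arms, key=lambda a: dot(a, theta_tilde))
        reward = dot(arm, theta_star) + rng.gauss(0.0, noise_sd)
        for i in range(d):                                 # rank-one update
            b[i] += reward * arm[i]
            for j in range(d):
                V[i][j] += arm[i] * arm[j]
        regret += best - dot(arm, theta_star)
    return regret
```

Calling `run_lints(arms, theta_star, T)` with `scale_error=1.0` and again with, say, `scale_error=2.0` gives a rough, empirical feel for how a distorted posterior changes cumulative regret on a toy problem; it says nothing about the paper's bounds.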
Abstract
Bayesian bandit algorithms with approximate Bayesian inference have been widely used in real-world applications. Despite the superior practical performance, their theoretical justification is less investigated in the literature, especially for contextual bandit problems. To fill this gap, we propose a theoretical framework to analyze the impact of approximate inference in stochastic linear bandits and conduct frequentist regret analysis on two Bayesian bandit algorithms, Linear Thompson Sampling (LinTS) and the extension of Bayesian Upper Confidence Bound, namely Linear Bayesian Upper Confidence Bound (LinBUCB). We demonstrate that when applied in approximate inference settings, LinTS and LinBUCB can universally preserve their original rates of regret upper bound but with a sacrifice of larger constant terms. These results hold for general Bayesian inference approaches, assuming the…
Taxonomy
Topics: Advanced Bandit Algorithms Research · Distributed Sensor Networks and Detection Algorithms · Machine Learning and ELM
