Thompson Sampling in Switching Environments with Bayesian Online Change Point Detection
Joseph Mellor, Jonathan Shapiro

TL;DR
This paper introduces Change-Point Thompson Sampling (CTS), a Bayesian approach for non-stationary multi-armed bandit problems with switching environments, demonstrating superior performance on artificial and real-world data.
Contribution
It develops a family of algorithms that incorporate Bayesian change point detection into Thompson Sampling for non-stationary bandits, covering various switching scenarios and rates.
Findings
CTS outperforms other bandit algorithms on real-world data
Algorithms effectively detect and adapt to environment changes
Empirical results show robustness across artificial and real data
Abstract
Thompson Sampling has recently been shown to be optimal in the Bernoulli Multi-Armed Bandit setting [Kaufmann et al., 2012]. This bandit problem assumes stationary distributions for the rewards. It is often unrealistic to model the real world as a stationary distribution. In this paper we derive and evaluate algorithms using Thompson Sampling for a Switching Multi-Armed Bandit Problem. We propose a Thompson Sampling strategy equipped with a Bayesian change point mechanism to tackle this problem. We develop algorithms for a variety of cases with constant switching rate: when all arms change together at each switch (Global Switching), when switching occurs independently for each arm (Per-Arm Switching), when the switching rate is known, and when it must be inferred from data. This leads to a family of algorithms we collectively term Change-Point Thompson Sampling (CTS). We show empirical results of the…
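To make the idea in the abstract concrete, here is a minimal sketch of Thompson Sampling combined with a Bayesian run-length posterior, assuming Bernoulli rewards, Per-Arm Switching, and a known constant switching rate. It is an illustration under those assumptions, not the authors' implementation: the names `ChangePointArm`, `run_cts`, `switching_means`, and `max_hypotheses` are hypothetical, the pruning of low-weight run lengths is a simple stand-in chosen for brevity, and only the pulled arm's posterior is updated at each step. The Global Switching case and the case where the switching rate is inferred from data are not covered.

```python
import random


class ChangePointArm:
    """Run-length posterior for one Bernoulli arm (hypothetical helper)."""

    def __init__(self, switch_rate, max_hypotheses=50):
        self.switch_rate = switch_rate        # assumed known and constant
        self.max_hypotheses = max_hypotheses  # pruning cap (an assumption)
        # Each hypothesis is [weight, alpha, beta]: the posterior weight of
        # one run length and the Beta parameters learned during that run.
        self.hyps = [[1.0, 1.0, 1.0]]

    def sample_mean(self, rng):
        # Thompson step: sample a run length, then a mean from its Beta.
        weights = [h[0] for h in self.hyps]
        _, a, b = rng.choices(self.hyps, weights=weights, k=1)[0]
        return rng.betavariate(a, b)

    def update(self, reward):
        g = self.switch_rate
        grown, change_mass = [], 0.0
        for w, a, b in self.hyps:
            # Predictive probability of the reward under this run length.
            pred = a / (a + b) if reward == 1 else b / (a + b)
            joint = w * pred
            # No switch: the run grows and absorbs the observation.
            grown.append([joint * (1 - g), a + reward, b + 1 - reward])
            # Switch: mass flows to a fresh run that starts from the prior.
            change_mass += joint * g
        hyps = [[change_mass, 1.0, 1.0]] + grown
        # Keep only the heaviest hypotheses, then renormalise.
        hyps.sort(key=lambda h: h[0], reverse=True)
        hyps = hyps[: self.max_hypotheses]
        total = sum(h[0] for h in hyps)
        self.hyps = [[w / total, a, b] for w, a, b in hyps]


def run_cts(true_means_fn, n_arms, horizon, switch_rate, seed=0):
    """Play `horizon` rounds; `true_means_fn(t)` gives the arm means at t."""
    rng = random.Random(seed)
    arms = [ChangePointArm(switch_rate) for _ in range(n_arms)]
    total_reward = 0
    for t in range(horizon):
        samples = [arm.sample_mean(rng) for arm in arms]
        k = max(range(n_arms), key=lambda i: samples[i])
        reward = 1 if rng.random() < true_means_fn(t)[k] else 0
        arms[k].update(reward)  # only the pulled arm is updated here
        total_reward += reward
    return total_reward, arms


def switching_means(t):
    """Toy environment (hypothetical): the two arms swap means at t = 500."""
    return (0.9, 0.1) if t < 500 else (0.1, 0.9)
```

Calling `run_cts(switching_means, n_arms=2, horizon=1000, switch_rate=0.01)` returns the cumulative reward and the per-arm posteriors. Because every update moves some probability mass to a fresh run that starts from the prior, an arm whose recent rewards contradict its history can be re-explored quickly instead of being locked in by stale counts.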
Taxonomy
Topics: Advanced Bandit Algorithms Research · Optimization and Search Problems · Auction Theory and Applications
