Kolmogorov-Smirnov Test-Based Actively-Adaptive Thompson Sampling for Non-Stationary Bandits

Gourab Ghatak; Hardhik Mohanty; Aniq Ur Rahman

arXiv:2105.14586 · stat.ML · October 22, 2021

TL;DR

This paper introduces TS-KS, a Thompson Sampling algorithm that actively detects change points in non-stationary bandit problems using the Kolmogorov-Smirnov test, leading to improved adaptation and lower regret.

Contribution

The paper proposes a novel KS test-based Thompson Sampling method for non-stationary bandits that detects change points and resets parameters, outperforming existing algorithms.

Findings

1. TS-KS achieves sub-linear regret in non-stationary environments.

2. TS-KS outperforms static TS and other non-stationary bandit algorithms.

3. TS-KS performs comparably to state-of-the-art forecasting methods.

Abstract

We consider the non-stationary multi-armed bandit (MAB) framework and propose a Kolmogorov-Smirnov (KS) test-based Thompson Sampling (TS) algorithm, named TS-KS, that actively detects change points and resets the TS parameters once a change is detected. In particular, for the two-armed bandit case, we derive bounds on the number of samples of the reward distribution needed to detect a change once it occurs. Consequently, we show that the proposed algorithm has sub-linear regret. Contrary to existing works, our algorithm is able to detect a change in the underlying reward distribution even when the mean reward remains the same. Finally, to test the efficacy of the proposed algorithm, we employ it in two case studies: i) a task-offloading scenario in wireless edge computing, and ii) portfolio optimization. Our results show that the proposed TS-KS algorithm outperforms not only the…
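The detect-and-reset idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' exact procedure. Several choices here are assumptions made for illustration only: a Gaussian posterior with an N(0, 1) prior for Thompson Sampling, a fixed comparison window, the large-sample two-sample KS critical value, resetting all arms at once, and the hypothetical names `ts_with_ks_reset`, `changed`, and `demo_reward`. The demo environment keeps the mean of arm 0 fixed and only widens its spread, which is the kind of change a mean-based detector would miss.

```python
import bisect
import math
import random


def ks_statistic(xs, ys):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    xs, ys = sorted(xs), sorted(ys)
    d = 0.0
    for v in xs + ys:
        fx = bisect.bisect_right(xs, v) / len(xs)
        fy = bisect.bisect_right(ys, v) / len(ys)
        d = max(d, abs(fx - fy))
    return d


def ks_threshold(n, m, alpha=0.01):
    """Large-sample critical value of the two-sample KS test at level alpha."""
    c = math.sqrt(-0.5 * math.log(alpha / 2.0))
    return c * math.sqrt((n + m) / (n * m))


def changed(reference, recent, alpha=0.01):
    """True if the KS test rejects 'both samples share one distribution'."""
    return ks_statistic(reference, recent) > ks_threshold(
        len(reference), len(recent), alpha
    )


def ts_with_ks_reset(reward_fn, n_arms, horizon, window=50, alpha=0.01, seed=0):
    """Gaussian Thompson Sampling that forgets its history when the KS test fires.

    reward_fn(arm, t, rng) -> float is a hypothetical environment callback.
    Returns (pulls, resets): the arm chosen in each round, and the rounds
    in which a reset was triggered.
    """
    rng = random.Random(seed)
    history = [[] for _ in range(n_arms)]  # rewards per arm since the last reset
    pulls, resets = [], []
    for t in range(horizon):
        # Draw one plausible mean per arm from its posterior (assumed N(0, 1) prior,
        # unit-variance likelihood): mean = sum / (n + 1), variance = 1 / (n + 1).
        draws = []
        for rewards in history:
            n = len(rewards)
            draws.append(rng.gauss(sum(rewards) / (n + 1), 1.0 / math.sqrt(n + 1)))
        arm = max(range(n_arms), key=draws.__getitem__)
        history[arm].append(reward_fn(arm, t, rng))
        pulls.append(arm)

        # Every `window` pulls of this arm, compare its first window after the
        # last reset with its most recent window.
        seen = history[arm]
        if len(seen) >= 2 * window and len(seen) % window == 0:
            if changed(seen[:window], seen[-window:], alpha):
                history = [[] for _ in range(n_arms)]  # reset all arms
                resets.append(t)
    return pulls, resets


def demo_reward(arm, t, rng):
    # Hypothetical environment: arm 0 keeps mean 0.6, but its spread widens at t = 1000.
    if arm == 0:
        return rng.gauss(0.6, 0.05 if t < 1000 else 0.4)
    return rng.gauss(0.2, 0.05)


pulls, resets = ts_with_ks_reset(demo_reward, n_arms=2, horizon=2000)
print("reset rounds:", resets)
```

Because the KS statistic compares whole empirical distributions rather than sample means, the test can react to a change in variance or shape. Note that in this sketch an arm is only tested when it is pulled, so a change in a rarely played arm would be noticed slowly, and repeated testing raises the false-alarm rate above the nominal `alpha`.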

Taxonomy

Methods: Spatio-temporal stability analysis