Sequential Change Detection for Learning in Piecewise Stationary Bandit Environments
Yu-Han Huang, Venugopal V. Veeravalli

TL;DR
This paper develops and analyzes sequential change detection tests tailored to piecewise stationary bandit environments. The tests trade off detection delay against false alarms, are proven to be order optimal, and are validated through numerical experiments.
Contribution
It introduces two order-optimal change detection tests for bandit settings in which the pre- and post-change distributions are unknown, and shows that their latencies have a growth property useful in regret analysis.
Findings
The proposed tests are order optimal in the finite horizon setting.
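The growth of the tests' latencies with respect to the false alarm probability and the late detection probability satisfies a property that is desirable in regret analysis for piecewise stationary bandits.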
Numerical results confirm theoretical performance.
Abstract
A finite-horizon variant of the quickest change detection problem is investigated, which is motivated by a change detection problem that arises in piecewise stationary bandits. The goal is to minimize the latency, which is the smallest threshold such that the probability that the detection delay exceeds the threshold is below a desired low level, while controlling the false alarm probability to a desired low level. When the pre- and post-change distributions are unknown, two tests are proposed as candidate solutions. These tests are shown to attain order optimality in terms of the horizon. Furthermore, the growth in their latencies with respect to the false alarm probability and late detection probability satisfies a property that is desirable in regret analysis for piecewise stationary bandits. Numerical results are provided to validate the theoretical performance results.
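To make the latency definition in the abstract concrete, the following is a minimal sketch of how an empirical latency could be read off from simulated detection delays. It is not the paper's tests or its exact formulation. The function name `empirical_latency`, the parameter `delta` (the target late detection probability), and the convention of recording a missed detection as `math.inf` are all assumptions made for illustration.

```python
import math


def empirical_latency(delays, delta):
    """Return the smallest threshold d >= 0 such that the fraction of
    simulated runs with detection delay > d is at most delta.

    delays: detection delays from repeated simulated runs (hypothetical
            input); use math.inf for runs where no detection occurred.
    delta:  target late detection probability (assumed name).
    Returns math.inf if no finite threshold meets the target.
    """
    n = len(delays)
    if n == 0:
        raise ValueError("need at least one simulated delay")

    # The fraction of delays exceeding d only changes at observed delay
    # values, so it suffices to check 0 and each distinct finite delay.
    finite = sorted({d for d in delays if d != math.inf})
    for d in [0] + finite:
        late_fraction = sum(1 for x in delays if x > d) / n
        if late_fraction <= delta:
            return d
    return math.inf


# Five simulated runs, one of which never detects the change.
example_delays = [1, 2, 3, 4, math.inf]
latency_loose = empirical_latency(example_delays, 0.2)   # one late run in five is tolerated
latency_strict = empirical_latency(example_delays, 0.0)  # no late runs are tolerated
```

In this sketch, a stricter `delta` can only increase the returned threshold, which mirrors the trade-off the abstract describes between latency and the late detection probability.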
Taxonomy
Topics: Data Stream Mining Techniques · Advanced Bandit Algorithms Research · Online Learning and Analytics
