Fast Change Identification in Multi-Play Bandits and its Applications in Wireless Networks
Gourab Ghatak

TL;DR
This paper introduces TS-GE, a multi-armed bandit algorithm for non-stationary environments that efficiently detects changes in reward distributions, with applications in wireless networks such as mobile edge computing (MEC) and the industrial Internet of Things (IIoT).
Contribution
The paper proposes TS-GE, a new algorithm combining Thompson sampling and group exploration for change detection in multi-armed bandits, scalable to many arms.
Findings
TS-GE outperforms state-of-the-art algorithms in certain regimes.
The algorithm detects changes in the reward distributions with a low false-alarm probability.
Its use is demonstrated in wireless network applications.
Abstract
Next-generation wireless services are characterized by a diverse set of requirements, to sustain which, the wireless access points need to probe the users in the network periodically. In this regard, we study a novel multi-armed bandit (MAB) setting that mandates probing all the arms periodically while keeping track of the best current arm in a non-stationary environment. In particular, we develop \texttt{TS-GE} that balances the regret guarantees of classical Thompson sampling (TS) with the broadcast probing (BP) of all the arms simultaneously in order to actively detect a change in the reward distributions. The main innovation in the algorithm is in identifying the changed arm by an optional subroutine called group exploration (GE) that scales as $\log_2(K)$ for a $K$-armed bandit setting. We characterize the probability of missed detection and the probability of false-alarm in terms…
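The abstract does not spell out how group exploration locates the changed arm, so the following is only a minimal sketch of one way a group-based search can find a single changed arm with a logarithmic number of group probes. It assumes that exactly one arm has changed and that each group probe returns a noiseless yes/no answer. The function `identify_changed_arm` and the oracle `group_changed` are hypothetical names introduced for illustration and are not taken from the paper.

```python
def identify_changed_arm(num_arms, group_changed):
    """Locate the single changed arm using binary-coded group probes.

    Assumptions (not from the paper): exactly one arm changed, and
    group_changed(group) is a noiseless oracle that returns True when
    the changed arm belongs to `group`.
    """
    # Number of bits needed to index arms 0 .. num_arms - 1.
    num_probes = (num_arms - 1).bit_length()
    changed = 0
    for bit in range(num_probes):
        # Probe every arm whose index has this bit set.
        group = [arm for arm in range(num_arms) if (arm >> bit) & 1]
        if group_changed(group):
            changed |= 1 << bit
    return changed, num_probes


# Usage: 8 arms, arm 5 has changed (hypothetical scenario).
arm, probes = identify_changed_arm(8, lambda group: 5 in group)
```

In a real system each probe would be a statistical test on noisy rewards rather than an exact oracle, which is where the missed-detection and false-alarm probabilities discussed in the abstract would enter.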
Taxonomy
Topics: Advanced Bandit Algorithms Research · Distributed Sensor Networks and Detection Algorithms · Cognitive Radio Networks and Spectrum Sensing
