Flickering Multi-Armed Bandits
Sourav Chakraborty, Amit Kiran Rege, Claire Monteleoni, Lijun Chen

TL;DR
This paper introduces Flickering Multi-Armed Bandits (FMAB), modeling decision-making with changing action availability constrained by local neighborhoods, and provides algorithms with regret bounds for such environments.
Contribution
It formalizes FMAB with stochastic graph models, proposes a lazy random walk exploration algorithm, and establishes regret bounds and lower bounds for learning costs under local constraints.
Findings
Proposed a two-phase lazy random walk algorithm for FMAB.
Established high-probability sublinear regret bounds.
Proved near-optimality via information-theoretic lower bounds.
Abstract
We introduce Flickering Multi-Armed Bandits (FMAB) to model sequential decision-making in environments with changing action availability, where accessibility of the next action is restricted to a subset dependent on the agent's current choice. We formalize these constraints through stochastically evolving graphs where actions are limited to local neighborhoods. This mobility-constrained structure imposes a dual challenge: the statistical requirement of information acquisition and the physical overhead of navigation. We analyze FMAB under i.i.d. Erdős–Rényi and Edge-Markovian process, proposing a two-phase lazy random walk algorithm for robust exploration. We establish high-probability sublinear regret bounds and prove near-optimality via a matching information-theoretic lower bound. Our results characterize the intrinsic cost of learning under local-move constraints, complemented…
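To make the setting concrete, the following is a minimal sketch of a two-phase lazy random walk under the i.i.d. Erdős–Rényi model. It is an illustration, not the authors' algorithm: the abstract does not specify the phases, so the Bernoulli rewards, the fixed exploration budget, the laziness value of 0.5, the rule that staying in place is always allowed, and the names `sample_neighbors`, `lazy_step`, and `two_phase_walk` are all assumptions made here.

```python
import random


def sample_neighbors(node, n, p, rng):
    """Neighbors of `node` in a fresh G(n, p) draw.

    Only the edges incident to the current node matter for the next move,
    so only those are sampled (assumed i.i.d. across rounds).
    """
    return [v for v in range(n) if v != node and rng.random() < p]


def lazy_step(node, n, p, rng, laziness=0.5):
    """Stay put with probability `laziness`, else move to a uniform neighbor."""
    if rng.random() < laziness:
        return node
    nbrs = sample_neighbors(node, n, p, rng)
    return rng.choice(nbrs) if nbrs else node  # isolated this round: stay


def two_phase_walk(means, p, explore_rounds, total_rounds, seed=0):
    """Hypothetical two-phase scheme: explore by lazy walk, then commit."""
    rng = random.Random(seed)
    n = len(means)
    counts, sums = [0] * n, [0.0] * n
    node, total_reward = 0, 0.0

    def pull(arm):
        # Assumed reward model: Bernoulli with mean `means[arm]`.
        return 1.0 if rng.random() < means[arm] else 0.0

    # Phase 1: lazy random walk, pulling whichever arm the agent stands on.
    for _ in range(explore_rounds):
        node = lazy_step(node, n, p, rng)
        r = pull(node)
        counts[node] += 1
        sums[node] += r
        total_reward += r

    # Choose the arm with the best empirical mean among visited arms.
    visited = [a for a in range(n) if counts[a] > 0]
    target = max(visited, key=lambda a: sums[a] / counts[a])

    # Phase 2: wait for an edge to the target, jump when it appears, then stay.
    for _ in range(total_rounds - explore_rounds):
        if node != target and target in sample_neighbors(node, n, p, rng):
            node = target
        total_reward += pull(node)

    return target, node, total_reward, counts


if __name__ == "__main__":
    target, final_node, reward, counts = two_phase_walk(
        means=[0.1, 0.2, 0.9, 0.3], p=0.5, explore_rounds=2000, total_rounds=3000
    )
    print(target, final_node, reward, counts)
```

The sketch shows both costs named in the abstract: rounds spent in phase 1 pay for information, while rounds spent in phase 2 waiting for an edge to the chosen arm pay for navigation. A denser graph (larger `p`) shortens that wait.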
