Fooling Algorithms in Non-Stationary Bandits using Belief Inertia
Gal Mendelson, Eyal Tadmor

TL;DR
This paper investigates how belief inertia in algorithms causes linear regret in non-stationary multi-armed bandits, revealing fundamental limits and vulnerabilities of classical strategies under adversarial changes.
Contribution
The paper introduces a belief-inertia argument for analyzing worst-case regret in non-stationary bandits and shows how this inertia can be exploited to construct adversarial instances.
Findings
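Classical algorithms such as Explore Then Commit, epsilon-greedy, and UCB can be driven to regret that grows linearly with T on adversarially constructed non-stationary instances, because of belief inertia.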
Even algorithms with periodic restarts face linear regret in worst-case scenarios.
Belief inertia can be used to establish sharp lower bounds in non-stationary bandit problems.
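The inertia effect behind these findings can be illustrated with a short calculation. This is a simplified sketch and not the paper's construction: it assumes deterministic rewards, a single change, and the symbols n, m, \mu_0, \mu_1, and c, none of which come from the source. Suppose an arm has been pulled n times at reward \mu_0 before the change and pays \mu_1 afterwards, while a competing arm is valued at c with \mu_1 < c < \mu_0.

```latex
% Running average of the changed arm after m post-change pulls
\hat{\mu}(m) = \frac{n\,\mu_0 + m\,\mu_1}{n + m}

% The average falls below the competitor's value c only when
\hat{\mu}(m) < c
\;\Longleftrightarrow\;
n(\mu_0 - c) < m(c - \mu_1)
\;\Longleftrightarrow\;
m > n\,\frac{\mu_0 - c}{c - \mu_1}

% If the competitor's true mean is c, each of those m pulls costs c - \mu_1, so
\text{regret during the lag} \;\ge\; m\,(c - \mu_1) \;\approx\; n\,(\mu_0 - c)
```

Under these assumptions the number of misleading pulls scales with n, so a change placed after a constant fraction of the horizon T yields regret proportional to T for an algorithm that ranks arms by lifetime averages.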
Abstract
We study the problem of worst-case regret in piecewise-stationary multi-armed bandits. While the minimax theory for stationary bandits is well established, understanding analogous limits in time-varying settings is challenging. Existing lower bounds rely on what we refer to as infrequent sampling arguments, where long intervals without exploration allow adversarial reward changes that induce large regret. In this paper, we introduce a fundamentally different approach based on a belief inertia argument. Our analysis captures how an algorithm's empirical beliefs, encoded through historical reward averages, create momentum that resists new evidence after a change. We show how this inertia can be exploited to construct adversarial instances that mislead classical algorithms such as Explore Then Commit, epsilon-greedy, and UCB, causing them to suffer regret that grows linearly with T and…
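The momentum described in the abstract can be seen in a small simulation. The sketch below is not the paper's adversarial instance: it assumes a hypothetical two-armed problem with deterministic rewards (0.9 dropping to 0.1 on arm 0, a constant 0.5 on arm 1), a single change point, and a plain greedy rule on lifetime averages. The function name `simulate_greedy` and all parameter values are illustrative choices.

```python
def simulate_greedy(horizon, change_point, warmup=1):
    """Greedy two-armed agent that ranks arms by lifetime reward averages.

    Assumed (hypothetical) environment: arm 0 pays 0.9 before `change_point`
    and 0.1 from then on; arm 1 always pays 0.5. Rewards are deterministic
    so that the effect of the stale average is easy to follow.
    Returns the cumulative regret against the best arm at each step.
    """
    sums = [0.0, 0.0]
    counts = [0, 0]
    regret = 0.0
    for t in range(horizon):
        means = (0.9 if t < change_point else 0.1, 0.5)
        if t < 2 * warmup:
            arm = t % 2  # pull each arm `warmup` times before going greedy
        else:
            averages = [sums[a] / counts[a] for a in range(2)]
            arm = 0 if averages[0] >= averages[1] else 1
        reward = means[arm]
        sums[arm] += reward
        counts[arm] += 1
        regret += max(means) - reward
    return regret


if __name__ == "__main__":
    # Place the change at the midpoint and watch regret / T as T grows.
    for T in (1_000, 2_000, 4_000):
        print(T, simulate_greedy(T, T // 2) / T)
```

With the change at the midpoint, the pre-change pulls of arm 0 keep its average above 0.5 for essentially the whole second half, so the printed ratio regret / T should stay near a constant rather than shrink as T grows. Stochastic rewards, exploration bonuses, or restarts would change the details, which is where the paper's analysis takes over.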
Taxonomy
Topics: Advanced Bandit Algorithms Research · Game Theory and Applications · Adversarial Robustness in Machine Learning
