Is Prior-Free Black-Box Non-Stationary Reinforcement Learning Feasible?
Argyrios Gerogiannis, Yu-Han Huang, Venugopal V. Veeravalli

TL;DR
This paper investigates the feasibility of prior-free black-box algorithms in non-stationary reinforcement learning, revealing limitations of the MASTER algorithm and proposing more robust change detection methods that outperform it.
Contribution
The paper provides theoretical analysis showing MASTER's limitations and introduces alternative change detection methods that improve performance in non-stationary RL scenarios.
Findings
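MASTER's non-stationarity detection mechanism is not triggered for practical choices of horizon, so its performance resembles that of a random restarting algorithm.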
Quickest change detection methods outperform MASTER in experiments.
Prior knowledge-based random restarting is an effective baseline.
Abstract
We study the problem of Non-Stationary Reinforcement Learning (NS-RL) without prior knowledge about the system's non-stationarity. A state-of-the-art, black-box algorithm, known as MASTER, is considered, with a focus on identifying the conditions under which it can achieve its stated goals. Specifically, we prove that MASTER's non-stationarity detection mechanism is not triggered for practical choices of horizon, leading to performance akin to a random restarting algorithm. Moreover, we show that the regret bound for MASTER, while being order optimal, stays above the worst-case linear regret until unreasonably large values of the horizon. To validate these observations, MASTER is tested for the special case of piecewise stationary multi-armed bandits, along with methods that employ random restarting, and others that use quickest change detection to restart. A simple, order optimal…
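The restart-on-detection idea mentioned in the abstract can be illustrated with a small simulation. The sketch below is not MASTER and not the paper's method: it pairs standard UCB1 with a toy detector that compares the means of the two halves of a sliding window, standing in for a proper quickest change detection test. All names (`MeanShiftDetector`, `run_ucb`), the window size, the threshold, and the arm means are assumptions chosen for illustration, and no forced exploration is included.

```python
import math
import random
from collections import deque


class MeanShiftDetector:
    """Toy detector (an assumption, not a real QCD test): flags a change
    when the means of the two halves of a sliding window differ a lot."""

    def __init__(self, window=100, threshold=0.4):
        self.window = window
        self.threshold = threshold
        self.samples = deque(maxlen=window)

    def update(self, x):
        """Add one sample; return True if a mean shift is flagged."""
        self.samples.append(x)
        if len(self.samples) < self.window:
            return False  # not enough data yet
        data = list(self.samples)
        half = self.window // 2
        old_mean = sum(data[:half]) / half
        new_mean = sum(data[half:]) / (self.window - half)
        return abs(new_mean - old_mean) > self.threshold


def run_ucb(means_by_segment, segment_len, restart_on_detection, seed=0):
    """Run UCB1 on a piecewise-stationary Bernoulli bandit.

    means_by_segment: one list of arm means per stationary segment.
    Returns (dynamic_regret, number_of_restarts).
    """
    rng = random.Random(seed)
    k = len(means_by_segment[0])
    horizon = segment_len * len(means_by_segment)

    def fresh_state():
        # Pull counts, reward sums, per-arm detectors, local clock.
        return [0] * k, [0.0] * k, [MeanShiftDetector() for _ in range(k)], 0

    counts, sums, detectors, local_t = fresh_state()
    regret, restarts = 0.0, 0
    for t in range(horizon):
        means = means_by_segment[t // segment_len]
        local_t += 1
        untried = [a for a in range(k) if counts[a] == 0]
        if untried:
            arm = untried[0]  # pull each arm once after every (re)start
        else:
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2 * math.log(local_t) / counts[a]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += max(means) - means[arm]  # regret against the current best arm
        if restart_on_detection and detectors[arm].update(reward):
            counts, sums, detectors, local_t = fresh_state()
            restarts += 1
    return regret, restarts


if __name__ == "__main__":
    # Arm 0 is best in the first segment and worst in the second.
    segments = [[0.9, 0.5], [0.1, 0.5]]
    for flag in (False, True):
        regret, restarts = run_ucb(segments, 1000, restart_on_detection=flag)
        print(f"restart_on_detection={flag}: regret={regret:.1f}, restarts={restarts}")
```

In this setup the previously best arm degrades while the other arm stays the same, which is the case where plain UCB1 is slow to adapt because its long reward history dominates the empirical mean. A detector watching only the pulled arm can miss changes in rarely pulled arms, so practical schemes typically add some forced exploration.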
Taxonomy
Topics: Reinforcement Learning in Robotics · EEG and Brain-Computer Interfaces
