Safe Continual Reinforcement Learning in Non-stationary Environments
Austin Coursey, Abel Diaz-Gonzalez, Marcos Quinones-Grueiro, Gautam Biswas

TL;DR
This paper investigates safe continual reinforcement learning in non-stationary environments, highlighting the challenges of balancing safety and adaptation, and evaluating existing methods through new benchmarks.
Contribution
It introduces three benchmark environments for safe continual RL and analyzes the trade-offs between safety and catastrophic forgetting.
Findings
Existing methods struggle to maintain safety while adapting to changing dynamics.
Regularization strategies can partially mitigate safety-forgetting trade-offs.
A fundamental tension exists between safety constraints and continual adaptation in non-stationary settings.
Abstract
Reinforcement learning (RL) offers a compelling data-driven paradigm for synthesizing controllers for complex systems when accurate physical models are unavailable; however, most existing control-oriented RL methods assume stationarity and, therefore, struggle in real-world non-stationary deployments where system dynamics and operating conditions can change unexpectedly. Moreover, RL controllers acting in physical environments must satisfy safety constraints throughout their learning and execution phases, rendering transient violations during adaptation unacceptable. Although continual RL and safe RL have each addressed non-stationarity and safety, respectively, their intersection remains comparatively unexplored, motivating the study of safe continual RL algorithms that can adapt over the system's lifetime while preserving safety. In this work, we systematically investigate safe…
