Safe Continual Reinforcement Learning Methods for Nonstationary Environments. Towards a Survey of the State of the Art

Timofey Tomashevskiy

arXiv:2601.05152 · cs.LG · January 9, 2026

Open Access

TL;DR

This paper surveys the current state of safe continual reinforcement learning in nonstationary environments, discussing theoretical challenges, safety mechanisms, and future directions for reliable online algorithms.

Contribution

It provides a comprehensive taxonomy and categorization of safety constraints and mechanisms in continual online safe reinforcement learning methods.

Findings

01. Categorizes safety constraints for online RL algorithms.

02. Discusses theoretical challenges and open questions.

03. Outlines prospects for reliable safe online learning algorithms.

Abstract

This work surveys the state of the art in continual online safe reinforcement learning (COSRL) methods. We discuss theoretical aspects, challenges, and open questions in building COSRL algorithms. We provide a taxonomy of COSRL methods, with details, organized by the type of safe learning mechanism that takes adaptation to nonstationarity into account. We categorize formulations of safety constraints for online reinforcement learning algorithms and, finally, discuss prospects for creating reliable, safe online learning algorithms.

Keywords: safe RL in nonstationary environments, safe continual reinforcement learning under nonstationarity, HM-MDP, NSMDP, POMDP, safe POMDP, constraints for continual learning, safe continual reinforcement learning review, safe continual reinforcement learning survey, safe…

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Adaptive Dynamic Programming Control