Safe Learning Under Irreversible Dynamics via Asking for Help

Benjamin Plaut; Juan Liévano-Karim; Hanlin Zhu; Stuart Russell

arXiv:2502.14043·cs.LG·September 17, 2025

TL;DR

This paper introduces a learning algorithm that asks a mentor for help and transfers knowledge between similar states, achieving sublinear regret in Markov Decision Processes (MDPs), including those with irreversible dynamics, so that learning remains safe in high-stakes environments.

Contribution

It presents what may be the first formal proof that an agent can learn both safely and effectively in irreversible environments by asking a mentor for help and transferring knowledge between similar states, with regret and number of mentor queries both sublinear in the time horizon.

Findings

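1. Regret and the number of mentor queries are both sublinear in the time horizon.

2. The guarantee holds for any Markov Decision Process (MDP), including MDPs with irreversible dynamics, under standard online learning assumptions.

3. The proof proceeds through a sequence of three reductions that may be of independent interest.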
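
To make "sublinear in the time horizon" concrete, here is one common reading written out. The symbols R_T (cumulative regret), Q_T (number of mentor queries), C, and α are notation assumed for this illustration; the listing does not give the paper's exact definitions or rates.

```latex
% Assumed notation: R_T = cumulative regret, Q_T = number of mentor queries,
% T = time horizon. Sublinear growth means the per-step average vanishes:
\[
  \lim_{T \to \infty} \frac{R_T}{T} = 0
  \qquad \text{and} \qquad
  \lim_{T \to \infty} \frac{Q_T}{T} = 0 .
\]
% Illustrative example (not a rate from the paper): if R_T \le C\,T^{\alpha}
% for constants C > 0 and 0 \le \alpha < 1, then
\[
  \frac{R_T}{T} \le C\,T^{\alpha - 1} \to 0 \quad \text{as } T \to \infty .
\]
```

Read this way, vanishing average regret means the agent's per-step performance approaches the benchmark, and a vanishing query rate means the agent relies on the mentor less and less often over time.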

Abstract

Most learning algorithms with formal regret guarantees essentially rely on trying all possible behaviors, which is problematic when some errors cannot be recovered from. Instead, we allow the learning agent to ask for help from a mentor and to transfer knowledge between similar states. We show that this combination enables the agent to learn both safely and effectively. Under standard online learning assumptions, we provide an algorithm whose regret and number of mentor queries are both sublinear in the time horizon for any Markov Decision Process (MDP), including MDPs with irreversible dynamics. Our proof involves a sequence of three reductions which may be of independent interest. Conceptually, our result may be the first formal proof that it is possible for an agent to obtain high reward while becoming self-sufficient in an unknown, unbounded, and high-stakes environment without…
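
The two ingredients named in the abstract, asking a mentor for help and transferring knowledge between similar states, can be illustrated with a toy sketch. This is not the paper's algorithm: the one-dimensional states drawn independently at random, the fixed `radius` threshold, and the names `nearest`, `run`, and `mentor_policy` are all hypothetical choices made for this example. The agent queries the mentor only when no previously queried state is within `radius`, and otherwise copies the mentor's action from the nearest queried state.

```python
import math
import random


def nearest(state, memory):
    """Return (distance, action) for the stored state closest to `state`.

    Returns (inf, None) when nothing has been stored yet.
    """
    best = (math.inf, None)
    for stored_state, stored_action in memory:
        distance = abs(stored_state - state)
        if distance < best[0]:
            best = (distance, stored_action)
    return best


def run(horizon, radius, mentor_policy, seed=0):
    """Toy loop: ask the mentor in unfamiliar states, otherwise reuse its advice.

    Assumptions for illustration only: states are i.i.d. draws from [0, 1)
    rather than the output of MDP dynamics, and `radius` is a fixed threshold.
    """
    rng = random.Random(seed)
    memory = []   # (state, mentor_action) pairs collected from queries
    queries = 0
    history = []  # (state, action_taken) for every step
    for _ in range(horizon):
        state = rng.random()
        distance, action = nearest(state, memory)
        if distance > radius:
            # No similar state seen before: ask the mentor instead of guessing.
            action = mentor_policy(state)
            memory.append((state, action))
            queries += 1
        # Otherwise `action` was transferred from the nearest queried state.
        history.append((state, action))
    return queries, history


if __name__ == "__main__":
    mentor = lambda s: 0 if s < 0.5 else 1  # hypothetical mentor policy
    for horizon in (100, 1000, 10000):
        queries, _ = run(horizon, radius=0.05, mentor_policy=mentor)
        print(horizon, queries)
```

In this toy setting every queried state lies more than `radius` from all earlier queried states, so the number of queries is bounded by roughly `1 / radius` no matter how long the run is, which is one simple way a query count can grow sublinearly in the horizon. The sketch does not model irreversible dynamics or regret.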

Taxonomy

Topics: Occupational Health and Safety Research · Risk and Safety Analysis