Towards Safe Continuing Task Reinforcement Learning

Miguel Calvo-Fullana; Luiz F. O. Chamon; Santiago Paternain

arXiv:2102.12585 · cs.LG · February 26, 2021

TL;DR

This paper introduces a reinforcement learning algorithm for continuing tasks that learns safe control policies without requiring system restarts, enabling safe exploration on physical systems.

Contribution

It proposes a novel algorithm that allows safe policy learning in continuing tasks without system re-initialization, addressing a key challenge in safe reinforcement learning.

Findings

01

The algorithm successfully learns safe policies in numerical simulations.

02

It demonstrates effective safe exploration without system restarts.

03

The approach maintains safety constraints during the learning process.

Abstract

Safety is a critical feature of controller design for physical systems. When designing control policies, several approaches to guarantee this aspect of autonomy have been proposed, such as robust controllers or control barrier functions. However, these solutions strongly rely on the model of the system being available to the designer. As a parallel development, reinforcement learning provides model-agnostic control solutions but in general, it lacks the theoretical guarantees required for safety. Recent advances show that under mild conditions, control policies can be learned via reinforcement learning, which can be guaranteed to be safe by imposing these requirements as constraints of an optimization problem. However, to transfer from learning safety to learning safely, there are two hurdles that need to be overcome: (i) it has to be possible to learn the policy without having to…
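To illustrate what "imposing these requirements as constraints of an optimization problem" can look like, the sketch below writes out one common constrained reinforcement learning formulation. It is an assumption for illustration and not necessarily the paper's exact problem. The notation is hypothetical and does not come from the abstract: $\pi_\theta$ is a parameterized policy, $r_0$ the task reward, $r_i$ the safety-related rewards, $c_i$ the required levels, $\gamma$ a discount factor, and $\lambda_i$ the Lagrange multipliers.

```latex
% Assumed notation: value of policy \pi_\theta under reward r_i
V_i(\theta) = \mathbb{E}_{\pi_\theta}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r_i(s_t, a_t) \right],
\qquad i = 0, 1, \dots, m

% Safety requirements imposed as constraints on the learning problem
\max_{\theta} \; V_0(\theta)
\quad \text{subject to} \quad V_i(\theta) \ge c_i, \qquad i = 1, \dots, m

% Associated Lagrangian, with multipliers \lambda_i \ge 0
\mathcal{L}(\theta, \lambda) = V_0(\theta) + \sum_{i=1}^{m} \lambda_i \bigl( V_i(\theta) - c_i \bigr)
```

In formulations of this kind, a policy is considered safe when every constraint $V_i(\theta) \ge c_i$ holds. The multipliers $\lambda_i$ weight each safety requirement against the task reward.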
