Cautious Reinforcement Learning with Logical Constraints

Mohammadhosein Hasanbeig; Alessandro Abate; Daniel Kroening

arXiv:2002.12156 · cs.LG · March 24, 2020 · 19 cites

Open Access

TL;DR

This paper introduces an adaptive safe padding approach in reinforcement learning that ensures safety during learning while optimizing control policies to satisfy temporal logic goals, balancing exploration and safety with theoretical guarantees.

Contribution

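The paper proposes an adaptive safe padding method that keeps a reinforcement learning agent safe during learning while it synthesises policies that satisfy a temporal logic goal with maximal probability, with theoretical guarantees on policy optimality and on convergence of the learning algorithm.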

Findings

01

The method effectively balances exploration and safety during learning.

02

Theoretical guarantees on policy optimality and convergence are established.

03

Experimental results showcase the method's performance in terms of safety during learning and goal satisfaction.

Abstract

This paper presents the concept of an adaptive safe padding that forces Reinforcement Learning (RL) to synthesise optimal control policies while ensuring safety during the learning process. Policies are synthesised to satisfy a goal, expressed as a temporal logic formula, with maximal probability. Forcing the RL agent to stay safe during learning might limit exploration; however, we show that the proposed architecture is able to automatically handle the trade-off between efficient progress in exploration (towards goal satisfaction) and ensuring safety. Theoretical guarantees are available on the optimality of the synthesised policies and on the convergence of the learning algorithm. Experimental results are provided to showcase the performance of the proposed method.

Taxonomy

Topics: Reinforcement Learning in Robotics · Formal Methods in Verification · AI-based Problem Solving and Planning