Joint Learning of Policy with Unknown Temporal Constraints for Safe Reinforcement Learning

Lunet Yifru; Ali Baheri

arXiv:2305.00576 · eess.SY · May 2, 2023 · 1 citation

Open Access

TL;DR

This paper introduces a framework that simultaneously learns safety constraints and optimal policies in reinforcement learning environments where the safety constraints are unknown, backed by theoretical guarantees on convergence and on the error of the learned policy.

Contribution

The paper presents a joint learning framework that combines a logically-constrained RL algorithm with an evolutionary algorithm to synthesize signal temporal logic (STL) safety specifications, and it proves that the joint learning process converges.

Findings

1. Identified both acceptable safety constraints and RL policies in grid-world environments.

2. Established convergence of the joint learning process and error bounds between the discovered policy and the true optimal policy.

3. Demonstrated in the grid-world experiments that the theorems hold in practice.

Abstract

In many real-world applications, safety constraints for reinforcement learning (RL) algorithms are either unknown or not explicitly defined. We propose a framework that concurrently learns safety constraints and optimal RL policies in such environments, supported by theoretical guarantees. Our approach merges a logically-constrained RL algorithm with an evolutionary algorithm to synthesize signal temporal logic (STL) specifications. The framework is underpinned by theorems that establish the convergence of our joint learning process and provide error bounds between the discovered policy and the true optimal policy. We showcased our framework in grid-world environments, successfully identifying both acceptable safety constraints and RL policies while demonstrating the effectiveness of our theorems in practice.
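To make the idea of synthesizing an STL specification from behavior more concrete, here is a minimal sketch. It is not the authors' algorithm: it assumes a single hypothetical STL template, "always stay more than d cells from a hazard", uses Manhattan distance on a grid, and fits only the parameter d with a simple mutate-and-keep-the-best evolutionary loop. The names `robustness_always_far`, `fitness`, `evolve_threshold`, `HAZARD`, and `LABELED` are illustrative, the labeled trajectories are made up for the example, and the RL policy-learning half of the framework is omitted entirely.

```python
import random


def manhattan(a, b):
    """Manhattan distance between two grid cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def robustness_always_far(trajectory, hazard, d):
    """Robustness of the STL formula G(dist(s_t, hazard) > d).

    For an 'always' operator the robustness is the worst-case margin
    over the trajectory; a positive value means the formula is satisfied.
    """
    return min(manhattan(s, hazard) - d for s in trajectory)


def fitness(d, labeled, hazard):
    """Fraction of labeled trajectories that the candidate formula classifies correctly."""
    correct = 0
    for trajectory, is_safe in labeled:
        satisfied = robustness_always_far(trajectory, hazard, d) > 0
        correct += (satisfied == is_safe)
    return correct / len(labeled)


def evolve_threshold(labeled, hazard, generations=30, offspring=8, seed=0):
    """Toy evolutionary search over the threshold d (assumed template parameter)."""
    rng = random.Random(seed)
    best = 0.0
    best_fit = fitness(best, labeled, hazard)
    for _ in range(generations):
        for _ in range(offspring):
            # Gaussian mutation of the current best, clipped at zero.
            child = max(0.0, best + rng.gauss(0.0, 1.0))
            child_fit = fitness(child, labeled, hazard)
            if child_fit > best_fit:
                best, best_fit = child, child_fit
    return best, best_fit


# Hypothetical data: one hazard cell and four trajectories labeled safe/unsafe.
HAZARD = (2, 2)
LABELED = [
    ([(0, 0), (0, 1), (0, 2), (0, 3), (0, 4)], True),   # closest approach: 2 cells
    ([(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)], True),   # closest approach: 2 cells
    ([(0, 0), (1, 0), (1, 1), (1, 2), (1, 3)], False),  # closest approach: 1 cell
    ([(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)], False),  # enters the hazard cell
]

if __name__ == "__main__":
    d, fit = evolve_threshold(LABELED, HAZARD)
    print(f"learned threshold d={d:.2f}, classification accuracy={fit:.2f}")
```

In this toy data any threshold with 1 <= d < 2 separates the safe trajectories from the unsafe ones, so the search only has to mutate its way into that interval. In a joint learning setting, a candidate specification like this would be handed to a constrained RL learner, and the resulting rollouts would feed back into the specification search; that loop is not shown here.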

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Formal Methods in Verification