Logically-Constrained Reinforcement Learning

Mohammadhosein Hasanbeig; Alessandro Abate; and Daniel Kroening

arXiv:1801.08099·cs.LG·February 19, 2019·38 cites

Logically-Constrained Reinforcement Learning

Mohammadhosein Hasanbeig, Alessandro Abate, and Daniel Kroening

PDF

Open Access 1 Repo

TL;DR

This paper introduces a model-free reinforcement learning algorithm that synthesizes policies satisfying linear temporal logic properties in unknown MDPs, improving efficiency by focusing exploration on relevant states and guaranteeing probabilistic satisfaction.

Contribution

It presents the first RL method that incorporates LTL constraints directly into policy synthesis for unknown MDPs, enhancing scalability and performance.

Findings

01

Achieves an order of magnitude reduction in synthesis iterations.

02

Guarantees probabilistic satisfaction of LTL properties if such policies exist.

03

Produces reasonable policies even when the property cannot be satisfied.

Abstract

We present the first model-free Reinforcement Learning (RL) algorithm to synthesise policies for an unknown Markov Decision Process (MDP), such that a linear time property is satisfied. The given temporal property is converted into a Limit Deterministic Buchi Automaton (LDBA) and a robust reward function is defined over the state-action pairs of the MDP according to the resulting LDBA. With this reward function, the policy synthesis procedure is "constrained" by the given specification. These constraints guide the MDP exploration so as to minimize the solution time by only considering the portion of the MDP that is relevant to satisfaction of the LTL property. This improves performance and scalability of the proposed method by avoiding an exhaustive update over the whole state space while the efficiency of standard methods such as dynamic programming is hindered by excessive memory…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Code & Models

Repositories

grockious/lcrl
noneOfficial

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsReinforcement Learning in Robotics · Formal Methods in Verification · Advanced Control Systems Optimization